[dpdk-dev] [PATCHv8 19/46] pool/dpaa2: add DPAA2 hardware offloaded mempool

Hemant Agrawal hemant.agrawal at nxp.com
Wed Mar 8 13:52:06 CET 2017


Hi Olivier,
	Thanks for your detailed review.  Please see inline...

On 3/8/2017 2:35 PM, Olivier MATZ wrote:
> Hi Hemant,
>
> On Fri, 3 Mar 2017 18:16:36 +0530, Hemant Agrawal
> <hemant.agrawal at nxp.com> wrote:
>> Adding NXP DPAA2 architecture specific mempool support.
>>
>> This patch also registers a dpaa2 type MEMPOOL OPS
>>
>> Signed-off-by: Hemant Agrawal <hemant.agrawal at nxp.com>
>> ---
>>  MAINTAINERS                                   |   1 +
>>  config/common_base                            |   5 +
>>  config/defconfig_arm64-dpaa2-linuxapp-gcc     |   8 +
>>  drivers/Makefile                              |   1 +
>>  drivers/pool/Makefile                         |  40 +++
>>  drivers/pool/dpaa2/Makefile                   |  72 ++++++
>>  drivers/pool/dpaa2/dpaa2_hw_mempool.c         | 339 ++++++++++++++++++++++++++
>>  drivers/pool/dpaa2/dpaa2_hw_mempool.h         |  95 ++++++++
>>  drivers/pool/dpaa2/rte_pool_dpaa2_version.map |   8 +
>
> I think the current mempool handlers should be moved first in a
> separate patch.
>

Do you see any benefit in making it a separate patch series?

It will be difficult and tricky for us. dpaa2_pool depends on the mc bus 
patches, and dpaa2_pmd depends on both dpaa2_pool and the mc bus.

This would mean splitting the work into 3 patch series, and it will 
become cumbersome to deal with 3 series.


> I'd prefer drivers/mempool instead of drivers/pool (more precise and
> more consistent with librte_mempool).
>

We will take care of it in the next revision.

>
>>
>> [...]
>>
>> +
>> +struct dpaa2_bp_info rte_dpaa2_bpid_info[MAX_BPID];
>> +static struct dpaa2_bp_list *h_bp_list;
>> +
>> +static int
>> +hw_mbuf_create_pool(struct rte_mempool *mp)
>
> Would it work for something else than mbufs?
> The initial approach of the mempool is to work for any kind of object. The
> specialization in mbuf is done by the mbuf layer.

I think we did discuss that hw offloaded mempools are mainly for packet 
buffers/mbufs. Currently we only support mbuf type objects.

Ideally a hw buffer pool can work for any kind of mempool. However, that 
is not the best way to use hw buffer pools: the latency to allocate 
buffers is higher than in software. The main advantage SoCs get by using 
a hw pool is that it works seamlessly with the MAC layer.

>
>
>> +{
>> +	struct dpaa2_bp_list *bp_list;
>> +	struct dpaa2_dpbp_dev *avail_dpbp;
>> +	struct dpbp_attr dpbp_attr;
>> +	uint32_t bpid;
>> +	int ret;
>> +
>> +	avail_dpbp = dpaa2_alloc_dpbp_dev();
>> +
>> +	if (!avail_dpbp) {
>> +		PMD_DRV_LOG(ERR, "DPAA2 resources not available");
>> +		return -1;
>> +	}
>
> The other pool handlers return a -errno instead of -1. I think it
> should be the same here.

We will fix it.
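
To illustrate what we have in mind (sketch only; the exact errno value 
is our choice, not mandated by the review), the check above would become 
something like:

	avail_dpbp = dpaa2_alloc_dpbp_dev();
	if (!avail_dpbp) {
		PMD_DRV_LOG(ERR, "DPAA2 resources not available");
		return -ENOENT;
	}

and the other error paths would similarly return a -errno instead of -1.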

>
> The same comment applies to other locations/functions.
>
>> [...]
>> +
>> +	/* Set parameters of buffer pool list */
>> +	bp_list->buf_pool.num_bufs = mp->size;
>> +	bp_list->buf_pool.size = mp->elt_size
>> +			- sizeof(struct rte_mbuf) - rte_pktmbuf_priv_size(mp);
>> +	bp_list->buf_pool.bpid = dpbp_attr.bpid;
>> +	bp_list->buf_pool.h_bpool_mem = NULL;
>> +	bp_list->buf_pool.mp = mp;
>> +	bp_list->buf_pool.dpbp_node = avail_dpbp;
>> +	bp_list->next = h_bp_list;
>> +
>> +	bpid = dpbp_attr.bpid;
>> +
>> +
>> +	rte_dpaa2_bpid_info[bpid].meta_data_size = sizeof(struct rte_mbuf)
>> +				+ rte_pktmbuf_priv_size(mp);
>
> Are the 2 empty lines garbage?

We will fix it.

>
>
>> +	rte_dpaa2_bpid_info[bpid].bp_list = bp_list;
>> +	rte_dpaa2_bpid_info[bpid].bpid = bpid;
>> +
>> +	mp->pool_data = (void *)&rte_dpaa2_bpid_info[bpid];
>> +
>> +	PMD_INIT_LOG(DEBUG, "BP List created for bpid =%d", dpbp_attr.bpid);
>> +
>> +	h_bp_list = bp_list;
>> +	/* Identification for our offloaded pool_data structure
>> +	 */
>> +	mp->flags |= MEMPOOL_F_HW_PKT_POOL;
>
> I think this flag should be declared in rte_mempool.h,
> not in drivers/bus/fslmc/portal/dpaa2_hw_pvt.h.
>
> It should also be documented, what does this flag mean?

Currently we need a way to differentiate whether a pool is a hw-allocated 
packet buffer pool or a software-based buffer pool. String comparison is 
costly.

This flag was discussed during David's hw mempool patches; not everyone 
was in favor of keeping it in librte_mempool.

So we just hid it inside our offloaded mempool.
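
For context, a minimal sketch of the check this flag enables (assuming 
MEMPOOL_F_HW_PKT_POOL stays a private flag in dpaa2_hw_pvt.h and the 
handler registers under the ops name "dpaa2"; the helper name is ours, 
for illustration only):

/* Tell a hw-offloaded packet pool from a software pool. */
static inline int
pool_is_hw_offloaded(const struct rte_mempool *mp)
{
	/* O(1) flag test, set by hw_mbuf_create_pool() */
	return !!(mp->flags & MEMPOOL_F_HW_PKT_POOL);

	/* The avoided alternative is a strcmp on every check:
	 * return !strcmp(rte_mempool_get_ops(mp->ops_index)->name,
	 *		  "dpaa2");
	 */
}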

>
>> [...]
>>
>> +static
>> +void rte_dpaa2_mbuf_release(struct rte_mempool *pool __rte_unused,
>> +			void * const *obj_table,
>> +			uint32_t bpid,
>> +			uint32_t meta_data_size,
>> +			int count)
>
>
> Is there a reason why some functions are prefixed with rte_dpaa2_ and
> others with hw_mbuf_?

The initial reason was to use the rte_ prefix only for exported functions. 
We can fix it.

>
>
>> +{
>> +	struct qbman_release_desc releasedesc;
>> +	struct qbman_swp *swp;
>> +	int ret;
>> +	int i, n;
>> +	uint64_t bufs[DPAA2_MBUF_MAX_ACQ_REL];
>> +
>> +	if (unlikely(!DPAA2_PER_LCORE_DPIO)) {
>> +		ret = dpaa2_affine_qbman_swp();
>> +		if (ret != 0) {
>> +			RTE_LOG(ERR, PMD, "Failed to allocate IO portal");
>> +			return;
>> +		}
>> +	}
>> +	swp = DPAA2_PER_LCORE_PORTAL;
>> +
>> +	/* Create a release descriptor required for releasing
>> +	 * buffers into QBMAN
>> +	 */
>> +	qbman_release_desc_clear(&releasedesc);
>> +	qbman_release_desc_set_bpid(&releasedesc, bpid);
>> +
>> +	n = count % DPAA2_MBUF_MAX_ACQ_REL;
>> +
>> +	/* convert mbuf to buffers  for the remainder*/
>
> bad spaces

ok

>
>> +	for (i = 0; i < n ; i++)
>> +		bufs[i] = (uint64_t)obj_table[i] + meta_data_size;
>> +
>> +	/* feed them to bman*/
>
> missing space at the end

ok

>
>> +	do {
>> +		ret = qbman_swp_release(swp, &releasedesc, bufs, n);
>> +	} while (ret == -EBUSY);
>> +
>> +	/* if there are more buffers to free */
>> +	while (n < count) {
>> +		/* convert mbuf to buffers */
>> +		for (i = 0; i < DPAA2_MBUF_MAX_ACQ_REL; i++)
>> +			bufs[i] = (uint64_t)obj_table[n + i] + meta_data_size;
>> +
>> +		do {
>> +			ret = qbman_swp_release(swp, &releasedesc, bufs,
>> +						DPAA2_MBUF_MAX_ACQ_REL);
>> +			} while (ret == -EBUSY);
>
> The while is not properly indented

ok

>
>> [...]
>> +int rte_dpaa2_mbuf_alloc_bulk(struct rte_mempool *pool,
>> +		       void **obj_table, unsigned int count)
>> +{
>> +#ifdef RTE_LIBRTE_DPAA2_DEBUG_DRIVER
>> +	static int alloc;
>> +#endif
>> +	struct qbman_swp *swp;
>> +	uint32_t mbuf_size;
>> +	uint16_t bpid;
>> +	uint64_t bufs[DPAA2_MBUF_MAX_ACQ_REL];
>> +	int i, ret;
>> +	unsigned int n = 0;
>> +	struct dpaa2_bp_info *bp_info;
>> +
>> +	bp_info = mempool_to_bpinfo(pool);
>> +
>> +	if (!(bp_info->bp_list)) {
>> +		RTE_LOG(ERR, PMD, "DPAA2 buffer pool not configured\n");
>> +		return -2;
>> +	}
>> +
>> +	bpid = bp_info->bpid;
>> +
>> +	if (unlikely(!DPAA2_PER_LCORE_DPIO)) {
>> +		ret = dpaa2_affine_qbman_swp();
>> +		if (ret != 0) {
>> +			RTE_LOG(ERR, PMD, "Failed to allocate IO portal");
>> +			return -1;
>> +		}
>> +	}
>> +	swp = DPAA2_PER_LCORE_PORTAL;
>> +
>> +	mbuf_size = sizeof(struct rte_mbuf) + rte_pktmbuf_priv_size(pool);
>> +	while (n < count) {
>> +		/* Acquire is all-or-nothing, so we drain in 7s,
>> +		 * then the remainder.
>> +		 */
>> +		if ((count - n) > DPAA2_MBUF_MAX_ACQ_REL) {
>> +			ret = qbman_swp_acquire(swp, bpid, bufs,
>> +					DPAA2_MBUF_MAX_ACQ_REL);
>> +		} else {
>> +			ret = qbman_swp_acquire(swp, bpid, bufs,
>> +						count - n);
>> +		}
>> +		/* In case of less than requested number of buffers available
>> +		 * in pool, qbman_swp_acquire returns 0
>> +		 */
>> +		if (ret <= 0) {
>> +			PMD_TX_LOG(ERR, "Buffer acquire failed with"
>> +				   " err code: %d", ret);
>> +			/* The API expect the exact number of requested bufs */
>> +			/* Releasing all buffers allocated */
>> +			rte_dpaa2_mbuf_release(pool, obj_table, bpid,
>> +					   bp_info->meta_data_size, n);
>> +			return -1;
>> +		}
>> +		/* assigning mbuf from the acquired objects */
>> +		for (i = 0; (i < ret) && bufs[i]; i++) {
>> +			/* TODO-errata - observed that bufs may be null
>> +			 * i.e. first buffer is valid,
>> +			 * remaining 6 buffers may be null
>> +			 */
>> +			obj_table[n] = (struct rte_mbuf *)(bufs[i] - mbuf_size);
>> +			rte_mbuf_refcnt_set((struct rte_mbuf *)obj_table[n], 0);
>
> I think we should not assume the objects are mbufs.

Currently we only support packet buffers, i.e. mbufs.

> But even if we do it, I don't see why we need to set the refcnt.

rte_mbuf_raw_alloc has the check:
RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0);

So, the mempool should hold packets with refcnt '0'.

In our case, during transmit the NIC releases the buffer back to the hw 
pool without core intervention, and we have no way to reset specific 
fields of the buffer in hardware.

The refcnt could be set on a per-packet basis during transmission, but 
that would add to the packet processing cost. In simple forwarding the 
NIC gets packets directly from hw and releases them directly back to hw, 
so we tried to avoid this cost in the data path.
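
For comparison, the hypothetical per-packet alternative on the transmit 
path (which we deliberately avoid) would look roughly like:

	/* Hypothetical: clear the refcnt of every packet before the
	 * hardware frees it back to the pool, so buffers always sit
	 * in the pool with refcnt == 0.
	 */
	for (i = 0; i < nb_pkts; i++)
		rte_mbuf_refcnt_set(tx_pkts[i], 0);

i.e. one extra store per transmitted packet, which is exactly the cost 
we want to keep out of the forwarding path.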

>
> What is returned in the buf[] table? In rte_dpaa2_mbuf_release(), it looks like you
> are using buf[i] = obj_table[i] + bp_info->meta_data_size
>

meta_data_size = sizeof(struct rte_mbuf) + rte_pktmbuf_priv_size(mp);

In the hardware, we configure the buffer address as buf, which is 
obj_table + meta_data_size.
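
So, as a sketch using the names from the patch, the two conversions are 
symmetric:

	/* release: obj_table[] holds mbuf addresses, the hardware gets
	 * the buffer address located meta_data_size bytes further
	 */
	bufs[i] = (uint64_t)obj_table[i] + meta_data_size;

	/* acquire: the hardware returns buffer addresses, the mbuf
	 * starts mbuf_size (== meta_data_size) bytes before it
	 */
	obj_table[n] = (void *)(bufs[i] - mbuf_size);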

>
>> [...]
>> +
>> +static unsigned
>> +hw_mbuf_get_count(const struct rte_mempool *mp __rte_unused)
>> +{
>> +	return 0;
>
> Looks like this is not implemented. This would give wrong mempool statistics
> when calling rte_mempool_dump().
>

Our MC now supports the counts, so we can implement it in the next version.
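
As a rough sketch of the direction (the query helper below is only a 
placeholder name for the MC/DPBP count query, not an existing call):

static unsigned int
hw_mbuf_get_count(const struct rte_mempool *mp)
{
	unsigned int num_bufs = 0;
	struct dpaa2_bp_info *bp_info =
			(struct dpaa2_bp_info *)mp->pool_data;

	/* dpaa2_query_free_buf_count() stands in for the MC/DPBP
	 * query returning the free buffer count for this bpid.
	 */
	if (dpaa2_query_free_buf_count(bp_info->bpid, &num_bufs) != 0)
		return 0;

	return num_bufs;
}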

> Adding a test for this handler in app/test may highlight these issues.

Yes, we know it fails. It was not supported in our bus objects earlier.

Thanks for the suggestion. We plan to add these test cases and will add 
this to our pending items list.
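
A minimal sketch of such a test (assuming the handler registers under the 
ops name "dpaa2"; names and error handling trimmed for brevity):

	struct rte_mempool *mp;
	void *objs[32];

	mp = rte_mempool_create_empty("test_dpaa2_mp", 512,
				      sizeof(struct rte_mbuf) +
				      RTE_MBUF_DEFAULT_BUF_SIZE,
				      0, sizeof(struct rte_pktmbuf_pool_private),
				      SOCKET_ID_ANY, 0);
	if (mp == NULL || rte_mempool_set_ops_byname(mp, "dpaa2", NULL) != 0)
		return -1;
	rte_pktmbuf_pool_init(mp, NULL);
	if (rte_mempool_populate_default(mp) < 0)
		return -1;
	rte_mempool_obj_iter(mp, rte_pktmbuf_init, NULL);

	/* exercise the alloc/free ops and the (future) get_count */
	if (rte_mempool_get_bulk(mp, objs, 32) != 0)
		return -1;
	rte_mempool_put_bulk(mp, objs, 32);
	printf("avail = %u\n", rte_mempool_avail_count(mp));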

>
>
>> [...]
>> +
>> +#define DPAA2_MAX_BUF_POOLS	8
>> +
>> +struct buf_pool_cfg {
>> +	void *addr; /*!< The address from where DPAA2 will carve out the
>> +		     * buffers. 'addr' should be 'NULL' if user wants
>> +		     * to create buffers from the memory which user
>> +		     * asked DPAA2 to reserve during 'nadk init'
>> +		     */
>> +	phys_addr_t    phys_addr;  /*!< corresponding physical address
>> +				    * of the memory provided in addr
>> +				    */
>> +	uint32_t num; /*!< number of buffers */
>> +	uint32_t size; /*!< size of each buffer. 'size' should include
>> +			* any headroom to be reserved and alignment
>> +			*/
>> +	uint16_t align; /*!< Buffer alignment (in bytes) */
>> +	uint16_t bpid; /*!< The buffer pool id. This will be filled
>> +			*in by DPAA2 for each buffer pool
>> +			*/
>> +};
>
> I think the usual doxygen comment in dpdk is "/**" instead of "/*!"

Yes, we will fix it.
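
For example (sketch of two of the fields above with the usual style):

	uint32_t num;  /**< number of buffers */
	uint32_t size; /**< size of each buffer; 'size' should include
			*   any headroom to be reserved and alignment
			*/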

>
>
> Regards,
> Olivier
>
Regards,
Hemant


