[dpdk-dev] [PATCH v6 4/5] net/af_xdp: use mbuf mempool for buffer management

Olivier Matz olivier.matz at 6wind.com
Fri Mar 29 18:42:56 CET 2019


Hi Xiaolong,

On Tue, Mar 26, 2019 at 08:20:28PM +0800, Xiaolong Ye wrote:
> Now, af_xdp registered memory buffer is managed by rte_mempool. mbuf
> allocated from rte_mempool can be converted to xdp_desc's address and
> vice versa.
> 
> Signed-off-by: Xiaolong Ye <xiaolong.ye at intel.com>
> ---
>  drivers/net/af_xdp/rte_eth_af_xdp.c | 117 +++++++++++++++++-----------
>  1 file changed, 72 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 47a496ed7..a1fda9212 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -48,7 +48,11 @@ static int af_xdp_logtype;
>  
>  #define ETH_AF_XDP_FRAME_SIZE		XSK_UMEM__DEFAULT_FRAME_SIZE
>  #define ETH_AF_XDP_NUM_BUFFERS		4096
> -#define ETH_AF_XDP_DATA_HEADROOM	0
> +/* mempool hdrobj size (64 bytes) + sizeof(struct rte_mbuf) (128 bytes) */
> +#define ETH_AF_XDP_MBUF_OVERHEAD	192
> +/* data start from offset 320 (192 + 128) bytes */
> +#define ETH_AF_XDP_DATA_HEADROOM				\
> +	(ETH_AF_XDP_MBUF_OVERHEAD + RTE_PKTMBUF_HEADROOM)

Having these constants looks quite dangerous too me. It imposes the size
of the mbuf, and the mempool header size. It would at least require
compilation checks.

[...]

> +	umem->mb_pool = rte_pktmbuf_pool_create_with_flags("af_xdp_mempool",
> +			ETH_AF_XDP_NUM_BUFFERS,
> +			250, 0,
> +			ETH_AF_XDP_FRAME_SIZE -
> +			ETH_AF_XDP_MBUF_OVERHEAD,
> +			MEMPOOL_F_NO_SPREAD | MEMPOOL_CHUNK_F_PAGE_ALIGN,
> +			SOCKET_ID_ANY);
> +	if (umem->mb_pool == NULL || umem->mb_pool->nb_mem_chunks != 1) {
> +		AF_XDP_LOG(ERR, "Failed to create mempool\n");
>  		goto err;
>  	}
> +	base_addr = (void *)get_base_addr(umem->mb_pool);

Creating a mempool in the pmd like this does not look good to me for
several reasons:
- the user application creates its mempool with a specific private
  area in its mbufs. Here there is no private area, so it will break
  applications doing this.
- in patch 3 (mempool), you ensure that the chunk starts at a
  page-aligned address, and you expect that given the other flags and
  the constants at the top of the file, the data will be aligned. In
  my opinion it is not ideal.
- the user application may create a large number of mbufs, for instance
  if the application manages large reassembly queues, or tcp sockets.
  Here the driver limits the number of mbufs to 4k per rx queue.
- the socket_id is any, so it won't be efficient on numa architectures.

May I suggest another idea?

Allocate the xsk_umem almost like in patch 1, but using rte_memzone
allocation instead of posix_memalign() (and it will be faster, because
it will use hugepages). And do not allocate any mempool in the driver.

When you receive a packet in the xsk_umem, allocate a new mbuf from
the standard pool. Then, use rte_pktmbuf_attach_extbuf() to attach the
xsk memory to the mbuf. You will have to register a callback to return
the xsk memory when the mbuf is transmitted or freed.

This is, by the way, something I don't understand in your current
implementation: what happens if a mbuf is received in the af_xdp driver,
and freed by the application? How does the xsk buffer is returned?

Using rte_pktmbuf_attach_extbuf() would remove changes in mbuf and
mempool, at the price of (maybe) decreasing the performance. But I think
there are some places where it can be optimized.

I understand my feedback comes late -- as usual :( -- but if you are in
a hurry for RC1, maybe we can consider to put the 1st patch only, and
add the zero-copy mode in a second patch later. What do you think?

Regards,
Olivier




More information about the dev mailing list