[dpdk-dev] [PATCH v6 4/5] net/af_xdp: use mbuf mempool for buffer management

Zhang, Qi Z qi.z.zhang at intel.com
Mon Apr 1 07:47:29 CEST 2019



> -----Original Message-----
> From: Ye, Xiaolong
> Sent: Sunday, March 31, 2019 8:38 PM
> To: Olivier Matz <olivier.matz at 6wind.com>
> Cc: dev at dpdk.org; David Marchand <david.marchand at redhat.com>; Andrew
> Rybchenko <arybchenko at solarflare.com>; Zhang, Qi Z <qi.z.zhang at intel.com>;
> Karlsson, Magnus <magnus.karlsson at intel.com>; Topel, Bjorn
> <bjorn.topel at intel.com>; Maxime Coquelin <maxime.coquelin at redhat.com>;
> Stephen Hemminger <stephen at networkplumber.org>; Yigit, Ferruh
> <ferruh.yigit at intel.com>; Luca Boccassi <bluca at debian.org>; Richardson, Bruce
> <bruce.richardson at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>
> Subject: Re: [dpdk-dev] [PATCH v6 4/5] net/af_xdp: use mbuf mempool for
> buffer management
> 
> Hi, Olivier
> 
> Thanks for the comments.
> 
> On 03/29, Olivier Matz wrote:
> >Hi Xiaolong,
> >
> >On Tue, Mar 26, 2019 at 08:20:28PM +0800, Xiaolong Ye wrote:
> >> Now, af_xdp registered memory buffer is managed by rte_mempool. mbuf
> >> allocated from rte_mempool can be converted to xdp_desc's address and
> >> vice versa.
> >>
> >> Signed-off-by: Xiaolong Ye <xiaolong.ye at intel.com>
> >> ---
> >>  drivers/net/af_xdp/rte_eth_af_xdp.c | 117 +++++++++++++++++-----------
> >>  1 file changed, 72 insertions(+), 45 deletions(-)
> >>
> >> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
> >> index 47a496ed7..a1fda9212 100644
> >> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> >> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> >> @@ -48,7 +48,11 @@ static int af_xdp_logtype;
> >>
> >>  #define ETH_AF_XDP_FRAME_SIZE		XSK_UMEM__DEFAULT_FRAME_SIZE
> >>  #define ETH_AF_XDP_NUM_BUFFERS		4096
> >> -#define ETH_AF_XDP_DATA_HEADROOM	0
> >> +/* mempool hdrobj size (64 bytes) + sizeof(struct rte_mbuf) (128 bytes) */
> >> +#define ETH_AF_XDP_MBUF_OVERHEAD	192
> >> +/* data start from offset 320 (192 + 128) bytes */
> >> +#define ETH_AF_XDP_DATA_HEADROOM				\
> >> +	(ETH_AF_XDP_MBUF_OVERHEAD + RTE_PKTMBUF_HEADROOM)
> >
> >Having these constants looks quite dangerous to me. It imposes the
> >size of the mbuf and the mempool header size. It would at least
> >require compilation checks.
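For illustration, a compile-time guard along these lines could catch a layout
change. This is only a sketch: the 64-byte mempool object-header figure is taken
from the patch comment above, and af_xdp_check_layout() is a hypothetical helper,
not part of the driver.

#include <rte_common.h>
#include <rte_mbuf.h>

/* Mirrors the constant defined in the patch above. */
#define ETH_AF_XDP_MBUF_OVERHEAD	192

static inline void
af_xdp_check_layout(void)
{
	/* struct rte_mbuf is expected to span two minimum cache lines (128 bytes) */
	RTE_BUILD_BUG_ON(sizeof(struct rte_mbuf) != 2 * RTE_CACHE_LINE_MIN_SIZE);
	/* break the build if the hard-coded overhead no longer covers the
	 * assumed 64-byte mempool object header plus struct rte_mbuf
	 */
	RTE_BUILD_BUG_ON(ETH_AF_XDP_MBUF_OVERHEAD < 64 + sizeof(struct rte_mbuf));
}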
> >
> >[...]
> >
> >> +	umem->mb_pool = rte_pktmbuf_pool_create_with_flags("af_xdp_mempool",
> >> +			ETH_AF_XDP_NUM_BUFFERS,
> >> +			250, 0,
> >> +			ETH_AF_XDP_FRAME_SIZE -
> >> +			ETH_AF_XDP_MBUF_OVERHEAD,
> >> +			MEMPOOL_F_NO_SPREAD | MEMPOOL_CHUNK_F_PAGE_ALIGN,
> >> +			SOCKET_ID_ANY);
> >> +	if (umem->mb_pool == NULL || umem->mb_pool->nb_mem_chunks != 1) {
> >> +		AF_XDP_LOG(ERR, "Failed to create mempool\n");
> >>  		goto err;
> >>  	}
> >> +	base_addr = (void *)get_base_addr(umem->mb_pool);
> >
> >Creating a mempool in the pmd like this does not look good to me for
> >several reasons:
> >- the user application creates its mempool with a specific private
> >  area in its mbufs. Here there is no private area, so it will break
> >  applications doing this.
> >- in patch 3 (mempool), you ensure that the chunk starts at a
> >  page-aligned address, and you expect that given the other flags and
> >  the constants at the top of the file, the data will be aligned. In
> >  my opinion it is not ideal.
> >- the user application may create a large number of mbufs, for instance
> >  if the application manages large reassembly queues, or tcp sockets.
> >  Here the driver limits the number of mbufs to 4k per rx queue.
> >- the socket_id is any, so it won't be efficient on NUMA architectures.
> 
> Our mbuf/mempool changes regarding zero copy do have limitations.

Just to clarify: the above code is only reached in the non-zero-copy case.
Here we create a private memory pool that is used to manage the AF_XDP umem; it is not the Rx queue's mempool itself.
So I don't think the concerns about the private area and the 4k-per-Rx-queue limit apply to the code above.
 
> 
> >
> >May I suggest another idea?
> >
> >Allocate the xsk_umem almost like in patch 1, but using rte_memzone
> >allocation instead of posix_memalign() (and it will be faster, because
> >it will use hugepages). And do not allocate any mempool in the driver.
> >
> 
> rte_memzone_reserve_aligned() is better than posix_memalign(); I'll use it in
> my first patch.
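A minimal sketch of that direction, assuming the driver's existing
ETH_AF_XDP_NUM_BUFFERS constant and the libbpf xsk API; alloc_umem_from_memzone()
is a placeholder name and the umem config values are only illustrative.

#include <unistd.h>
#include <bpf/xsk.h>
#include <rte_memzone.h>

#define ETH_AF_XDP_NUM_BUFFERS	4096	/* mirrors the driver's constant */

static struct xsk_umem *
alloc_umem_from_memzone(int socket_id, struct xsk_ring_prod *fq,
			struct xsk_ring_cons *cq)
{
	const struct rte_memzone *mz;
	struct xsk_umem *umem = NULL;
	struct xsk_umem_config cfg = {
		.fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
		.comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
		.frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE,
		.frame_headroom = 0,
	};

	/* hugepage-backed, page-aligned, IOVA-contiguous umem area */
	mz = rte_memzone_reserve_aligned("af_xdp_umem",
			(size_t)ETH_AF_XDP_NUM_BUFFERS * XSK_UMEM__DEFAULT_FRAME_SIZE,
			socket_id, RTE_MEMZONE_IOVA_CONTIG, getpagesize());
	if (mz == NULL)
		return NULL;

	if (xsk_umem__create(&umem, mz->addr, mz->len, fq, cq, &cfg) != 0) {
		rte_memzone_free(mz);
		return NULL;
	}
	return umem;
}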
> 
> >When you receive a packet in the xsk_umem, allocate a new mbuf from the
> >standard pool. Then, use rte_pktmbuf_attach_extbuf() to attach the xsk
> >memory to the mbuf. You will have to register a callback to return the
> >xsk memory when the mbuf is transmitted or freed.
> 
> I'll try to investigate how to implement it.
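To make the suggestion concrete, here is a rough sketch of how
rte_pktmbuf_attach_extbuf() could be wired up. umem_frame_release(),
wrap_umem_frame() and the opaque umem pointer are placeholders, not existing
driver symbols, and the data-offset handling is simplified.

#include <rte_mbuf.h>
#include <rte_memory.h>

/* Free callback: the driver would return the frame to its pool of
 * available umem frames here so it can go back on the fill ring.
 */
static void
umem_frame_release(void *addr, void *opaque)
{
	(void)addr;
	(void)opaque;
}

static struct rte_mbuf *
wrap_umem_frame(struct rte_mempool *mb_pool, void *umem_ctx, void *frame,
		uint16_t frame_size, uint16_t pkt_len)
{
	struct rte_mbuf *m;
	struct rte_mbuf_ext_shared_info *shinfo;
	uint16_t buf_len = frame_size;

	m = rte_pktmbuf_alloc(mb_pool);
	if (m == NULL)
		return NULL;

	/* the shared refcount structure is carved from the frame's tail */
	shinfo = rte_pktmbuf_ext_shinfo_init_helper(frame, &buf_len,
						    umem_frame_release, umem_ctx);
	if (shinfo == NULL) {
		rte_pktmbuf_free(m);
		return NULL;
	}

	rte_pktmbuf_attach_extbuf(m, frame, rte_mem_virt2iova(frame),
				  buf_len, shinfo);
	/* data_off is left at 0; adjust it if the packet starts at an
	 * offset inside the umem frame
	 */
	rte_pktmbuf_pkt_len(m) = pkt_len;
	rte_pktmbuf_data_len(m) = pkt_len;
	return m;
}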
> 
> >
> >This is, by the way, something I don't understand in your current
> >implementation: what happens if a mbuf is received in the af_xdp
> >driver, and freed by the application? How is the xsk buffer returned?
> 
> It is coordinated by the fill ring. The fill ring is used by the application (user
> space) to send addresses down for the kernel to fill in with Rx packet data.
> On the free side, we simply return the mbuf to the mempool, and each time in
> rx_pkt_burst we allocate new mbufs and submit the corresponding addresses to the
> fill ring; that's how we return the xsk buffers to the kernel.
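A small sketch of that refill path, assuming the libbpf xsk ring API; the
buf_addr-to-umem-offset conversion and the batch size of 64 are illustrative,
not the driver's exact code.

#include <bpf/xsk.h>
#include <rte_mbuf.h>

static int
refill_fill_ring(struct xsk_ring_prod *fq, struct rte_mempool *mb_pool,
		 uint64_t umem_base_addr, uint16_t nb)
{
	struct rte_mbuf *bufs[64];
	uint32_t idx = 0;
	uint16_t i;

	if (nb > 64 || rte_pktmbuf_alloc_bulk(mb_pool, bufs, nb) != 0)
		return -1;

	if (xsk_ring_prod__reserve(fq, nb, &idx) != nb) {
		for (i = 0; i < nb; i++)
			rte_pktmbuf_free(bufs[i]);
		return -1;
	}

	for (i = 0; i < nb; i++)
		/* translate the mbuf's data buffer into a umem offset */
		*xsk_ring_prod__fill_addr(fq, idx++) =
			(uint64_t)(uintptr_t)bufs[i]->buf_addr - umem_base_addr;

	/* the kernel now owns these frames; the mbufs stay allocated and are
	 * handed to the application (or freed) when packets are received
	 */
	xsk_ring_prod__submit(fq, nb);
	return 0;
}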
> 
> >
> >Using rte_pktmbuf_attach_extbuf() would remove the changes in mbuf and
> >mempool, at the price of (maybe) some performance loss. But I think
> >there are some places where it can be optimized.
> >
> >I understand my feedback comes late -- as usual :( -- but if you are in
> 
> Sorry for not Cc'ing you on my patch set.
> 
> >a hurry for RC1, maybe we can consider taking the 1st patch only, and
> >adding the zero-copy mode in a second patch later. What do you think?
> 
> Sounds like a sensible plan; I'll try the external mbuf buffer scheme first.
> 
> 
> Thanks,
> Xiaolong
> 
> >
> >Regards,
> >Olivier
> >
> >

