[dpdk-dev] [PATCH v2] net/mlx5: improve vMPRQ descriptors allocation locality

Slava Ovsiienko viacheslavo at nvidia.com
Tue Nov 10 17:30:10 CET 2020


> -----Original Message-----
> From: Alexander Kozyrev <akozyrev at nvidia.com>
> Sent: Sunday, November 8, 2020 6:24
> To: dev at dpdk.org
> Cc: Raslan Darawsheh <rasland at nvidia.com>; Matan Azrad
> <matan at nvidia.com>; Slava Ovsiienko <viacheslavo at nvidia.com>
> Subject: [PATCH v2] net/mlx5: improve vMPRQ descriptors allocation locality
> 
> The replenish scheme used in the vectorized Rx burst carries a performance
> penalty for both MPRQ and SPRQ.
> Mbuf elements are filled at the end of the mbufs array and replenished at
> the beginning. That increases cache misses and reduces performance.
> The more Rx descriptors are used, the worse the penalty becomes.
> 
> Change the allocation scheme for the vectorized MPRQ Rx burst:
> allocate new mbufs only when the previously allocated mbufs are almost all
> consumed (always keep a one-burst gap between the allocated and consumed
> indices). Keeping only a small number of mbufs allocated improves cache
> locality and, as a result, performance.
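
To make the "one burst gap" rule above concrete, here is a toy model of it.
This is only a sketch under stated assumptions, not the mlx5 driver code:
toy_rxq, toy_replenish, toy_rx_burst, toy_simulate, RING_SIZE and BURST are
all hypothetical names and values, and the bulk mempool get is replaced by a
plain counter increment.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 4096u /* hypothetical number of Rx descriptors */
#define BURST 64u       /* hypothetical maximum Rx burst size */

/* Hypothetical queue state: free-running indices, not the driver's struct. */
struct toy_rxq {
	uint32_t alloc_idx;       /* mbufs allocated so far */
	uint32_t cons_idx;        /* mbufs handed to the application so far */
	uint32_t max_outstanding; /* high-water mark of allocated, unconsumed mbufs */
};

/* Allocate one more burst only when at most one burst of mbufs is left. */
static void toy_replenish(struct toy_rxq *q)
{
	uint32_t outstanding = q->alloc_idx - q->cons_idx;

	if (outstanding > BURST)
		return; /* enough mbufs ahead of the consumer, allocate nothing */
	q->alloc_idx += BURST; /* stands in for a bulk mempool get */
	outstanding += BURST;
	if (outstanding > q->max_outstanding)
		q->max_outstanding = outstanding;
}

/* Consume n packets (n <= BURST) and return how many were consumed. */
static uint32_t toy_rx_burst(struct toy_rxq *q, uint32_t n)
{
	uint32_t avail;

	toy_replenish(q);
	avail = q->alloc_idx - q->cons_idx;
	assert(avail >= BURST); /* the one-burst gap always holds here */
	if (n > avail)
		n = avail;
	q->cons_idx += n;
	return n;
}

/* Run a number of bursts of varying size and return the high-water mark. */
static uint32_t toy_simulate(uint32_t bursts)
{
	struct toy_rxq q = {0, 0, 0};
	uint32_t i;

	for (i = 0; i < bursts; i++)
		toy_rx_burst(&q, 1 + (i * 7u) % BURST);
	return q.max_outstanding;
}
```

In this model the number of allocated but not yet consumed mbufs never
exceeds two bursts, whatever RING_SIZE is, which is the locality argument
made in the commit message.
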
> 
> Unfortunately, this approach cannot be applied to the SPRQ Rx burst routine.
> In the MPRQ Rx burst we simply copy packets from external MPRQ buffers or
> attach these buffers to mbufs.
> In the SPRQ Rx burst we let the NIC fill the mbufs for us.
> Hence, keeping only a small number of allocated mbufs would limit the NIC's
> ability to fill as many buffers as possible, which offsets the advantage of
> better cache locality.
> 
> Fixes: 0f20acbf5e ("net/mlx5: implement vectorized MPRQ burst")
> 
> Signed-off-by: Alexander Kozyrev <akozyrev at nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>


