Mellanox performance degradation with more than 12 lcores
Дмитрий Степанов
stepanov.dmit at gmail.com
Fri Feb 18 17:14:08 CET 2022
Thanks for the clarification!
I was able to get 148 Mpps with 12 lcores after some BIOS tuning.
Looks like, due to these HW limitations, I will have to use a ring buffer as you
suggested to support more than 32 lcores!
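
For reference, a testpmd invocation along the lines of the advice quoted
below might look something like this (the PCI address, core count, and ring
size are placeholders, not my exact setup):

    dpdk-testpmd -l 0-12 -n 4 -a 0000:3b:00.0,mprq_en=1 -- \
        --rxq=12 --txq=12 --nb-cores=12 --rxd=4096 --forward-mode=rxonly
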
On Fri, Feb 18, 2022 at 16:40, Dmitry Kozlyuk <dkozlyuk at nvidia.com> wrote:
> Hi,
>
> > With more than 12 lcores, overall receive performance degrades.
> > With 16-32 lcores I get 100-110 Mpps,
>
> It is more about the number of queues than the number of cores:
> 12 queues is the threshold at which Multi-Packet Receive Queue (MPRQ)
> is automatically enabled in the mlx5 PMD.
> Try increasing --rxd and check out the mprq_en device argument.
> Please see the mlx5 PMD user guide for details about MPRQ.
> You should be able to get the full 148 Mpps with your HW.
>
> > and I see a significant performance drop with 33 lcores: 84 Mpps.
> > With 63 lcores, overall receive performance falls even further, to 35 Mpps.
> >
> > Are there any limitations on the total number of receive queues (total
> > lcores) that can handle a single port on a given NIC?
>
> This is a hardware limitation.
> The limit on the number of queues you can create is very high (16M),
> but performance scales perfectly only up to 32 queues
> at high packet rates (as opposed to bit rates).
> Using more queues can even degrade it, just as you observe.
> One way to overcome this (not specific to mlx5)
> is to use a ring buffer for incoming packets,
> from which any number of processing cores can take packets.
>
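
A minimal sketch of the ring-buffer fan-out described above, assuming a
single shared multi-producer/multi-consumer rte_ring (the port number, ring
size, and the process_packet() hook are placeholders):

    #include <rte_ring.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>
    #include <rte_lcore.h>

    #define BURST 32

    /* Shared MP/MC ring; the size must be a power of two. */
    static struct rte_ring *rx_ring;

    /* Placeholder for the real per-packet work. */
    static void process_packet(struct rte_mbuf *m) { (void)m; }

    /* Called once after EAL init. */
    static void setup_ring(void)
    {
        rx_ring = rte_ring_create("rx_fanout", 16384, rte_socket_id(), 0);
    }

    /* Up to ~32 RX lcores: each polls its own NIC queue and pushes into the ring. */
    static int rx_lcore(void *arg)
    {
        uint16_t queue = *(uint16_t *)arg;
        struct rte_mbuf *bufs[BURST];

        for (;;) {
            uint16_t n = rte_eth_rx_burst(0 /* port */, queue, bufs, BURST);
            if (n == 0)
                continue;
            unsigned int sent = rte_ring_enqueue_burst(rx_ring,
                                                       (void **)bufs, n, NULL);
            while (sent < n)              /* ring full: drop the remainder */
                rte_pktmbuf_free(bufs[sent++]);
        }
        return 0;
    }

    /* Any number of worker lcores: pull from the ring and process. */
    static int worker_lcore(void *arg)
    {
        struct rte_mbuf *bufs[BURST];
        (void)arg;

        for (;;) {
            unsigned int n = rte_ring_dequeue_burst(rx_ring,
                                                    (void **)bufs, BURST, NULL);
            for (unsigned int i = 0; i < n; i++) {
                process_packet(bufs[i]);
                rte_pktmbuf_free(bufs[i]);
            }
        }
        return 0;
    }

Each of the two loop functions can then be launched on its lcore with
rte_eal_remote_launch(), so the number of worker lcores is no longer tied
to the number of NIC RX queues.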