Mellanox performance degradation with more than 12 lcores
Дмитрий Степанов
stepanov.dmit at gmail.com
Fri Feb 18 14:49:21 CET 2022
I get 125 Mpps from a single port using 12 lcores:
numactl -N 1 -m 1 /opt/dpdk-21.11/build/app/dpdk-testpmd -l 64-127 -n 4 -a
0000:c1:00.0 -- --stats-period 1 --nb-cores=12 --rxq=12 --txq=12 --rxd=512
With 63 lcores I get 35 Mpps:
numactl -N 1 -m 1 /opt/dpdk-21.11/build/app/dpdk-testpmd -l 64-127 -n 4 -a
0000:c1:00.0 -- --stats-period 1 --nb-cores=63 --rxq=63 --txq=63 --rxd=512
I'm using this guide as a reference:
https://fast.dpdk.org/doc/perf/DPDK_20_11_Mellanox_NIC_performance_report.pdf
This report gives examples of how to get the best performance, but all
of them use at most 12 lcores.
125 Mpps with 12 lcores is close to the maximum I can get from a single
100GbE port (148 Mpps theoretical maximum for 64-byte packets). I just want
to understand: why do I get good performance with 12 lcores and bad
performance with 63 lcores?
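
As a rough check on the figures above, the theoretical maximum can be derived
from the link speed and the per-frame overhead on the wire. The sketch below
assumes the standard 20 bytes of Ethernet overhead per frame (7-byte preamble,
1-byte start-of-frame delimiter, 12-byte inter-frame gap). The function names
are illustrative only and are not part of DPDK or testpmd.

```python
# Bytes on the wire per frame that are not counted in the frame size itself:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_mpps(link_gbps: float, frame_bytes: int) -> float:
    """Theoretical maximum packet rate (in Mpps) for a given link and frame size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_frame / 1e6


def per_lcore_mpps(total_mpps: float, lcores: int) -> float:
    """Average packet rate handled by each lcore (one RX queue per lcore)."""
    return total_mpps / lcores


if __name__ == "__main__":
    limit = line_rate_mpps(100, 64)
    print(f"100GbE line rate at 64-byte frames: {limit:.2f} Mpps")

    # Totals reported in this thread: (lcores, measured Mpps).
    for lcores, total in [(12, 125.0), (63, 35.0)]:
        share = per_lcore_mpps(total, lcores)
        print(f"{lcores} lcores: {total:.0f} Mpps total, "
              f"{share:.2f} Mpps per lcore, "
              f"{100 * total / limit:.0f}% of line rate")
```

With the measurements quoted here, the average per-lcore rate falls from about
10.4 Mpps with 12 lcores to about 0.56 Mpps with 63 lcores, so the total drops
even though more cores are polling.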
On Fri, 18 Feb 2022 at 16:30, Asaf Penso <asafp at nvidia.com> wrote:
> Hello Dmitry,
>
> Could you please paste the testpmd commands per each experiment?
>
> Also, have you looked at the dpdk.org performance report to see how to
> tune for best results?
>
> Regards,
> Asaf Penso
> ------------------------------
> *From:* Дмитрий Степанов <stepanov.dmit at gmail.com>
> *Sent:* Friday, February 18, 2022 9:32:59 AM
> *To:* users at dpdk.org <users at dpdk.org>
> *Subject:* Mellanox performance degradation with more than 12 lcores
>
> Hi folks!
>
> I'm using a Mellanox ConnectX-6 Dx EN adapter card (100GbE; Dual-port
> QSFP56; PCIe 4.0/3.0 x16) with DPDK 21.11 on a server with an AMD EPYC
> 7702 64-Core Processor (NUMA system with 2 sockets). Hyperthreading is
> turned off.
> I'm testing the maximum receive throughput I can get from a single port
> using the testpmd utility (shipped with DPDK). My generator produces
> random UDP packets with zero payload length.
>
> I get the maximum performance using 8-12 lcores (120-125 Mpps overall on
> the receive path of a single port):
>
> numactl -N 1 -m 1 /opt/dpdk-21.11/build/app/dpdk-testpmd -l 64-127 -n 4
> -a 0000:c1:00.0 -- --stats-period 1 --nb-cores=12 --rxq=12 --txq=12
> --rxd=512
>
> With more than 12 lcores, overall receive performance drops. With 16-32
> lcores I get 100-110 Mpps, and there is a significant drop at 33 lcores,
> to 84 Mpps. With 63 lcores I get only 35 Mpps overall receive
> performance.
>
> Are there any limitations on the total number of receive queues (total
> lcores) that can handle a single port on a given NIC?
>
> Thanks,
> Dmitriy Stepanov
>