Mellanox performance degradation with more than 12 lcores
Asaf Penso
asafp at nvidia.com
Fri Feb 18 14:30:22 CET 2022
Hello Dmitry,
Could you please paste the testpmd command line for each experiment?
Also, have you looked into the dpdk.org performance report to see how to tune for the best results?
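For illustration, a tuned invocation along the lines of those reports might look like the sketch below; mprq_en and rxqs_min_mprq are documented mlx5 devargs, but the specific values here are placeholders rather than report-verified recommendations for this card:

  # Hypothetical tuning example - take the exact devargs and values from the
  # NVIDIA/Mellanox DPDK performance report matching this NIC and DPDK 21.11.
  numactl -N 1 -m 1 /opt/dpdk-21.11/build/app/dpdk-testpmd -l 64-127 -n 4 \
      -a 0000:c1:00.0,mprq_en=1,rxqs_min_mprq=8 \
      -- --stats-period 1 --nb-cores=12 --rxq=12 --txq=12 --rxd=512 --burst=64 --mbcache=512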
Regards,
Asaf Penso
________________________________
From: Дмитрий Степанов <stepanov.dmit at gmail.com>
Sent: Friday, February 18, 2022 9:32:59 AM
To: users at dpdk.org <users at dpdk.org>
Subject: Mellanox performance degradation with more than 12 lcores
Hi folks!
I'm using a Mellanox ConnectX-6 Dx EN adapter card (100GbE; dual-port QSFP56; PCIe 4.0/3.0 x16) with DPDK 21.11 on a server with an AMD EPYC 7702 64-core processor (a NUMA system with 2 sockets). Hyperthreading is turned off.
I'm testing the maximum receive throughput I can get on a single port using the testpmd utility (shipped with DPDK). My generator produces random UDP packets with a zero-length payload.
I get the maximum performance with 8-12 lcores (120-125 Mpps overall on the receive path of a single port):
numactl -N 1 -m 1 /opt/dpdk-21.11/build/app/dpdk-testpmd -l 64-127 -n 4 -a 0000:c1:00.0 -- --stats-period 1 --nb-cores=12 --rxq=12 --txq=12 --rxd=512
With more than 12 lcores, the overall receive performance drops. With 16-32 lcores I get 100-110 Mpps, there is a significant drop at 33 lcores (84 Mpps), and with 63 lcores the overall receive rate falls to only 35 Mpps.
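For completeness (the exact command lines for the other runs are not shown above), the higher lcore-count experiments presumably differ only in the core and queue counts, e.g. for the 32-lcore case:

  # Assumed variant for the 32-lcore run - only --nb-cores/--rxq/--txq change.
  numactl -N 1 -m 1 /opt/dpdk-21.11/build/app/dpdk-testpmd -l 64-127 -n 4 -a 0000:c1:00.0 \
      -- --stats-period 1 --nb-cores=32 --rxq=32 --txq=32 --rxd=512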
Are there any limits on the total number of receive queues (i.e. lcores) that can serve a single port on this NIC?
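One way to check the limit the NIC itself advertises (a sketch, not a diagnosis of the slowdown) is to start testpmd in interactive mode and inspect the port info, which is populated from rte_eth_dev_info_get():

  numactl -N 1 -m 1 /opt/dpdk-21.11/build/app/dpdk-testpmd -l 64-65 -n 4 -a 0000:c1:00.0 -- -i
  testpmd> show port info 0
  # The "Max possible RX queues" line shows the upper bound the mlx5 PMD reports;
  # throughput can of course degrade well before that hardware/PMD limit.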
Thanks,
Dmitriy Stepanov