[dpdk-users] Issue with OpenStack SR-IOV performance when using poll-mode DPDK ixgbevf driver on DPDK 2.2

Ewan Stephens Ewan.Stephens at metaswitch.com
Wed Jul 20 13:15:24 CEST 2016


Hi There

I am experiencing issues with an application which utilizes the DPDK.

I'm trying to upgrade the version of the DPDK used by the application. Unfortunately, this is causing the application to perform worse than on previous versions when running on a KVM/OpenStack virtual machine.

Test Setup (Host):
Chassis - Dell R730
CPU - Haswell (Intel Xeon E5-2690 v3 @2.60 GHz)
NIC - 10GbE Intel Niantic NIC (Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01))
NIC driver - ixgbe
OS - Mirantis OpenStack 7.0 (OpenStack Kilo) on Ubuntu 14.04
Kernel - 3.13.0-73
Hypervisor - KVM

Test setup (guest):
Number of cores - 8
Networking - SR-IOV
NIC driver - ixgbevf
OS - CentOS 6.6
Kernel - 2.6.32-504

Tests undertaken
Three tests were run, with DPDK 1.6, 2.0 and 2.2; all other parameters were held constant. On DPDK 1.6 and 2.0 the application processed at least 17 million packets/s (we did not test above 17 million/s, but suspect it would have maxed out between 20 and 25 million/s). On DPDK 2.2 the test failed at 13 million packets/s. The pass criterion was that fewer than 1 in 100,000 packets were dropped.
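
For clarity, the pass criterion can be written as a small check. This is only a sketch: the function and parameter names are made up for illustration, and it assumes "dropped" means packets sent by the traffic generator minus packets seen coming back out of the application.

```python
def passes_drop_criterion(sent: int, received: int, one_in: int = 100_000) -> bool:
    """Return True if fewer than 1 in `one_in` packets were dropped.

    `sent` and `received` are hypothetical counters taken from the traffic
    generator; they are not names used by DPDK or by the application.
    """
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    dropped = sent - received
    # dropped / sent < 1 / one_in, kept in integer arithmetic to avoid rounding
    return dropped * one_in < sent
```

With 1,000,000 packets sent, 9 drops pass and 10 drops fail, because the criterion is strictly "fewer than" 1 in 100,000.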

Notes:

* The packets sent into the application were roughly 100 B each.

* We do not see a similar performance regression between DPDK 1.6 and 2.2 when testing non-virtualised (i.e. with the application running directly on a Dell R730 server).
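
To put the packet rates above in terms of link bandwidth, here is a rough sketch of the arithmetic. It assumes the 100 B figure is the full Ethernet frame size and adds the standard 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap). The number of ports or VFs used in the test is not stated here, so the helper names and the per-port view are illustrative only.

```python
# Per-frame overhead on the wire: preamble (7) + SFD (1) + inter-frame gap (12)
ETH_WIRE_OVERHEAD_BYTES = 20


def wire_bits_per_second(pps: int, frame_bytes: int) -> int:
    """Bandwidth consumed on the wire by `pps` frames of `frame_bytes` each."""
    return pps * (frame_bytes + ETH_WIRE_OVERHEAD_BYTES) * 8


def max_pps_per_port(link_bps: int, frame_bytes: int) -> float:
    """Theoretical maximum frame rate of one port at the given frame size."""
    return link_bps / ((frame_bytes + ETH_WIRE_OVERHEAD_BYTES) * 8)


if __name__ == "__main__":
    for rate in (13_000_000, 17_000_000):
        gbps = wire_bits_per_second(rate, 100) / 1e9
        print(f"{rate} pps -> {gbps:.2f} Gbit/s on the wire")
    print(f"one 10GbE port: {max_pps_per_port(10_000_000_000, 100):.0f} pps max")
```

Under these assumptions, 17 million packets/s is about 16.3 Gbit/s on the wire and 13 million packets/s is about 12.5 Gbit/s, while a single 10GbE port tops out at roughly 10.4 million packets/s for 100 B frames. Both rates would therefore have to be spread over more than one port.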

Conclusion
Since all other parameters in the tests were held constant, I conclude that a change in DPDK between 2.0 and 2.2 is causing the performance regression. This could be in the ixgbevf poll-mode driver provided with DPDK, or elsewhere in the DPDK code.

Work so far to debug issue
I've tried various things to debug the issue but with no success so far:

* While running the tests, we used the Linux utility "perf" on the guest to compare the proportion of CPU cycles consumed by different functions on 1.6 and 2.2. This didn't show any significant differences between the two versions.

* Looked at the diffs between the ixgbevf driver code in 1.6 and 2.2. Nothing looked obviously suspicious.

* Attempted to run the DPDK sample packet forwarder application on both 1.6 and 2.2 on an OpenStack VM. The results were unreliable: the number of packets dropped varied from test to test on both versions of DPDK.
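
One way to tell whether noisy results like those from the sample forwarder still say anything is to repeat each test several times and compare the spread. The sketch below is a crude heuristic, not a proper statistical test; the function names and the factor k are assumptions for illustration, and the inputs would be the dropped-packet counts from repeated runs on each DPDK version.

```python
from statistics import mean, stdev


def summarize(drops_per_run):
    """Return (mean, sample standard deviation) of dropped-packet counts."""
    return mean(drops_per_run), stdev(drops_per_run)


def intervals_overlap(runs_a, runs_b, k=2.0):
    """True if the mean +/- k*stdev intervals of the two run sets overlap.

    Overlapping intervals suggest the run-to-run noise is too large to
    attribute a difference to the DPDK version; this is only a rough guide.
    """
    mean_a, sd_a = summarize(runs_a)
    mean_b, sd_b = summarize(runs_b)
    return (mean_a - k * sd_a) <= (mean_b + k * sd_b) and \
           (mean_b - k * sd_b) <= (mean_a + k * sd_a)
```

Each list needs at least two runs, since the sample standard deviation is undefined for a single value. If the intervals for the two versions overlap, more runs or a more controlled setup are needed before drawing a conclusion.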

Questions

* Are there any known issues introduced between DPDK 2.0 and DPDK 2.2 which might cause this type of performance issue?

* Has anyone else experienced a similar issue?

* What further steps could we take to debug the issue?

  o Note that I'm reluctant to try upgrading to DPDK 16.04 unless there is a good reason to believe this will fix the issue, as integrating new versions of DPDK with the application is a time-consuming process.

I'd really appreciate some help on this as I'm pretty stumped right now.

Thanks for your help
Ewan


