[dpdk-users] rte_flow / hw-offloading is degrading performance when testing @ 100G
Arvind Narayanan
webguru2688 at gmail.com
Fri Mar 1 03:57:04 CET 2019
On Thu, Feb 28, 2019, 8:23 PM Cliff Burdick <shaklee3 at gmail.com> wrote:
> What size packets are you using? I've only steered to 2 rx queues by IP
> dst match, and was able to hit 100Gbps. That's with a 4KB jumboframe.
>
64 bytes. Agreed this is small, but what seems interesting is that l3fwd
can handle 64B at line rate while rte_flow suffers (a lot) - suggesting
the offloading itself is expensive?!
I'm doing something similar, steering to different queues based on
dst_ip. However, my tests install around 80 rules, each steering to one
of 20 rx_queues, with a one-to-one rx_queue-to-core_id mapping. A sketch
of how each rule is created is below.
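For reference, each rule is created roughly like this (a minimal sketch,
not my exact code; the helper name, address, and queue index are
placeholders):

#include <stdint.h>
#include <rte_byteorder.h>
#include <rte_flow.h>

/* Steer IPv4 packets with a given dst address to one rx queue. */
static struct rte_flow *
install_dst_ip_rule(uint16_t port_id, uint32_t dst_ip, uint16_t queue_id)
{
        struct rte_flow_attr attr = { .ingress = 1 };
        struct rte_flow_item_ipv4 ipv4_spec = {
                .hdr.dst_addr = rte_cpu_to_be_32(dst_ip),
        };
        struct rte_flow_item_ipv4 ipv4_mask = {
                .hdr.dst_addr = RTE_BE32(0xffffffff), /* full-address match */
        };
        struct rte_flow_item pattern[] = {
                { .type = RTE_FLOW_ITEM_TYPE_ETH },
                { .type = RTE_FLOW_ITEM_TYPE_IPV4,
                  .spec = &ipv4_spec, .mask = &ipv4_mask },
                { .type = RTE_FLOW_ITEM_TYPE_END },
        };
        struct rte_flow_action_queue queue = { .index = queue_id };
        struct rte_flow_action actions[] = {
                { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
                { .type = RTE_FLOW_ACTION_TYPE_END },
        };
        struct rte_flow_error error;

        return rte_flow_create(port_id, &attr, pattern, actions, &error);
}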
Arvind
> On Thu, Feb 28, 2019, 17:42 Arvind Narayanan <webguru2688 at gmail.com>
> wrote:
>
>> Hi,
>>
>> I am using DPDK 18.11 on Ubuntu 18.04 with a Mellanox ConnectX-5 100G
>> EN NIC (MLNX_OFED_LINUX-4.5-1.0.1.0-ubuntu18.04-x86_64).
>> Packet generator: t-rex 2.49 running on another machine.
>>
>> I am able to achieve 100G line rate with the l3fwd application (frame
>> size 64B) using the parameters suggested in Mellanox's performance
>> report.
>> (
>> https://fast.dpdk.org/doc/perf/DPDK_18_11_Mellanox_NIC_performance_report.pdf
>> )
>>
>> However, as soon as I install rte_flow rules to steer packets to
>> different queues and/or use rte_flow's mark action, the throughput
>> drops to ~41G. I also modified DPDK's flow_filtering example
>> application and get the same reduced throughput of around 41G out of
>> 100G; without rte_flow it reaches 100G.
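>>
>> The mark variant only changes the action list: a MARK action goes
>> ahead of QUEUE (again a sketch; the id is arbitrary and queue_id is
>> the target rx queue):
>>
>> struct rte_flow_action_queue queue = { .index = queue_id };
>> struct rte_flow_action_mark mark = { .id = 0xbeef }; /* arbitrary id */
>> struct rte_flow_action actions[] = {
>>         { .type = RTE_FLOW_ACTION_TYPE_MARK,  .conf = &mark },
>>         { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
>>         { .type = RTE_FLOW_ACTION_TYPE_END },
>> };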
>>
>> I didn't change any OS/kernel parameters between testing l3fwd and the
>> application that uses rte_flow. I also ensured the application is
>> NUMA-aware and used 20 cores to handle the 100G traffic.
>>
>> Upon further investigation (using Mellanox NIC counters), I found the
>> drop in throughput is due to mbuf allocation errors.
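>>
>> The failures can also be confirmed from DPDK's generic port stats via
>> the rx_nombuf counter (a minimal sketch, assuming port_id is the
>> already-initialized port):
>>
>> #include <stdio.h>
>> #include <inttypes.h>
>> #include <rte_ethdev.h>
>>
>> static void check_rx_nombuf(uint16_t port_id)
>> {
>>         struct rte_eth_stats stats;
>>
>>         /* rx_nombuf counts Rx mbuf allocation failures */
>>         if (rte_eth_stats_get(port_id, &stats) == 0)
>>                 printf("port %u: rx_nombuf=%" PRIu64 "\n",
>>                        port_id, stats.rx_nombuf);
>> }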
>>
>> Is such performance degradation normal when performing hw-acceleration
>> using rte_flow?
>> Has anyone tested throughput performance using rte_flow @ 100G?
>>
>> It's surprising to see hardware offloading degrade performance, unless
>> I am doing something wrong.
>>
>> Thanks,
>> Arvind
>>
>