[dpdk-dev] DPDK testpmd forwarding performance degradation

Alexander Belyakov abelyako at gmail.com
Thu Feb 5 15:39:21 CET 2015


On Thu, Jan 29, 2015 at 3:43 PM, Alexander Belyakov <abelyako at gmail.com>
wrote:

>
>
> On Wed, Jan 28, 2015 at 3:24 PM, Alexander Belyakov <abelyako at gmail.com>
> wrote:
>
>>
>>
>> On Tue, Jan 27, 2015 at 7:21 PM, De Lara Guarch, Pablo <
>> pablo.de.lara.guarch at intel.com> wrote:
>>
>>>
>>>
>>> > On Tue, Jan 27, 2015 at 10:51 AM, Alexander Belyakov
>>> > <abelyako at gmail.com> wrote:
>>> >
>>> > Hi Pablo,
>>> >
>>> > On Mon, Jan 26, 2015 at 5:22 PM, De Lara Guarch, Pablo
>>> > <pablo.de.lara.guarch at intel.com> wrote:
>>> >
>>> > Hi Alexander,
>>> >
>>>
>>> > > -----Original Message-----
>>> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alexander Belyakov
>>> > > Sent: Monday, January 26, 2015 10:18 AM
>>> > > To: dev at dpdk.org
>>> > > Subject: [dpdk-dev] DPDK testpmd forwarding performance degradation
>>>
>>> > >
>>> > > Hello,
>>> > >
>>> > > Recently I have found a case of significant performance degradation
>>> > > for our application (built on top of DPDK, of course). Surprisingly,
>>> > > a similar issue is easily reproduced with the default testpmd.
>>> > >
>>> > > To show the case we need a simple IPv4 UDP flood with variable UDP
>>> > > payload size. By "packet length" below I mean: Eth header length
>>> > > (14 bytes) + IPv4 header length (20 bytes) + UDP header length
>>> > > (8 bytes) + UDP payload length (variable) + CRC (4 bytes). Source IP
>>> > > addresses and ports are selected randomly for each packet.
>>> > >
>>> > > I have used DPDK revisions 1.6.0r2 and 1.7.1. Both show the same
>>> > > issue.
>>> > >
>>> > > Follow the "Quick start" guide (http://dpdk.org/doc/quick-start) to
>>> > > build and run testpmd. Enable testpmd forwarding ("start" command).
>>> > >
>>> > > The table below shows the measured forwarding performance depending
>>> > > on packet length:
>>> > >
>>> > > No. -- UDP payload length (bytes) -- Packet length (bytes) --
>>> > > Forwarding performance (Mpps) -- Expected theoretical performance (Mpps)
>>> > >
>>> > > 1. 0 -- 64 -- 14.8 -- 14.88
>>> > > 2. 34 -- 80 -- 12.4 -- 12.5
>>> > > 3. 35 -- 81 -- 6.2 -- 12.38 (!)
>>> > > 4. 40 -- 86 -- 6.6 -- 11.79
>>> > > 5. 49 -- 95 -- 7.6 -- 10.87
>>> > > 6. 50 -- 96 -- 10.7 -- 10.78 (!)
>>> > > 7. 60 -- 106 -- 9.4 -- 9.92
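
For reference, the "Expected theoretical performance" column above matches
the standard 10GbE line-rate calculation, assuming the usual 20 bytes of
per-frame wire overhead (7-byte preamble + 1-byte SFD + 12-byte inter-frame
gap) on top of the packet length defined above:

# Expected 10GbE forwarding rate (Mpps) for a given wire packet length.
# Each frame occupies an extra 20 bytes on the wire (preamble + SFD + IFG).
for len in 64 80 81 86 95 96 106; do
    awk -v l="$len" 'BEGIN { printf "%3d bytes -> %.2f Mpps\n", l, 1e10 / ((l + 20) * 8) / 1e6 }'
done
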
>>> > >
>>> > > At row number 3 we have added 1 byte of UDP payload (compared to the
>>> > > previous row) and got forwarding performance halved! 6.2 Mpps against
>>> > > the 12.38 Mpps expected theoretical maximum for this packet size.
>>> > >
>>> > > That is the issue.
>>> > >
>>> > > Significant performance degradation exists up to 50 bytes of UDP
>>> > > payload (96 bytes packet length), where it jumps back to the
>>> > > theoretical maximum.
>>> > >
>>> > > What is happening between 80 and 96 bytes packet length?
>>> > >
>>> > > This issue is stable and 100% reproducible. At this point I am not
>>> > > sure if it is a DPDK or a NIC issue. These tests have been performed
>>> > > on an Intel(R) Eth Svr Bypass Adapter X520-LR2 (X520LR2BP).
>>> > >
>>> > > Is anyone aware of such strange behavior?
>>>
>>> > I cannot reproduce the issue using two ports on two different 82599EB
>>> > NICs, using 1.7.1 and 1.8.0.
>>> > I always get either the same or a better line rate as I increase the
>>> > packet size.
>>> >
>>> > Thank you for trying to reproduce the issue.
>>> >
>>> > Actually, have you tried using 1.8.0?
>>> >
>>> > I feel 1.8.0 is a little bit immature and might require some
>>> > post-release patching. Even testpmd from this release is not forwarding
>>> > packets properly on my setup. It is up and running without visible
>>> > errors/warnings, TX/RX counters are ticking, but I cannot see any
>>> > packets at the output.
>>>
>>>
>>>
>>> This is strange. Without changing anything, forwarding works perfectly
>>> for me (so RTE_LIBRTE_IXGBE_RX_ALLOW_BULK_ALLOC is enabled).
>>>
>>>
>>>
>>> > Please note, both the 1.6.0r2 and 1.7.1 releases work out-of-the-box
>>> > just fine on the same setup, with the only exception of this mysterious
>>> > performance drop. So it will take some time to figure out what is wrong
>>> > with dpdk-1.8.0. Meanwhile we could focus on the stable dpdk-1.7.1.
>>> >
>>> > Managed to get testpmd from dpdk-1.8.0 to work on my setup.
>>> > Unfortunately I had to disable RTE_LIBRTE_IXGBE_RX_ALLOW_BULK_ALLOC; it
>>> > is new compared to 1.7.1 and somehow breaks testpmd forwarding. By the
>>> > way, simply disabling RTE_LIBRTE_IXGBE_RX_ALLOW_BULK_ALLOC in the
>>> > common_linuxapp config file breaks the build - I had to make a
>>> > quick'n'dirty fix in struct igb_rx_queue as well.
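
For anyone trying to reproduce this, the option lives in the DPDK build
configuration as a CONFIG_-prefixed entry (the exact key name below is my
assumption), so a quick check of a given source tree would be something like:

# Check whether bulk-alloc RX is enabled in the 1.8.0 build configuration;
# expect CONFIG_RTE_LIBRTE_IXGBE_RX_ALLOW_BULK_ALLOC=y when it is on.
grep RX_ALLOW_BULK_ALLOC config/common_linuxapp
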
>>>
>>> >
>>> > Anyway, the issue is still here.
>>> >
>>> > Forwarding 80-byte packets at 12.4 Mpps.
>>> > Forwarding 81-byte packets at 7.2 Mpps.
>>> >
>>> > Any ideas?
>>> >
>>> > As for the X520-LR2 NIC - it is a dual-port bypass adapter with device
>>> > id 155d. I believe it should be treated as an 82599EB except for the
>>> > bypass feature. I set the bypass mode to "normal" in those tests.
>>>
>>>
>>>
>>> I have used an 82599EB first, and now an X520-SR2. Same results.
>>> I assume that the X520-SR2 and X520-LR2 should give similar results
>>> (the only thing that changes is the wavelength; the controller is the
>>> same).
>>>
>>>
>>>
>> It seems I found what was wrong, or at least got a hint.
>>
>> My build server's machine type differs from the test setup. Until now it
>> was OK to build DPDK with -march=native.
>>
>> I found that building dpdk-1.8.0 with an explicitly set core-avx-i (snb,
>> ivb) or bdver2 (amd) machine type almost eliminates the performance drop.
>> The same goes for the RTE_LIBRTE_IXGBE_RX_ALLOW_BULK_ALLOC option issues.
>>
>> It seems DPDK performance and stability depend on the machine type more
>> than I was expecting.
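
A quick way to see what -march=native actually resolves to (and therefore
whether the build server and the test machine end up being tuned for
different instruction sets) is to ask the compiler directly; a small check,
assuming GCC is the compiler on both machines:

# Print the CPU type that -march=native expands to on this machine;
# run it on both the build server and the test server and compare.
gcc -march=native -Q --help=target | grep -- '-march='
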
>>
>> Thank you for your help.
>>
>>
> Mysteries are still here.
>
> While single RX queue forwarding shows almost no degradation now, multiple
> RX queues still do.
>
> Launching
>
> ./testpmd -c7 -n3 -- -i --rxq=2 --txq=2 --nb-cores=2 --nb-ports=2
>
> shows the following result:
>
> 80-byte packets are being forwarded at a 12.46 Mpps rate (which is fine).
> 81-byte packets are being forwarded at a 7.5 Mpps rate (which is weird).
>
> Pablo, could you please check on your setup whether multiple RX queues show
> performance degradation depending on packet size?
>
> Additional information about the packets I'm sending:
> the 80-byte packet is an IPv4 UDP packet (with random source IP/port) and
> 34 payload bytes;
> the 81-byte packet is an IPv4 UDP packet (with random source IP/port) and
> 35 payload bytes.
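
As a quick sanity check on those sizes, the UDP payload for a given wire
length is just the packet length minus the 46 fixed bytes (Eth 14 + IPv4 20
+ UDP 8 + CRC 4), for example:

# UDP payload bytes for a given total packet length (CRC included):
# total - (Eth 14 + IPv4 20 + UDP 8 + CRC 4) = total - 46.
for pkt in 80 81; do echo "$pkt-byte packet -> $((pkt - 46)) bytes of UDP payload"; done
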
>
> Thank you,
> Alexander
>
>
Just FYI.

I have run more tests on multiple hardware configurations. It seems the
issue appears only on AMD-based servers.

I have tried two HP ProLiant DL385p G8 servers, one with an AMD Opteron(TM)
Processor 6272 and one with an AMD Opteron(tm) Processor 6376. I have also
used a dual-port Intel(R) Eth Svr Bypass Adapter X520-LR2 (X520LR2BP) and
two single-port Intel(R) Eth Converged Network Adapter X520-LR1
(E10G41BFLRBLK) adapters.

All configurations based on this hardware show significant forwarding
performance degradation depending on packet size.

Please take a look at the image at the following link to see what I am
talking about. The diagram is based on 1.6.0r2, but I have tried other
versions, including 1.8.0, and the issue remains.

https://drive.google.com/file/d/0B8wI4oKPHe0DaUNKdEdVdnUyRXc/view?usp=sharing

But this is not the end. As soon as I switched to an HP ProLiant DL380p Gen8
with an Intel(R) Xeon(R) CPU E5-2690 on board, everything started to work
perfectly - forwarding performance matches theoretical expectations.

At the moment I have no idea what is wrong with the AMD platforms (or what
I am doing wrong).

If someone needs more details on these experiments or knows how to fix
that, please let me know.

Regards,
Alexander


>
>> Alexander
>>
>>  Pablo
>>>
>>> > Alexander
>>> >
>>> > Pablo
>>> >
>>> > > Regards,
>>> > > Alexander Belyakov

