[dpdk-users] HOW performance to run DPDK at ARM64 arch?

Pierre Laurent pierre.laurent at emutex.com
Thu Dec 27 17:41:57 CET 2018


Regarding your question 2, the Rx+Tx numbers you get look suspiciously like you are trying to push full-duplex traffic through a PCIe x4 slot.

The real bandwidth needed on the PCIe bus by an interface is approximately   (pkt size + 48) * pps.

The 48 bytes is an approximate per-packet overhead for NIC descriptors and PCIe transaction overheads. This is an undocumented heuristic.

Assuming you are using the default DPDK options, the Ethernet FCS does not consume PCIe bandwidth (it is stripped by the NIC on Rx and generated by the NIC on Tx). The same goes for the 20 bytes of Ethernet preamble and inter-frame gap.

If I assume you are using 60-byte packets:   (60 + 48) * (14 + 6) Mpps * 8 ≈ 17 Gbps, which is more or less the bandwidth of a bidirectional x4 interface.
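That arithmetic can be checked in a few lines. This is only a back-of-the-envelope sketch: the 48-byte per-packet figure is the undocumented heuristic mentioned above, not an official number.

```python
# Estimate PCIe bandwidth consumed, using the (pkt size + 48) * pps heuristic.
PKT_SIZE = 60          # bytes crossing PCIe (64-byte frame minus 4-byte FCS)
OVERHEAD = 48          # heuristic per-packet descriptor/PCIe overhead, in bytes
PPS = (14 + 6) * 10**6 # combined Rx + Tx packet rate, packets per second

gbps = (PKT_SIZE + OVERHEAD) * PPS * 8 / 10**9
print(f"required PCIe bandwidth: {gbps:.2f} Gbps")  # about 17 Gbps
```

17.28 Gbps is right at the practical limit of a bidirectional PCIe Gen2 x4 link, which is why Rx and Tx appear to steal bandwidth from each other.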

Tools like "lspci" and "dmidecode" will help you to investigate the real capabilities of the PCIe slots where your 82599 cards are plugged in.

The output of dmidecode looks like the following example; x2, x4, x8 and x16 indicate the number of lanes an interface can use. The more lanes, the faster.

System Slot Information
    Designation: System Slot 1
    Type: x8 PCI Express
    Current Usage: Available
    Length: Long
    ID: 1
        3.3 V is provided

To run an 82599 at full bidirectional rate, you need at least a x8 interface (1 port) or a x16 interface (2 ports).
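A rough sanity check of those lane counts, assuming the 82599's PCIe 2.0 interface (5 GT/s per lane, 8b/10b encoding, so roughly 4 Gbps usable per lane per direction) and the 48-byte heuristic from above:

```python
# Compare PCIe Gen2 slot capacity against a 10GbE port's worst case
# (64-byte frames at line rate). Per-lane figure assumes PCIe 2.0 with
# 8b/10b encoding; TLP framing overhead is absorbed by the 48-byte heuristic.
LANE_GBPS = 5 * 8 / 10          # 4 Gbps effective per lane, per direction
PPS_64B = 14.88 * 10**6         # 10GbE line rate with 64-byte frames

need = (60 + 48) * PPS_64B * 8 / 10**9   # Gbps per port, per direction
for lanes in (4, 8):
    print(f"x{lanes}: {lanes * LANE_GBPS:.0f} Gbps vs {need:.2f} Gbps needed per port")
```

A x4 slot (~16 Gbps per direction before protocol overhead) is marginal for even one port at the small-packet worst case, while x8 leaves comfortable headroom for one port but not two.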



On 27/12/2018 09:24, 金塔尖之鑫 wrote:

Recently I have been testing DPDK 18.08 on my arm64 machine, with DPDK-pktgen 3.5.2,
but the performance is very low for bidirectional traffic against an x86 machine.

here is my data:
hardware Conditions:
        arm64:    CPU - 64 cores, CPUfreq: 1.5GHz
                       MEM - 64 GiB
                       NIC - 82599ES dual port
        x86:        CPU - 4 cores, CPUfreq: 3.2GHz
                       MEM - 4GiB
                       NIC - 82599ES dual port
software Conditions:
         system kernel:
                  arm64: linux-4.4.58
                  x86: ubuntu16.04-4.4.0-generic
         DPDK 18.08, DPDK-pktgen 3.5.2

       |-------|-------|       bi-directional       |-------|-----|
       | arm64 | port0 |  <---------------------->  | port0 | x86 |
       |-------|-------|                            |-------|-----|

                            arm64                    x86
Pkts/s  (Rx/Tx)      10.2 / 6.0 Mpps          6.0 / 14.8 Mpps
MBits/s (Rx/Tx)      7000 / 3300 MBits/s      3300 / 9989 MBits/s

1. Why is DPDK performance so much worse on the arm64 architecture than on x86?
2. In the table above, the Tx direction does not reach full rate. Why do Rx and Tx affect each other?
