[dpdk-dev] DPDK L2fwd benchmark with 64byte packets

Chris Pappas chrispappas12 at gmail.com
Fri Jan 24 17:30:19 CET 2014


Hi,

We are benchmarking DPDK l2fwd performance using DPDK Pktgen (both up to
date). We have connected two server machines back-to-back; each machine is
a dual-socket server with 6 dual-port 10G NICs (12 ports in total, 120 Gbps
aggregate). Four of the NICs (8 ports) are attached to socket 0 and the
other two (4 ports) are attached to socket 1. With 1500-byte packets we
saturate line rate, but with 64-byte packets we do not.

By running l2fwd (./l2fwd -c 0xff0f -n 4 -- -p 0xfff) we get the following
performance as reported by Pktgen:

Rx/Tx per port:
P0-P5:   7386/9808  7386/9807  7413/9837  7413/9827  7397/9816  7397/9822
P6-P11:  7400/9823  7400/9823  7394/9820  7394/9807  7372/9768  7372/9788

L2fwd reports 0 dropped packets in total.
Another observation is that at 64 bytes Pktgen itself does not quite
saturate the line rate on Tx, whereas with 1500-byte packets we observe
exactly 10 Gbps Tx.
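
For reference, here is the back-of-envelope we use for 64-byte line rate (a
quick sketch; it just counts the standard 8-byte preamble and 12-byte
inter-frame gap on top of each frame):

#include <stdio.h>

int main(void)
{
    /* 64-byte frame + 8-byte preamble + 12-byte inter-frame gap = 84 bytes on the wire */
    const double bits_per_frame = (64 + 8 + 12) * 8.0;  /* 672 bits */
    const double line_rate_bps  = 10e9;                 /* 10 Gbit/s per port */
    double pps_per_port = line_rate_bps / bits_per_frame;

    printf("per port : %.2f Mpps\n", pps_per_port / 1e6);         /* ~14.88 Mpps */
    printf("12 ports : %.2f Mpps\n", 12.0 * pps_per_port / 1e6);  /* ~178.6 Mpps */
    return 0;
}

So full 64-byte line rate on all 12 ports would be roughly 178.6 Mpps in
total.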

* The way the coremask (-c) works is quite clear to us (in our case the 4
least significant bits are cores on socket 0, the next 4 bits are cores on
socket 1, then socket 0 and socket 1 again). However, the port mask only
defines which NICs are enabled, and we would like to know how we can ensure
that the cores assigned to the NICs are on the same socket as the
corresponding NICs, or whether this is done automatically.

The command we use to run l2fwd is the following:
./l2fwd -c 0xff0f -n 4 -- -p 0xfff

* The next observation is that if we run l2fwd again with a different
coremask that enables all of our cores (./l2fwd -c 0xffff -n 4 -- -p 0xfff),
performance drops significantly; the results are the following:

Rx/Tx per port:
P0-P5:   7380/9807  7380/9806  7422/9850  7423/9789  2467/9585  2467/9624
P6-P11:  1399/9809  1399/9806  7391/9816  7392/9802  7370/9789  7370/9789

We observe that ports P4-P7 have a very low throughput, and they correspond
to the additional cores we enabled in the coremask. This result seems odd
and makes the assignment of cores to NICs look like the logical explanation.
Moreover, l2fwd reports many dropped packets only for these 4 ports.
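
To see on which ports the drops accumulate, we are planning to dump the
per-port counters periodically, along these lines (again only a sketch,
using rte_eth_stats_get()):

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <rte_ethdev.h>

/* Print basic per-port counters to see where packets are being dropped.
 * Meant to be called periodically from the main lcore while traffic runs. */
static void
dump_port_stats(uint8_t portid)
{
    struct rte_eth_stats stats;

    rte_eth_stats_get(portid, &stats);
    printf("port %u: rx=%" PRIu64 " tx=%" PRIu64
           " rx_errors=%" PRIu64 " rx_nombuf=%" PRIu64 "\n",
           (unsigned)portid, stats.ipackets, stats.opackets,
           stats.ierrors, stats.rx_nombuf);
}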

We would like to know whether there is an obvious mistake in our
configuration, or whether there are steps we can take to debug this. 6WIND
reports a platform limit of 160 Mpps, but we are below that with a similar
platform. Is PCIe the bottleneck?
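
We will also double-check the negotiated PCIe link of each NIC, for example
with

lspci -s 04:00.0 -vv | grep -i lnk

(04:00.0 is just a placeholder for the PCI address of one of our NICs), in
case one of the slots trained at a lower width or speed than expected.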

Thank you in advance for your time.

Best regards,
Chris Pappas

