[dpdk-dev] [PATCH v6 0/3] Support TCP/IPv4 GRO in DPDK

Jiayu Hu jiayu.hu at intel.com
Mon Jun 26 03:35:07 CEST 2017


Hi Jianfeng,

On Mon, Jun 26, 2017 at 12:03:33AM +0800, Tan, Jianfeng wrote:
> 
> 
> On 6/23/2017 10:43 PM, Jiayu Hu wrote:
> > Generic Receive Offload (GRO) is a widely used SW-based offloading
> > technique to reduce per-packet processing overhead. It gains performance
> > by reassembling small packets into large ones. Therefore, we propose to
> > support GRO in DPDK.
> > 
> > To enable more flexibility to applications, DPDK GRO is implemented as
> > a user library. Applications explicitly use the GRO library to merge
> > small packets into large ones. DPDK GRO provides two reassembly modes:
> > lightweight mode and heavyweight mode. If applications want to merge
> > packets in a simple way, they can select the lightweight mode API. If
> > applications need more fine-grained control, they can select the
> > heavyweight mode API.
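> >
> > As a rough illustration of the lightweight mode (a sketch only: the
> > rte_gro_param fields and the exact signature below are assumptions for
> > illustration, not necessarily what this patchset defines), an application
> > could merge a received burst before forwarding it:
> >
> > 	#include <rte_ethdev.h>
> > 	#include <rte_gro.h>
> >
> > 	/* receive a burst, merge TCP/IPv4 packets, then transmit */
> > 	static void
> > 	gro_fwd_burst(uint16_t port_id, uint16_t queue_id)
> > 	{
> > 		struct rte_mbuf *pkts[32];
> > 		struct rte_gro_param param = {
> > 			.gro_types = RTE_GRO_TCP_IPV4,	/* assumed flag name */
> > 			.max_flow_num = 4,		/* assumed field names */
> > 			.max_item_per_flow = 32,
> > 		};
> > 		uint16_t nb = rte_eth_rx_burst(port_id, queue_id, pkts, 32);
> >
> > 		/* lightweight mode: merge the burst in place and get the
> > 		 * new, possibly smaller, burst size back */
> > 		nb = rte_gro_reassemble_burst(pkts, nb, &param);
> > 		rte_eth_tx_burst(port_id, queue_id, pkts, nb);
> > 	}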
> > 
> > This patchset is to support TCP/IPv4 GRO in DPDK. The first patch is to
> > provide a GRO API framework. The second patch is to support TCP/IPv4 GRO.
> > The last patch is to enable TCP/IPv4 GRO in testpmd.
> > 
> > We perform many iperf tests to see the performance gains from DPDK GRO.
> > 
> > The test environment is:
> > a. two 25Gbps physical ports (p0 and p1) are linked together. Assign p0
> > 	to one networking namespace and assign p1 to DPDK;
> > b. enable TSO for p0. Run iperf client on p0;
> > c. launch testpmd with p1 and a vhost-user port, and run it in csum
> > 	forwarding mode. Select TCP HW checksum calculation for the
> > 	vhost-user port in the csum forwarding engine. For better
> > 	performance, we select IPv4 and TCP HW checksum calculation for p1
> > 	as well;
> > d. launch a VM with one CPU core and a virtio-net port. The VM OS is
> > 	Ubuntu 16.04, whose virtio-net driver supports GRO. Enable RX csum
> > 	offloading and mrg_rxbuf for the VM. The iperf server runs in the VM;
> > e. to run iperf tests, we need to prevent the csum forwarding engine
> > 	from forcibly rewriting packet MAC addresses. So in our tests, we
> > 	comment that code out (line 701 ~ line 704 in csumonly.c); the code
> > 	in question is sketched below.
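> >
> > For reference, the commented-out code in csumonly.c is roughly the two
> > ether_addr_copy() calls that rewrite the destination and source MAC of
> > every forwarded packet (quoted from memory; the exact lines may differ
> > between DPDK versions):
> >
> > 	ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
> > 			&eth_hdr->d_addr);
> > 	ether_addr_copy(&ports[fs->tx_port].eth_addr,
> > 			&eth_hdr->s_addr);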
> > 
> > In each test, we run iperf with the following three configurations:
> > 	- single flow and single TCP stream
> > 	- multiple flows and single TCP stream
> > 	- single flow and parallel TCP streams
> 
> To me, flow == TCP stream; so could you explain what 'flow' means?

Sorry, I used inappropriate terms. 'flow' means a TCP connection here, and
'multiple TCP streams' means parallel iperf client threads.
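
For example, the "single flow and parallel TCP streams" case corresponds to
one iperf client run along the lines of

	iperf -c <server_ip> -P 4 -t 60

where -P spawns the parallel client threads, while "multiple flows" means
launching several independent iperf clients, each opening its own TCP
connection.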

Thanks,
Jiayu

> 
> > 
> > We run the above iperf tests in three scenarios:
> > 	s1: disabling kernel GRO and enabling DPDK GRO
> > 	s2: disabling kernel GRO and disabling DPDK GRO
> > 	s3: enabling kernel GRO and disabling DPDK GRO
> > Comparing the throughput of s1 with s2 shows the performance gain from
> > DPDK GRO; comparing the throughput of s1 with s3 shows how DPDK GRO
> > performs relative to kernel GRO.
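> >
> > For the record, kernel GRO on the guest virtio-net interface is toggled
> > with ethtool (the interface name is just an example):
> >
> > 	ethtool -K eth0 gro off    # s1 and s2
> > 	ethtool -K eth0 gro on     # s3
> >
> > while DPDK GRO is switched on or off through the new testpmd command
> > added in the third patch.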
> > 
> > Test results:
> > 	- DPDK GRO throughput is almost 2 times the throughput with neither
> > 		DPDK GRO nor kernel GRO;
> > 	- DPDK GRO throughput is almost 1.2 times the throughput of
> > 		kernel GRO.
> > 
> > Change log
> > ==========
> > v6:
> > - avoid checksum validation and calculation
> > - enable to process IP fragmented packets
> > - add a command in testpmd
> > - update documents
> > - modify rte_gro_timeout_flush and rte_gro_reassemble_burst
> > - rename variables
> > v5:
> > - fix some bugs
> > - fix coding style issues
> > v4:
> > - implement DPDK GRO as a library used directly by applications
> > - introduce lightweight and heavyweight working modes to enable
> > 	fine-grained controls to applications
> > - replace cuckoo hash tables with simpler table structure
> > v3:
> > - fix compilation issues.
> > v2:
> > - provide generic reassembly function;
> > - implement GRO as a device ability:
> > add APIs for devices to support GRO;
> > add APIs for applications to enable/disable GRO;
> > - update testpmd example.
> > 
> > Jiayu Hu (3):
> >    lib: add Generic Receive Offload API framework
> >    lib/gro: add TCP/IPv4 GRO support
> >    app/testpmd: enable TCP/IPv4 GRO
> > 
> >   app/test-pmd/cmdline.c                      | 125 +++++++++
> >   app/test-pmd/config.c                       |  37 +++
> >   app/test-pmd/csumonly.c                     |   5 +
> >   app/test-pmd/testpmd.c                      |   3 +
> >   app/test-pmd/testpmd.h                      |  11 +
> >   config/common_base                          |   5 +
> >   doc/guides/rel_notes/release_17_08.rst      |   7 +
> >   doc/guides/testpmd_app_ug/testpmd_funcs.rst |  34 +++
> >   lib/Makefile                                |   2 +
> >   lib/librte_gro/Makefile                     |  51 ++++
> >   lib/librte_gro/rte_gro.c                    | 221 ++++++++++++++++
> >   lib/librte_gro/rte_gro.h                    | 195 ++++++++++++++
> >   lib/librte_gro/rte_gro_tcp.c                | 393 ++++++++++++++++++++++++++++
> >   lib/librte_gro/rte_gro_tcp.h                | 188 +++++++++++++
> >   lib/librte_gro/rte_gro_version.map          |  12 +
> >   mk/rte.app.mk                               |   1 +
> >   16 files changed, 1290 insertions(+)
> >   create mode 100644 lib/librte_gro/Makefile
> >   create mode 100644 lib/librte_gro/rte_gro.c
> >   create mode 100644 lib/librte_gro/rte_gro.h
> >   create mode 100644 lib/librte_gro/rte_gro_tcp.c
> >   create mode 100644 lib/librte_gro/rte_gro_tcp.h
> >   create mode 100644 lib/librte_gro/rte_gro_version.map
> > 

