[dpdk-dev] [PATCH v5 1/3] lib: add Generic Receive Offload API framework

Tan, Jianfeng jianfeng.tan at intel.com
Mon Jun 19 17:43:14 CEST 2017



On 6/18/2017 3:21 PM, Jiayu Hu wrote:
> Generic Receive Offload (GRO) is a widely used SW-based offloading
> technique to reduce per-packet processing overhead. It gains
> performance by reassembling small packets into large ones. This
> patchset is to support GRO in DPDK. To support GRO, this patch
> implements a GRO API framework.
>
> To enable more flexibility to applications, DPDK GRO is implemented as
> a user library. Applications explicitly use the GRO library to merge
> small packets into large ones. DPDK GRO provides two reassembly modes.
> One is called lightweight mode, the other is called heavyweight mode.
> If applications want to merge packets in a simple way, they can use
> lightweight mode. If applications need more fine-grained control,
> they can choose heavyweight mode.
>
> rte_gro_reassemble_burst is the main reassembly API which is used in
> lightweight mode and processes N packets at a time. For applications,
> performing GRO in lightweight mode is simple. They just need to invoke
> rte_gro_reassemble_burst. Applications can get GROed packets as soon as
> rte_gro_reassemble_burst returns.
>
> rte_gro_reassemble is the main reassembly API which is used in
> heavyweight mode and processes one packet at a time. For applications,
> performing GRO in heavyweight mode is relatively complicated. Before
> performing GRO, applications need to create a GRO table by
> rte_gro_tbl_create. Then they can use rte_gro_reassemble to process
> packets one by one. The processed packets are in the GRO table. If
> applications want to get them, applications need to manually flush
> them by flush APIs.

For these two APIs, I suppose they will try their best to reassemble
packets using the supported GRO engines. So we need to call the GRO
engines according to the ptype of each packet, and that dispatch
framework should be implemented in this file.
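
To make the dispatch idea concrete, here is a minimal sketch, assuming a
hypothetical per-type callback table. toy_pkt, gro_reassemble_fn,
TOY_GRO_TCP_IPV4 and the return convention are illustrative names only and
are not part of the patch:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_GRO_TYPE_MAX_NB 64   /* mirrors GRO_TYPE_MAX_NB in the patch */
#define TOY_GRO_TCP_IPV4 0       /* hypothetical GRO type index */

/* Hypothetical stand-in for struct rte_mbuf: only the field we need. */
struct toy_pkt {
	uint8_t gro_type;        /* type index derived from the packet type */
};

/* Hypothetical per-engine callback: >0 merged, 0 stored, <0 error. */
typedef int (*gro_reassemble_fn)(struct toy_pkt *pkt, void *tbl);

static gro_reassemble_fn reassemble_functions[TOY_GRO_TYPE_MAX_NB];

/* Dispatch one packet to the engine registered for its type. */
static int
toy_gro_dispatch(struct toy_pkt *pkt, void *tbls[],
		uint64_t desired_gro_types)
{
	uint8_t type = pkt->gro_type;

	if (type >= TOY_GRO_TYPE_MAX_NB)
		return -1;
	/* Skip types the application did not ask for. */
	if ((desired_gro_types & (UINT64_C(1) << type)) == 0)
		return -1;
	/* Skip types that have no engine registered. */
	if (reassemble_functions[type] == NULL)
		return -1;
	return reassemble_functions[type](pkt, tbls[type]);
}

/* Toy engine used only to exercise the dispatcher: counts packets. */
static int
toy_tcp4_reassemble(struct toy_pkt *pkt, void *tbl)
{
	(void)pkt;
	(*(int *)tbl)++;
	return 0;
}
```

With something like this in rte_gro.c, each GRO engine would only need to
register its callback, and both reassembly APIs could share the dispatcher.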

>
> In DPDK GRO, different GRO types define their own reassembly tables. When
> a GRO table is created, it keeps the reassembly tables of the desired GRO
> types. To process one packet, we first find the corresponding reassembly
> table according to the packet type, then search that reassembly table
> for an existing packet to merge with. If one is found, the two packets are
> chained together. If not, the packet is inserted into the reassembly table.
> If the packet has a wrong checksum, is fragmented, etc., an error occurs
> and the reassembly function stops processing the packet.
>
> Signed-off-by: Jiayu Hu <jiayu.hu at intel.com>
> ---
>   config/common_base       |   5 ++
>   lib/Makefile             |   1 +
>   lib/librte_gro/Makefile  |  50 +++++++++++
>   lib/librte_gro/rte_gro.c | 126 ++++++++++++++++++++++++++++
>   lib/librte_gro/rte_gro.h | 213 +++++++++++++++++++++++++++++++++++++++++++++++
>   mk/rte.app.mk            |   1 +
>   6 files changed, 396 insertions(+)
>   create mode 100644 lib/librte_gro/Makefile
>   create mode 100644 lib/librte_gro/rte_gro.c
>   create mode 100644 lib/librte_gro/rte_gro.h

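If we expose some APIs, we always add a version map file (like
rte_vhost_version.map for vhost) in that directory. The Makefile sets
EXPORT_MAP to rte_gro_version.map, but that file is not in this patch.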

>
> diff --git a/config/common_base b/config/common_base
> index f6aafd1..167f5ef 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -712,6 +712,11 @@ CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
>   CONFIG_RTE_LIBRTE_PMD_VHOST=n
>   
>   #
> +# Compile GRO library
> +#
> +CONFIG_RTE_LIBRTE_GRO=y
> +
> +#
>   #Compile Xen domain0 support
>   #
>   CONFIG_RTE_LIBRTE_XEN_DOM0=n
> diff --git a/lib/Makefile b/lib/Makefile
> index 07e1fd0..e253053 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -106,6 +106,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
>   DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
>   DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
>   DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
> +DIRS-$(CONFIG_RTE_LIBRTE_GRO) += librte_gro
>   
>   ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>   DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> diff --git a/lib/librte_gro/Makefile b/lib/librte_gro/Makefile
> new file mode 100644
> index 0000000..9f4063a
> --- /dev/null
> +++ b/lib/librte_gro/Makefile
> @@ -0,0 +1,50 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_gro.a
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> +
> +EXPORT_MAP := rte_gro_version.map
> +
> +LIBABIVER := 1
> +
> +# source files
> +SRCS-$(CONFIG_RTE_LIBRTE_GRO) += rte_gro.c
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_GRO)-include += rte_gro.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c
> new file mode 100644
> index 0000000..1bc53a2
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro.c
> @@ -0,0 +1,126 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.

The year should be 2017. The same applies to the other files.

> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <rte_malloc.h>
> +#include <rte_mbuf.h>
> +
> +#include "rte_gro.h"
> +
> +static gro_tbl_create_fn tbl_create_functions[GRO_TYPE_MAX_NB];

> +static gro_tbl_destroy_fn tbl_destroy_functions[GRO_TYPE_MAX_NB];
> +
> +struct rte_gro_tbl *rte_gro_tbl_create(uint16_t socket_id,
> +		uint16_t max_flow_num,
> +		uint16_t max_item_per_flow,
> +		uint32_t max_packet_size,
> +		uint64_t max_timeout_cycles,
> +		uint64_t desired_gro_types)
> +{
> +	gro_tbl_create_fn create_tbl_fn;
> +	struct rte_gro_tbl *gro_tbl;
> +	uint64_t gro_type_flag = 0;
> +	uint8_t i;
> +
> +	gro_tbl = rte_zmalloc_socket(__func__,
> +			sizeof(struct rte_gro_tbl),
> +			RTE_CACHE_LINE_SIZE,
> +			socket_id);
> +	gro_tbl->max_packet_size = max_packet_size;
> +	gro_tbl->max_timeout_cycles = max_timeout_cycles;
> +	gro_tbl->desired_gro_types = desired_gro_types;
> +
> +	for (i = 0; i < GRO_TYPE_MAX_NB; i++) {
> +		gro_type_flag = 1 << i;
> +		if (desired_gro_types & gro_type_flag) {
> +			create_tbl_fn = tbl_create_functions[i];
> +			if (create_tbl_fn)
> +				create_tbl_fn(socket_id,
> +						max_flow_num,
> +						max_item_per_flow);
> +			else
> +				gro_tbl->tbls[i] = NULL;
> +		}
> +	}
> +	return gro_tbl;
> +}
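
As far as I can tell, the function above has three issues: the allocation
result is never checked, "1 << i" is an int shift that is undefined once i
reaches the width of int, and the pointer returned by create_tbl_fn is
dropped instead of being stored in tbls[i]. A minimal sketch of one way to
address them follows. It uses calloc as a stand-in for rte_zmalloc_socket so
that it builds on its own, and toy_gro_tbl, toy_gro_tbl_create and
toy_engine_create are illustrative names:

```c
#include <stdint.h>
#include <stdlib.h>

#define GRO_TYPE_MAX_NB 64

typedef void *(*gro_tbl_create_fn)(uint16_t socket_id,
		uint16_t max_flow_num,
		uint16_t max_item_per_flow);

/* Same layout as struct rte_gro_tbl in the patch, renamed for the sketch. */
struct toy_gro_tbl {
	void *tbls[GRO_TYPE_MAX_NB];
	uint64_t desired_gro_types;
	uint64_t max_timeout_cycles;
	uint32_t max_packet_size;
};

static gro_tbl_create_fn tbl_create_functions[GRO_TYPE_MAX_NB];

static struct toy_gro_tbl *
toy_gro_tbl_create(uint16_t socket_id, uint16_t max_flow_num,
		uint16_t max_item_per_flow, uint32_t max_packet_size,
		uint64_t max_timeout_cycles, uint64_t desired_gro_types)
{
	struct toy_gro_tbl *gro_tbl;
	uint8_t i;

	/* calloc stands in for rte_zmalloc_socket; it also zeroes tbls[]. */
	gro_tbl = calloc(1, sizeof(*gro_tbl));
	if (gro_tbl == NULL)
		return NULL;
	gro_tbl->max_packet_size = max_packet_size;
	gro_tbl->max_timeout_cycles = max_timeout_cycles;
	gro_tbl->desired_gro_types = desired_gro_types;

	for (i = 0; i < GRO_TYPE_MAX_NB; i++) {
		/* 64-bit constant so that bits 31..63 can be tested safely. */
		if ((desired_gro_types & (UINT64_C(1) << i)) == 0)
			continue;
		/* Keep the per-type table so that destroy can free it later. */
		if (tbl_create_functions[i] != NULL)
			gro_tbl->tbls[i] = tbl_create_functions[i](socket_id,
					max_flow_num, max_item_per_flow);
	}
	return gro_tbl;
}

/* Toy engine used only to exercise the sketch. */
static int toy_engine_tbl;

static void *
toy_engine_create(uint16_t socket_id, uint16_t max_flow_num,
		uint16_t max_item_per_flow)
{
	(void)socket_id;
	(void)max_flow_num;
	(void)max_item_per_flow;
	return &toy_engine_tbl;
}
```

The same 64-bit shift would be needed in rte_gro_tbl_destroy.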
> +
> +void rte_gro_tbl_destroy(struct rte_gro_tbl *gro_tbl)
> +{
> +	gro_tbl_destroy_fn destroy_tbl_fn;
> +	uint64_t gro_type_flag;
> +	uint8_t i;
> +
> +	if (gro_tbl == NULL)
> +		return;
> +	for (i = 0; i < GRO_TYPE_MAX_NB; i++) {
> +		gro_type_flag = 1 << i;
> +		if (gro_tbl->desired_gro_types & gro_type_flag) {
> +			destroy_tbl_fn = tbl_destroy_functions[i];
> +			if (destroy_tbl_fn)
> +				destroy_tbl_fn(gro_tbl->tbls[i]);
> +			gro_tbl->tbls[i] = NULL;
> +		}
> +	}
> +	rte_free(gro_tbl);
> +}
> +
> +uint16_t
> +rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused,
> +		const uint16_t nb_pkts,
> +		const struct rte_gro_param param __rte_unused)
> +{
> +	return nb_pkts;
> +}
> +
> +int rte_gro_reassemble(struct rte_mbuf *pkt __rte_unused,
> +		struct rte_gro_tbl *gro_tbl __rte_unused)
> +{
> +	return -1;
> +}
> +
> +uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
> +		uint64_t desired_gro_types __rte_unused,
> +		uint16_t flush_num __rte_unused,
> +		struct rte_mbuf **out __rte_unused,
> +		const uint16_t max_nb_out __rte_unused)
> +{
> +	return 0;
> +}
> +
> +uint16_t
> +rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
> +		uint64_t desired_gro_types __rte_unused,
> +		struct rte_mbuf **out __rte_unused,
> +		const uint16_t max_nb_out __rte_unused)
> +{
> +	return 0;
> +}
> diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h
> new file mode 100644
> index 0000000..67bd90d
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro.h
> @@ -0,0 +1,213 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_GRO_H_
> +#define _RTE_GRO_H_
> +
> +/* maximum number of supported GRO types */
> +#define GRO_TYPE_MAX_NB 64
> +#define GRO_TYPE_SUPPORT_NB 0	/**< current supported GRO num */
> +
> +/**
> + * GRO table structure. DPDK GRO uses GRO table to reassemble
> + * packets. In heavyweight mode, applications must create GRO tables
> + * before performing GRO. However, in lightweight mode, applications
> + * don't need to create GRO tables.
> + *
> + * A GRO table object stores many reassembly tables of desired
> + * GRO types.
> + */
> +struct rte_gro_tbl {
> +	/* table addresses of desired GRO types */
> +	void *tbls[GRO_TYPE_MAX_NB];
> +	uint64_t desired_gro_types;	/**< GRO types that want to perform */
> +	/**
> +	 * the maximum time of packets staying in GRO tables, measured in
> +	 * nanosecond.
> +	 */
> +	uint64_t max_timeout_cycles;
> +	/* the maximum length of merged packet, measured in byte */
> +	uint32_t max_packet_size;
> +};
> +
> +/**
> + * In lightweight mode, applications use this structure to pass the
> + * needed parameters to rte_gro_reassemble_burst.
> + */
> +struct rte_gro_param {
> +	uint16_t max_flow_num;	/**< max flow number */
> +	uint16_t max_item_per_flow;	/**< max item number per flow */
> +	/**
> +	 * It indicates the GRO types that applications want to perform,
> +	 * whose value is the result of OR operation on GRO type flags.
> +	 */
> +	uint64_t desired_gro_types;
> +	/* the maximum packet size after been merged */
> +	uint32_t max_packet_size;
> +};
> +
> +typedef void *(*gro_tbl_create_fn)(uint16_t socket_id,
> +		uint16_t max_flow_num,
> +		uint16_t max_item_per_flow);
> +typedef void (*gro_tbl_destroy_fn)(void *tbl);
> +
> +/**
> + * This function creates a GRO table, which is used to merge packets.
> + *
> + * @param socket_id
> + *  socket index where the Ethernet port connects to.
> + * @param max_flow_num
> + *  the maximum flow number in the GRO table
> + * @param max_item_per_flow
> + *  the maximum packet number per flow
> + * @param max_packet_size
> + *  the maximum size of merged packets, which is measured in byte.
> + * @param max_timeout_cycles
> + *  the maximum time that a packet can stay in the GRO table.
> + * @param desired_gro_types
> + *  GRO types that applications want to perform. Its value is the
> + *  result of OR operation on desired GRO type flags.
> + * @return
> + *  On success, return a pointer to the GRO
> + *  table. Otherwise, return NULL.
> + */
> +struct rte_gro_tbl *rte_gro_tbl_create(uint16_t socket_id,
> +		uint16_t max_flow_num,
> +		uint16_t max_item_per_flow,
> +		uint32_t max_packet_size,
> +		uint64_t max_timeout_cycles,
> +		uint64_t desired_gro_types);

Strange, I do not see anywhere that you use this API. What's more, is
it really necessary to expose these two, creating and destroying the
table, as APIs?

> +/**
> + * This function destroys a GRO table.
> + */
> +void rte_gro_tbl_destroy(struct rte_gro_tbl *gro_tbl);

If this is an API to be used by users, please clarify how and when to use it.
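
For example, the documentation could show the expected lifecycle in
heavyweight mode: create once, feed packets, flush, then destroy. The sketch
below only illustrates that call order with toy stand-ins. All toy_* names
are hypothetical, and the table simply stores packets without merging them:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the patch's real types. */
struct toy_pkt {
	uint32_t len;
};

struct toy_gro_tbl {
	struct toy_pkt **items;  /* packets currently held by the table */
	uint16_t nb_items;
	uint16_t max_items;
};

static struct toy_gro_tbl *
toy_tbl_create(uint16_t max_items)
{
	struct toy_gro_tbl *tbl = calloc(1, sizeof(*tbl));

	if (tbl == NULL)
		return NULL;
	tbl->items = calloc(max_items, sizeof(*tbl->items));
	if (tbl->items == NULL) {
		free(tbl);
		return NULL;
	}
	tbl->max_items = max_items;
	return tbl;
}

/* Store one packet; 0 on success, -1 if the table is full. */
static int
toy_reassemble(struct toy_pkt *pkt, struct toy_gro_tbl *tbl)
{
	if (tbl->nb_items == tbl->max_items)
		return -1;
	tbl->items[tbl->nb_items++] = pkt;
	return 0;
}

/* Move up to max_nb_out held packets into out; returns how many moved. */
static uint16_t
toy_flush(struct toy_gro_tbl *tbl, struct toy_pkt **out,
		uint16_t max_nb_out)
{
	uint16_t n = 0;

	while (tbl->nb_items > 0 && n < max_nb_out)
		out[n++] = tbl->items[--tbl->nb_items];
	return n;
}

/* Free the table itself; packets stay owned by the caller. */
static void
toy_tbl_destroy(struct toy_gro_tbl *tbl)
{
	if (tbl == NULL)
		return;
	free(tbl->items);
	free(tbl);
}

/* The call order an application would follow. */
static uint16_t
toy_run(struct toy_pkt *pkts, uint16_t nb_pkts, struct toy_pkt **out,
		uint16_t max_nb_out)
{
	struct toy_gro_tbl *tbl = toy_tbl_create(nb_pkts);
	uint16_t i, nb_out;

	if (tbl == NULL)
		return 0;
	for (i = 0; i < nb_pkts; i++)
		if (toy_reassemble(&pkts[i], tbl) < 0)
			break;
	nb_out = toy_flush(tbl, out, max_nb_out);
	toy_tbl_destroy(tbl);
	return nb_out;
}
```

In particular, the comment should say whether destroy may be called while
packets are still held in the table, and who frees them in that case.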

> +
> +/**
> + * This is the main reassembly API used in lightweight mode, which
> + * merges numbers of packets at a time. After it returns, applications
> + * can get GROed packets immediately. Applications don't need to
> + * flush packets manually. In lightweight mode, applications just need
> + * to tell the reassembly API what rules should be applied when merging
> + * packets. Therefore, applications can perform GRO in a very simple
> + * way.
> + *
> + * To process one packet, we find its corresponding reassembly table
> + * according to the packet type. Then we search the reassembly table
> + * for a packet to merge with. If one is found, chain the two packets
> + * together. If not, insert the input packet into the reassembly table.
> + * Merging two packets means chaining them together. No
> + * memory copy is needed. Before rte_gro_reassemble_burst returns,
> + * header checksums of merged packets are re-calculated.
> + *
> + * @param pkts
> + *  a pointer array which points to the packets to reassemble. After
> + *  GRO, it is also used to keep GROed packets.
> + * @param nb_pkts
> + *  the number of packets to reassemble.
> + * @param param
> + *  Applications use param to tell rte_gro_reassemble_burst what rules
> + *  are demanded.
> + * @return
> + *  the number of packets after GROed.
> + */
> +uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused,
> +		const uint16_t nb_pkts __rte_unused,
> +		const struct rte_gro_param param __rte_unused);
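
The merge-or-insert procedure described in the comment above could look
roughly like the following sketch. It assumes a TCP-like notion of
"neighbor" (same flow, and the new segment starts where the held chain
ends). toy_pkt, toy_item, toy_tbl and toy_merge_or_insert are hypothetical
names, and the real engines may use different keys and checks:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ITEMS 8

/* Hypothetical packet: a flow id plus TCP-like sequence number and length. */
struct toy_pkt {
	uint32_t flow_id;
	uint32_t seq;
	uint32_t len;          /* payload bytes in this segment */
	struct toy_pkt *next;  /* chain link, analogous to mbuf chaining */
};

/* One reassembly entry: head and tail of a chain and its total payload. */
struct toy_item {
	struct toy_pkt *head;
	struct toy_pkt *tail;
	uint32_t total_len;
};

struct toy_tbl {
	struct toy_item items[TOY_MAX_ITEMS];
	uint16_t nb_items;
	uint32_t max_packet_size;
};

/* Returns 1 if merged, 0 if inserted as a new entry, -1 if the table is full. */
static int
toy_merge_or_insert(struct toy_tbl *tbl, struct toy_pkt *pkt)
{
	uint16_t i;

	pkt->next = NULL;
	for (i = 0; i < tbl->nb_items; i++) {
		struct toy_item *it = &tbl->items[i];

		/* Same flow, directly follows the tail, and still fits. */
		if (it->head->flow_id == pkt->flow_id &&
		    it->tail->seq + it->tail->len == pkt->seq &&
		    it->total_len + pkt->len <= tbl->max_packet_size) {
			it->tail->next = pkt;  /* chain, no payload copy */
			it->tail = pkt;
			it->total_len += pkt->len;
			return 1;
		}
	}
	if (tbl->nb_items == TOY_MAX_ITEMS)
		return -1;
	tbl->items[tbl->nb_items].head = pkt;
	tbl->items[tbl->nb_items].tail = pkt;
	tbl->items[tbl->nb_items].total_len = pkt->len;
	tbl->nb_items++;
	return 0;
}
```

The sketch leaves out checksum validation and header updates, which the
comment says are handled before rte_gro_reassemble_burst returns.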
> +
> +/**
> + * This is the main reassembly API used in heavyweight mode, which
> + * merges one packet at a time. The procedure of merging one packet is
> + * similar to rte_gro_reassemble_burst. But rte_gro_reassemble will
> + * not update header checksums. Header checksums of merged packets are
> + * re-calculated in flush APIs.
> + *
> + * If an error happens, such as a packet with a wrong checksum or an
> + * unsupported GRO type, the input packet won't be stored in the GRO
> + * table. If no error happens, the packet is either merged with an
> + * existing packet, or inserted into its corresponding reassembly table.
> + * Applications can get packets in the GRO table by flush APIs.
> + *
> + * @param pkt
> + *  packet to reassemble.
> + * @param gro_tbl
> + *  a pointer to a GRO table.
> + * @return
> + *  if the packet is merged successfully, return a positive value. If it
> + *  is not merged, return zero. If an error happens, return a negative value.
> + */
> +int rte_gro_reassemble(struct rte_mbuf *pkt __rte_unused,
> +		struct rte_gro_tbl *gro_tbl __rte_unused);
> +
> +/**
> + * This function flushes packets of desired GRO types from their
> + * corresponding reassembly tables.
> + *
> + * @param gro_tbl
> + *  a pointer points to a GRO table object.
> + * @param desired_gro_types
> + *  GRO types whose packets will be flushed.
> + * @param flush_num
> + *  the number of packets that need flushing.
> + * @param out
> + *  a pointer array that is used to keep flushed packets.
> + * @param max_nb_out
> + *  the size of out.
> + * @return
> + *  the number of flushed packets. If no packets are flushed, return 0.
> + */
> +uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
> +		uint64_t desired_gro_types __rte_unused,
> +		uint16_t flush_num __rte_unused,
> +		struct rte_mbuf **out __rte_unused,
> +		const uint16_t max_nb_out __rte_unused);

Still, I don't see anywhere that calls this function. How can we make
sure it is correct then?

> +
> +/**
> + * This function flushes the timeout packets from reassembly tables of
> + * desired GRO types.
> + *
> + * @param gro_tbl
> + *  a pointer points to a GRO table object.
> + * @param desired_gro_types
> + * rte_gro_timeout_flush only processes packets which belong to the
> + * GRO types specified by desired_gro_types.
> + * @param out
> + *  a pointer array that is used to keep flushed packets.
> + * @param max_nb_out
> + *  the size of out.
> + * @return
> + *  the number of flushed packets. If no packets are flushed, return 0.
> + */
> +uint16_t rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
> +		uint64_t desired_gro_types __rte_unused,
> +		struct rte_mbuf **out __rte_unused,
> +		const uint16_t max_nb_out __rte_unused);
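
To illustrate the timeout flush described above, here is a minimal sketch,
assuming each held packet records the cycle count at which it was inserted.
toy_item, toy_tbl and toy_timeout_flush are hypothetical names. The current
time is passed in as "now" so that the sketch needs no timer, whereas a real
implementation might read rte_rdtsc() instead:

```c
#include <stdint.h>

#define TOY_MAX_ITEMS 8

struct toy_pkt {
	uint32_t len;
};

struct toy_item {
	struct toy_pkt *pkt;
	uint64_t start_cycles;  /* timestamp taken when the packet was inserted */
};

struct toy_tbl {
	struct toy_item items[TOY_MAX_ITEMS];
	uint16_t nb_items;
	uint64_t max_timeout_cycles;
};

/*
 * Move every packet held for at least max_timeout_cycles into out,
 * stopping once max_nb_out packets have been moved.
 */
static uint16_t
toy_timeout_flush(struct toy_tbl *tbl, uint64_t now, struct toy_pkt **out,
		uint16_t max_nb_out)
{
	uint16_t i = 0, n = 0;

	while (i < tbl->nb_items && n < max_nb_out) {
		if (now - tbl->items[i].start_cycles >=
				tbl->max_timeout_cycles) {
			out[n++] = tbl->items[i].pkt;
			/* Fill the hole with the last entry, then re-check i. */
			tbl->items[i] = tbl->items[--tbl->nb_items];
		} else {
			i++;
		}
	}
	return n;
}
```

Note that struct rte_gro_tbl documents max_timeout_cycles as measured in
nanoseconds while the field name says cycles, so the unit should be settled
before this is implemented.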
> +#endif
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index bcaf1b3..fc3776d 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -98,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
>   _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
>   _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
>   _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)        	+= -lrte_gro
>   
>   ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>   _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni


