[dpdk-dev] [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework

Kavanagh, Mark B mark.b.kavanagh at intel.com
Wed Oct 4 15:21:58 CEST 2017



>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 2:11 PM
>To: Kavanagh, Mark B <mark.b.kavanagh at intel.com>; dev at dpdk.org
>Cc: Hu, Jiayu <jiayu.hu at intel.com>; Tan, Jianfeng <jianfeng.tan at intel.com>;
>Yigit, Ferruh <ferruh.yigit at intel.com>; thomas at monjalon.net
>Subject: RE: [PATCH v6 1/6] gso: add Generic Segmentation Offload API
>framework
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Monday, October 2, 2017 5:46 PM
>> To: dev at dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu at intel.com>; Tan, Jianfeng <jianfeng.tan at intel.com>;
>Ananyev, Konstantin <konstantin.ananyev at intel.com>; Yigit,
>> Ferruh <ferruh.yigit at intel.com>; thomas at monjalon.net; Kavanagh, Mark B
><mark.b.kavanagh at intel.com>
>> Subject: [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework
>>
>> From: Jiayu Hu <jiayu.hu at intel.com>
>>
>> Generic Segmentation Offload (GSO) is a SW technique to split large
>> packets into small ones. Akin to TSO, GSO enables applications to
>> operate on large packets, thus reducing per-packet processing overhead.
>>
>> To enable more flexibility to applications, DPDK GSO is implemented
>> as a standalone library. Applications explicitly use the GSO library
>> to segment packets. To segment a packet requires two steps. The first
>> is to set proper flags to mbuf->ol_flags, where the flags are the same
>> as that of TSO. The second is to call the segmentation API,
>> rte_gso_segment(). This patch introduces the GSO API framework to DPDK.
>>
>> rte_gso_segment() splits an input packet into small ones in each
>> invocation. The GSO library refers to these small packets generated
>> by rte_gso_segment() as GSO segments. Each of the newly-created GSO
>> segments is organized as a two-segment MBUF, where the first segment is a
>> standard MBUF, which stores a copy of packet header, and the second is an
>> indirect MBUF which points to a section of data in the input packet.
>> rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
>> when all GSO segments are freed, the input packet is freed automatically.
>> Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
>> the driver of the interface which the GSO segments are sent to should
>> support to transmit multi-segment packets.
>>
>> The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
>> packet, and all produced GSO segments in the event of success, since
>> segmentation in hardware is no longer required at that point.
>>
>> Signed-off-by: Jiayu Hu <jiayu.hu at intel.com>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
>> ---
>>  config/common_base                     |   5 ++
>>  doc/api/doxy-api-index.md              |   1 +
>>  doc/api/doxy-api.conf                  |   1 +
>>  doc/guides/rel_notes/release_17_11.rst |   1 +
>>  lib/Makefile                           |   2 +
>>  lib/librte_gso/Makefile                |  49 +++++++++++
>>  lib/librte_gso/rte_gso.c               |  52 ++++++++++++
>>  lib/librte_gso/rte_gso.h               | 145
>+++++++++++++++++++++++++++++++++
>>  lib/librte_gso/rte_gso_version.map     |   7 ++
>>  mk/rte.app.mk                          |   1 +
>>  10 files changed, 264 insertions(+)
>>  create mode 100644 lib/librte_gso/Makefile
>>  create mode 100644 lib/librte_gso/rte_gso.c
>>  create mode 100644 lib/librte_gso/rte_gso.h
>>  create mode 100644 lib/librte_gso/rte_gso_version.map
>>
>> diff --git a/config/common_base b/config/common_base
>> index 12f6be9..58ca5c0 100644
>> --- a/config/common_base
>> +++ b/config/common_base
>> @@ -653,6 +653,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
>>  CONFIG_RTE_LIBRTE_GRO=y
>>
>>  #
>> +# Compile GSO library
>> +#
>> +CONFIG_RTE_LIBRTE_GSO=y
>> +
>> +#
>>  # Compile librte_meter
>>  #
>>  CONFIG_RTE_LIBRTE_METER=y
>> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
>> index 19e0d4f..6512918 100644
>> --- a/doc/api/doxy-api-index.md
>> +++ b/doc/api/doxy-api-index.md
>> @@ -101,6 +101,7 @@ The public API headers are grouped by topics:
>>    [TCP]                (@ref rte_tcp.h),
>>    [UDP]                (@ref rte_udp.h),
>>    [GRO]                (@ref rte_gro.h),
>> +  [GSO]                (@ref rte_gso.h),
>>    [frag/reass]         (@ref rte_ip_frag.h),
>>    [LPM IPv4 route]     (@ref rte_lpm.h),
>>    [LPM IPv6 route]     (@ref rte_lpm6.h),
>> diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
>> index 823554f..408f2e6 100644
>> --- a/doc/api/doxy-api.conf
>> +++ b/doc/api/doxy-api.conf
>> @@ -47,6 +47,7 @@ INPUT                   = doc/api/doxy-api-index.md \
>>                            lib/librte_ether \
>>                            lib/librte_eventdev \
>>                            lib/librte_gro \
>> +                          lib/librte_gso \
>>                            lib/librte_hash \
>>                            lib/librte_ip_frag \
>>                            lib/librte_jobstats \
>> diff --git a/doc/guides/rel_notes/release_17_11.rst
>b/doc/guides/rel_notes/release_17_11.rst
>> index 8bf91bd..7508be7 100644
>> --- a/doc/guides/rel_notes/release_17_11.rst
>> +++ b/doc/guides/rel_notes/release_17_11.rst
>> @@ -174,6 +174,7 @@ The libraries prepended with a plus sign were
>incremented in this version.
>>       librte_ethdev.so.7
>>       librte_eventdev.so.2
>>       librte_gro.so.1
>> +   + librte_gso.so.1
>>       librte_hash.so.2
>>       librte_ip_frag.so.1
>>       librte_jobstats.so.1
>> diff --git a/lib/Makefile b/lib/Makefile
>> index 86caba1..3d123f4 100644
>> --- a/lib/Makefile
>> +++ b/lib/Makefile
>> @@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
>>  DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
>>  DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
>>  DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
>> +DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
>> +DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
>>
>>  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>>  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
>> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
>> new file mode 100644
>> index 0000000..aeaacbc
>> --- /dev/null
>> +++ b/lib/librte_gso/Makefile
>> @@ -0,0 +1,49 @@
>> +#   BSD LICENSE
>> +#
>> +#   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> +#   All rights reserved.
>> +#
>> +#   Redistribution and use in source and binary forms, with or without
>> +#   modification, are permitted provided that the following conditions
>> +#   are met:
>> +#
>> +#     * Redistributions of source code must retain the above copyright
>> +#       notice, this list of conditions and the following disclaimer.
>> +#     * Redistributions in binary form must reproduce the above copyright
>> +#       notice, this list of conditions and the following disclaimer in
>> +#       the documentation and/or other materials provided with the
>> +#       distribution.
>> +#     * Neither the name of Intel Corporation nor the names of its
>> +#       contributors may be used to endorse or promote products derived
>> +#       from this software without specific prior written permission.
>> +#
>> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> +
>> +include $(RTE_SDK)/mk/rte.vars.mk
>> +
>> +# library name
>> +LIB = librte_gso.a
>> +
>> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
>> +
>> +EXPORT_MAP := rte_gso_version.map
>> +
>> +LIBABIVER := 1
>> +
>> +#source files
>> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
>> +
>> +# install this header file
>> +SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
>> +
>> +include $(RTE_SDK)/mk/rte.lib.mk
>> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
>> new file mode 100644
>> index 0000000..b773636
>> --- /dev/null
>> +++ b/lib/librte_gso/rte_gso.c
>> @@ -0,0 +1,52 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#include <errno.h>
>> +
>> +#include "rte_gso.h"
>> +
>> +int
>> +rte_gso_segment(struct rte_mbuf *pkt,
>> +		const struct rte_gso_ctx *gso_ctx,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out)
>> +{
>> +	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
>> +			nb_pkts_out < 1)
>> +		return -EINVAL;
>> +
>> +	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> +	pkts_out[0] = pkt;
>> +
>> +	return 1;
>> +}
>> diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
>> new file mode 100644
>> index 0000000..53725e6
>> --- /dev/null
>> +++ b/lib/librte_gso/rte_gso.h
>> @@ -0,0 +1,145 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#ifndef _RTE_GSO_H_
>> +#define _RTE_GSO_H_
>> +
>> +/**
>> + * @file
>> + * Interface to GSO library
>> + */
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <stdint.h>
>> +#include <rte_mbuf.h>
>> +
>> +/* GSO IP id flags for the IPv4 header */
>> +#define RTE_GSO_IPID_FIXED (1ULL << 0)
>> +/**< Use fixed IP ids for output GSO segments. Setting
>> + * !RTE_GSO_IPID_FIXED indicates using incremental IP ids.
>> + */
>> +
>> +/**
>> + * GSO context structure.
>> + */
>> +struct rte_gso_ctx {
>> +	struct rte_mempool *direct_pool;
>> +	/**< MBUF pool for allocating direct buffers, which are used
>> +	 * to store packet headers for GSO segments.
>> +	 */
>> +	struct rte_mempool *indirect_pool;
>> +	/**< MBUF pool for allocating indirect buffers, which are used
>> +	 * to locate packet payloads for GSO segments. The indirect
>> +	 * buffer doesn't contain any data, but simply points to an
>> +	 * offset within the packet to segment.
>> +	 */
>> +	uint64_t ipid_flag;
>> +	/**< flag to indicate GSO uses fixed or incremental IP ids for
>> +	 * IPv4 headers of output GSO segments. If applications want
>> +	 * fixed IP ids, set RTE_GSO_IPID_FIXED to ipid_flag. Conversely,
>> +	 * if applications want incremental IP ids, set !RTE_GSO_IPID_FIXED.
>> +	 */
>
>Minor nit - I think better to just name it 'flag' - as in future some other
>non IPID related flags might be added.
>Also make sure that all flag values have RTE_GSO_FLAG (or so prefixes),
>and move explanation of particular flag value to its definition.
>Konstantin

Will do Konstantin - thanks!


>
>> +	uint32_t gso_types;
>> +	/**< the bit mask of required GSO types. The GSO library
>> +	 * uses the same macros as that of describing device TX
>> +	 * offloading capabilities (i.e. DEV_TX_OFFLOAD_*_TSO) for
>> +	 * gso_types.
>> +	 *
>> +	 * For example, if applications want to segment TCP/IPv4
>> +	 * packets, set DEV_TX_OFFLOAD_TCP_TSO in gso_types.
>> +	 */
>> +	uint16_t gso_size;
>> +	/**< maximum size of an output GSO segment, including packet
>> +	 * header and payload, measured in bytes.
>> +	 */
>> +};
>> +
>> +/**
>> + * Segmentation function, which supports processing of both single- and
>> + * multi- MBUF packets.
>> + *
>> + * Note that we refer to the packets that are segmented from the input
>> + * packet as 'GSO segments'. rte_gso_segment() doesn't check if the
>> + * input packet has correct checksums, and doesn't update checksums for
>> + * output GSO segments. Additionally, it doesn't process IP fragment
>> + * packets.
>> + *
>> + * Before calling rte_gso_segment(), applications must set proper ol_flags
>> + * for the packet. The GSO library uses the same macros as that of TSO.
>> + * For example, set PKT_TX_TCP_SEG and PKT_TX_IPV4 in ol_flags to segment
>> + * a TCP/IPv4 packet. If rte_gso_segment() succceds, the PKT_TX_TCP_SEG
>> + * flag is removed for all GSO segments and the input packet.
>> + *
>> + * Each of the newly-created GSO segments is organized as a two-segment
>> + * MBUF, where the first segment is a standard MBUF, which stores a copy
>> + * of packet header, and the second is an indirect MBUF which points to
>> + * a section of data in the input packet. Since each GSO segment has
>> + * multiple MBUFs (i.e. typically 2 MBUFs), the driver of the interface
>which
>> + * the GSO segments are sent to should support transmission of multi-
>segment
>> + * packets.
>> + *
>> + * If the input packet is GSO'd, its mbuf refcnt reduces by 1. Therefore,
>> + * when all GSO segments are freed, the input packet is freed
>automatically.
>> + *
>> + * If the memory space in pkts_out or MBUF pools is insufficient, this
>> + * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
>> + * and this function returns the number of output GSO segments filled in
>> + * pkts_out.
>> + *
>> + * @param pkt
>> + *  The packet mbuf to segment.
>> + * @param ctx
>> + *  GSO context object pointer.
>> + * @param pkts_out
>> + *  Pointer array used to store the MBUF addresses of output GSO
>> + *  segments, when rte_gso_segment() succeeds.
>> + * @param nb_pkts_out
>> + *  The max number of items that pkts_out can keep.
>> + *
>> + * @return
>> + *  - The number of GSO segments filled in pkts_out on success.
>> + *  - Return -ENOMEM if run out of memory in MBUF pools.
>> + *  - Return -EINVAL for invalid parameters.
>> + */
>> +int rte_gso_segment(struct rte_mbuf *pkt,
>> +		const struct rte_gso_ctx *ctx,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out);
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_GSO_H_ */
>> diff --git a/lib/librte_gso/rte_gso_version.map
>b/lib/librte_gso/rte_gso_version.map
>> new file mode 100644
>> index 0000000..e1fd453
>> --- /dev/null
>> +++ b/lib/librte_gso/rte_gso_version.map
>> @@ -0,0 +1,7 @@
>> +DPDK_17.11 {
>> +	global:
>> +
>> +	rte_gso_segment;
>> +
>> +	local: *;
>> +};
>> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
>> index c25fdd9..d4c9873 100644
>> --- a/mk/rte.app.mk
>> +++ b/mk/rte.app.mk
>> @@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
>> +_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
>> --
>> 1.9.3



More information about the dev mailing list