[PATCH v8 1/3] ethdev: introduce protocol hdr based buffer split

Andrew Rybchenko andrew.rybchenko at oktetlabs.ru
Thu Jun 2 15:20:33 CEST 2022


Is it the right one since it is listed in patchwork?

On 6/1/22 16:50, wenxuanx.wu at intel.com wrote:
> From: Wenxuan Wu <wenxuanx.wu at intel.com>
> 
> Currently, Rx buffer split supports length based split. With Rx queue
> offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet segment
> configured, PMD will be able to split the received packets into
> multiple segments.
> 
> However, length based buffer split is not suitable for NICs that do split
> based on protocol headers. Given a arbitrarily variable length in Rx packet

a -> an

> segment, it is almost impossible to pass a fixed protocol header to PMD.
> Besides, the existence of tunneling results in the composition of a packet
> is various, which makes the situation even worse.
> 
> This patch extends current buffer split to support protocol header based
> buffer split. A new proto_hdr field is introduced in the reserved field
> of rte_eth_rxseg_split structure to specify protocol header. The proto_hdr
> field defines the split position of packet, splitting will always happens
> after the protocol header defined in the Rx packet segment. When Rx queue
> offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding
> protocol header is configured, PMD will split the ingress packets into
> multiple segments.
> 
> struct rte_eth_rxseg_split {
> 
>          struct rte_mempool *mp; /* memory pools to allocate segment from */
>          uint16_t length; /* segment maximal data length,
>                              configures "split point" */
>          uint16_t offset; /* data offset from beginning
>                              of mbuf data buffer */
>          uint32_t proto_hdr; /* inner/outer L2/L3/L4 protocol header,
> 			       configures "split point" */
>      };
> 
> Both inner and outer L2/L3/L4 level protocol header split can be supported.
> Corresponding protocol header capability is RTE_PTYPE_L2_ETHER,
> RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV6, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP,
> RTE_PTYPE_L4_SCTP, RTE_PTYPE_INNER_L2_ETHER, RTE_PTYPE_INNER_L3_IPV4,
> RTE_PTYPE_INNER_L3_IPV6, RTE_PTYPE_INNER_L4_TCP, RTE_PTYPE_INNER_L4_UDP,
> RTE_PTYPE_INNER_L4_SCTP.
> 
> For example, let's suppose we configured the Rx queue with the
> following segments:
>      seg0 - pool0, proto_hdr0=RTE_PTYPE_L3_IPV4, off0=2B
>      seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B
>      seg2 - pool2, off1=0B
> 
> The packet consists of MAC_IPV4_UDP_PAYLOAD will be split like
> following:
>      seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
>      seg1 - udp header @ 128 in mbuf from pool1
>      seg2 - payload @ 0 in mbuf from pool2

It must be defined how ICMPv4 packets will be split in such case.
And how UDP over IPv6 will be split.
> 
> Now buffet split can be configured in two modes. For length based
> buffer split, the mp, length, offset field in Rx packet segment should
> be configured, while the proto_hdr field should not be configured.
> For protocol header based buffer split, the mp, offset, proto_hdr field
> in Rx packet segment should be configured, while the length field should
> not be configured.
> 
> The split limitations imposed by underlying PMD is reported in the
> rte_eth_dev_info->rx_seg_capa field. The memory attributes for the split
> parts may differ either, dpdk memory and external memory, respectively.
> 
> Signed-off-by: Xuan Ding <xuan.ding at intel.com>
> Signed-off-by: Yuan Wang <yuanx.wang at intel.com>
> Signed-off-by: Wenxuan Wu <wenxuanx.wu at intel.com>
> Reviewed-by: Qi Zhang <qi.z.zhang at intel.com>
> Acked-by: Ray Kinsella <mdr at ashroe.eu>
> ---
>   lib/ethdev/rte_ethdev.c | 40 +++++++++++++++++++++++++++++++++-------
>   lib/ethdev/rte_ethdev.h | 28 +++++++++++++++++++++++++++-
>   2 files changed, 60 insertions(+), 8 deletions(-)
> 
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index 29a3d80466..fbd55cdd9d 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -1661,6 +1661,7 @@ rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
>   		struct rte_mempool *mpl = rx_seg[seg_idx].mp;
>   		uint32_t length = rx_seg[seg_idx].length;
>   		uint32_t offset = rx_seg[seg_idx].offset;
> +		uint32_t proto_hdr = rx_seg[seg_idx].proto_hdr;
>   
>   		if (mpl == NULL) {
>   			RTE_ETHDEV_LOG(ERR, "null mempool pointer\n");
> @@ -1694,13 +1695,38 @@ rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
>   		}
>   		offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM;
>   		*mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
> -		length = length != 0 ? length : *mbp_buf_size;
> -		if (*mbp_buf_size < length + offset) {
> -			RTE_ETHDEV_LOG(ERR,
> -				       "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
> -				       mpl->name, *mbp_buf_size,
> -				       length + offset, length, offset);
> -			return -EINVAL;
> +		if (proto_hdr == RTE_PTYPE_UNKNOWN) {
> +			/* Split at fixed length. */
> +			length = length != 0 ? length : *mbp_buf_size;
> +			if (*mbp_buf_size < length + offset) {
> +				RTE_ETHDEV_LOG(ERR,
> +					"%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
> +					mpl->name, *mbp_buf_size,
> +					length + offset, length, offset);
> +				return -EINVAL;
> +			}
> +		} else {
> +			/* Split after specified protocol header. */
> +			if (!(proto_hdr & RTE_BUFFER_SPLIT_PROTO_HDR_MASK)) {

The condition looks suspicious. It will be true if proto_hdr has no
single bit from the mask. I guess it is not the intent.
I guess the condition should be
   proto_hdr & ~RTE_BUFFER_SPLIT_PROTO_HDR_MASK
i.e. there is unsupported bits in proto_hdr

IMHO we need extra field in dev_info to report supported protocols to
split on. Or a new API to get an array similar to ptype get.
May be a new API is a better choice to not overload dev_info and to
be more flexible in reporting.

> +				RTE_ETHDEV_LOG(ERR,
> +					"Protocol header %u not supported)\n",
> +					proto_hdr);

I think it would be useful to log unsupported bits only, if we say so.

> +				return -EINVAL;
> +			}
> +
> +			if (length != 0) {
> +				RTE_ETHDEV_LOG(ERR, "segment length should be set to zero in protocol header "
> +					       "based buffer split\n");
> +				return -EINVAL;
> +			}
> +
> +			if (*mbp_buf_size < offset) {
> +				RTE_ETHDEV_LOG(ERR,
> +						"%s mbuf_data_room_size %u < %u segment offset)\n",
> +						mpl->name, *mbp_buf_size,
> +						offset);
> +				return -EINVAL;
> +			}
>   		}
>   	}
>   	return 0;
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 04cff8ee10..0cd9dd6cc0 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1187,6 +1187,9 @@ struct rte_eth_txmode {
>    *   mbuf) the following data will be pushed to the next segment
>    *   up to its own length, and so on.
>    *
> + * - The proto_hdrs in the elements define the split position of
> + *   received packets.
> + *
>    * - If the length in the segment description element is zero
>    *   the actual buffer size will be deduced from the appropriate
>    *   memory pool properties.
> @@ -1197,14 +1200,37 @@ struct rte_eth_txmode {
>    *     - pool from the last valid element
>    *     - the buffer size from this pool
>    *     - zero offset
> + *
> + * - Length based buffer split:
> + *     - mp, length, offset should be configured.
> + *     - The proto_hdr field should not be configured.
> + *
> + * - Protocol header based buffer split:
> + *     - mp, offset, proto_hdr should be configured.
> + *     - The length field should not be configured.
>    */
>   struct rte_eth_rxseg_split {
>   	struct rte_mempool *mp; /**< Memory pool to allocate segment from. */
>   	uint16_t length; /**< Segment data length, configures split point. */
>   	uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */
> -	uint32_t reserved; /**< Reserved field. */
> +	uint32_t proto_hdr; /**< Inner/outer L2/L3/L4 protocol header, configures split point. */
>   };
>   
> +/* Buffer split protocol header capability. */
> +#define RTE_BUFFER_SPLIT_PROTO_HDR_MASK ( \
> +	RTE_PTYPE_L2_ETHER | \
> +	RTE_PTYPE_L3_IPV4 | \
> +	RTE_PTYPE_L3_IPV6 | \
> +	RTE_PTYPE_L4_TCP | \
> +	RTE_PTYPE_L4_UDP | \
> +	RTE_PTYPE_L4_SCTP | \
> +	RTE_PTYPE_INNER_L2_ETHER | \
> +	RTE_PTYPE_INNER_L3_IPV4 | \
> +	RTE_PTYPE_INNER_L3_IPV6 | \
> +	RTE_PTYPE_INNER_L4_TCP | \
> +	RTE_PTYPE_INNER_L4_UDP | \
> +	RTE_PTYPE_INNER_L4_SCTP)
> +
>   /**
>    * @warning
>    * @b EXPERIMENTAL: this structure may change without prior notice.



More information about the dev mailing list