[PATCH v3 2/4] ethdev: introduce protocol hdr based buffer split

Andrew Rybchenko andrew.rybchenko at oktetlabs.ru
Mon Sep 12 13:47:06 CEST 2022


On 9/2/22 22:10, Yuan Wang wrote:
> Currently, Rx buffer split supports length based split. With Rx queue
> offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet segment
> configured, PMD will be able to split the received packets into
> multiple segments.
> 
> However, length based buffer split is not suitable for NICs that do split
> based on protocol headers. Given an arbitrarily variable length in Rx
> packet segment, it is almost impossible to pass a fixed protocol header to
> driver. Besides, the existence of tunneling results in the composition of
> a packet is various, which makes the situation even worse.
> 
> This patch extends current buffer split to support protocol header based
> buffer split. A new proto_hdr field is introduced in the reserved field
> of rte_eth_rxseg_split structure to specify protocol header. The proto_hdr
> field defines the split position of packet, splitting will always happens
> after the protocol header defined in the Rx packet segment. When Rx queue
> offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding
> protocol header is configured, driver will split the ingress packets into
> multiple segments.
> 
> struct rte_eth_rxseg_split {
>          struct rte_mempool *mp; /* memory pools to allocate segment from */
>          uint16_t length; /* segment maximal data length,
>                              configures split point */
>          uint16_t offset; /* data offset from beginning
>                              of mbuf data buffer */
>          uint32_t proto_hdr; /* supported ptype of a specific pmd,
>                                 configures split point.
> 			       It should be defined by RTE_PTYPE_*

If I understand correctly, the statement is a bit misleading
since it should be a bit mask of RTE_PTYPE_* defines. Not
exactly one RTE_PTYPE_*.

> 			     */
> };
> 
> If protocol header split can be supported by a PMD. The
> rte_eth_buffer_split_get_supported_hdr_ptypes function can
> be use to obtain a list of these protocol headers.
> 
> For example, let's suppose we configured the Rx queue with the
> following segments:
>          seg0 - pool0, proto_hdr0=RTE_PTYPE_L3_IPV4, off0=2B
>          seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B
>          seg2 - pool2, off1=0B
> 
> The packet consists of MAC_IPV4_UDP_PAYLOAD will be split like

What is MAC_IPV4_UDP_PAYLOAD? Do you mean ETH_IPV4_UDP_PAYLOAD?

> following:
>          seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
>          seg1 - udp header @ 128 in mbuf from pool1
>          seg2 - payload @ 0 in mbuf from pool2
> 
> Note: NIC will only do split when the packets exactly match all the
> protocol headers in the segments. For example, if ARP packets received
> with above config, the NIC won't do split for ARP packets since
> it does not contains ipv4 header and udp header.

You must define which mempool is used in the case.

> 
> Now buffer split can be configured in two modes. For length based
> buffer split, the mp, length, offset field in Rx packet segment should
> be configured, while the proto_hdr field will be ignored.
> For protocol header based buffer split, the mp, offset, proto_hdr field
> in Rx packet segment should be configured, while the length field will
> be ignored.
> 
> The split limitations imposed by underlying driver is reported in the
> rte_eth_dev_info->rx_seg_capa field. The memory attributes for the split
> parts may differ either, dpdk memory and external memory, respectively.
> 
> Signed-off-by: Yuan Wang <yuanx.wang at intel.com>
> Signed-off-by: Xuan Ding <xuan.ding at intel.com>
> Signed-off-by: Wenxuan Wu <wenxuanx.wu at intel.com>
> ---
>   doc/guides/rel_notes/release_22_11.rst |  5 +++
>   lib/ethdev/rte_ethdev.c                | 55 ++++++++++++++++++++------
>   lib/ethdev/rte_ethdev.h                | 17 +++++++-
>   3 files changed, 65 insertions(+), 12 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
> index 4d90514a9a..f3b58c7895 100644
> --- a/doc/guides/rel_notes/release_22_11.rst
> +++ b/doc/guides/rel_notes/release_22_11.rst
> @@ -60,6 +60,11 @@ New Features
>     Added ``rte_eth_buffer_split_get_supported_hdr_ptypes()``, to get supported
>     header protocols of a PMD to split.
>   
> +* **Added protocol header based buffer split.**
> +  Ethdev: The ``reserved`` field in the  ``rte_eth_rxseg_split`` structure is
> +  replaced with ``proto_hdr`` to support protocol header based buffer split.
> +  User can choose length or protocol header to configure buffer split
> +  according to NIC's capability.

Add one more empty line to have two before the next sectoin.

>   
>   Removed Items
>   -------------
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index 093c577add..dfceb723ee 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -1635,9 +1635,10 @@ rte_eth_dev_is_removed(uint16_t port_id)
>   }
>   
>   static int
> -rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
> -			     uint16_t n_seg, uint32_t *mbp_buf_size,
> -			     const struct rte_eth_dev_info *dev_info)
> +rte_eth_rx_queue_check_split(uint16_t port_id,
> +			const struct rte_eth_rxseg_split *rx_seg,
> +			uint16_t n_seg, uint32_t *mbp_buf_size,
> +			const struct rte_eth_dev_info *dev_info)
>   {
>   	const struct rte_eth_rxseg_capa *seg_capa = &dev_info->rx_seg_capa;
>   	struct rte_mempool *mp_first;
> @@ -1660,6 +1661,7 @@ rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
>   		struct rte_mempool *mpl = rx_seg[seg_idx].mp;
>   		uint32_t length = rx_seg[seg_idx].length;
>   		uint32_t offset = rx_seg[seg_idx].offset;
> +		uint32_t proto_hdr = rx_seg[seg_idx].proto_hdr;
>   
>   		if (mpl == NULL) {
>   			RTE_ETHDEV_LOG(ERR, "null mempool pointer\n");
> @@ -1693,13 +1695,44 @@ rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
>   		}
>   		offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM;
>   		*mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
> -		length = length != 0 ? length : *mbp_buf_size;
> -		if (*mbp_buf_size < length + offset) {
> -			RTE_ETHDEV_LOG(ERR,
> -				       "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
> -				       mpl->name, *mbp_buf_size,
> -				       length + offset, length, offset);
> -			return -EINVAL;
> +
> +		int ret = rte_eth_buffer_split_get_supported_hdr_ptypes(port_id, NULL, 0);

Do not mix variable declaration and code.
It is better to give the variable some sensible name.
Otherwise else branch code is hard to read.

> +		if (ret <= 0) {

May be I'm missing something, but nothing prevetns a driver/HW
to support both protocol-based and fixed-length split.
So, ability to support protocol based split should be treated
as a request to do it. It must be based on rx_seg->proto_hdr
content (for all segments).

Also nothing should prevent to mix protocol and fixed-length
split. I.e. split just after UDP in the first segment,
40 bytes in the second segment, everything else in the third.

> +			/* Split at fixed length. */
> +			length = length != 0 ? length : *mbp_buf_size;
> +			if (*mbp_buf_size < length + offset) {
> +				RTE_ETHDEV_LOG(ERR,
> +					"%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
> +					mpl->name, *mbp_buf_size,
> +					length + offset, length, offset);
> +				return -EINVAL;
> +			}
> +		} else {
> +			/* Split after specified protocol header. */
> +			uint32_t ptypes[ret];
> +			int i;
> +
> +			ret = rte_eth_buffer_split_get_supported_hdr_ptypes(port_id, ptypes, ret);

In theory, the funciton could fail since input arguments
differ. So, it should be handled.

> +			for (i = 0; i < ret; i++)
> +				if (ptypes[i] & proto_hdr)

IMHO it should be ==, not &. I think that
rte_eth_buffer_split_get_supported_hdr_ptypes() should define
points at which split could happen and we should match the
point exactly.

> +					break;
> +
> +			if (i == ret) {
> +#define PTYPE_NAMESIZE	256

Why? It is looks really strange that it is defined here.

> +				char ptype_name[PTYPE_NAMESIZE];
> +				rte_get_ptype_name(proto_hdr, ptype_name, sizeof(ptype_name));
> +				RTE_ETHDEV_LOG(ERR,
> +					"Protocol header %s is not supported.\n",
> +					ptype_name);
> +				return -EINVAL;
> +			}
> +			if (*mbp_buf_size < offset) {
> +				RTE_ETHDEV_LOG(ERR,
> +						"%s mbuf_data_room_size %u < %u segment offset)\n",
> +						mpl->name, *mbp_buf_size,
> +						offset);
> +				return -EINVAL;
> +			}
>   		}
>   	}
>   	return 0;
> @@ -1778,7 +1811,7 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
>   		n_seg = rx_conf->rx_nseg;
>   
>   		if (rx_conf->offloads & RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT) {
> -			ret = rte_eth_rx_queue_check_split(rx_seg, n_seg,
> +			ret = rte_eth_rx_queue_check_split(port_id, rx_seg, n_seg,
>   							   &mbp_buf_size,
>   							   &dev_info);
>   			if (ret != 0)
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index c58c908c3a..410fba5eab 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1175,6 +1175,9 @@ struct rte_eth_txmode {
>    *   specified in the first array element, the second buffer, from the
>    *   pool in the second element, and so on.
>    *
> + * - The proto_hdrs in the elements define the split position of
> + *   received packets.
> + *
>    * - The offsets from the segment description elements specify
>    *   the data offset from the buffer beginning except the first mbuf.
>    *   The first segment offset is added with RTE_PKTMBUF_HEADROOM.
> @@ -1196,12 +1199,24 @@ struct rte_eth_txmode {
>    *     - pool from the last valid element
>    *     - the buffer size from this pool
>    *     - zero offset
> + *
> + * - Length based buffer split:
> + *     - mp, length, offset should be configured.
> + *     - The proto_hdr field will be ignored.
> + *
> + * - Protocol header based buffer split:
> + *     - mp, offset, proto_hdr should be configured.
> + *     - The length field will be ignored.
>    */
>   struct rte_eth_rxseg_split {
>   	struct rte_mempool *mp; /**< Memory pool to allocate segment from. */
>   	uint16_t length; /**< Segment data length, configures split point. */
>   	uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */
> -	uint32_t reserved; /**< Reserved field. */
> +	/**
> +	 * Supported ptype of a specific pmd, configures split point.
> +	 * It should be defined by RTE_PTYPE_*.
> +	 */
> +	uint32_t proto_hdr;
>   };
>   
>   /**



More information about the dev mailing list