[dpdk-dev] [PATCH v8 1/6] ethdev: introduce Rx buffer split

Slava Ovsiienko viacheslavo at nvidia.com
Fri Oct 16 11:22:59 CEST 2020


> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit at intel.com>
> Sent: Friday, October 16, 2020 12:19
> To: Slava Ovsiienko <viacheslavo at nvidia.com>; dev at dpdk.org
> Cc: NBU-Contact-Thomas Monjalon <thomas at monjalon.net>;
> stephen at networkplumber.org; olivier.matz at 6wind.com;
> jerinjacobk at gmail.com; maxime.coquelin at redhat.com;
> david.marchand at redhat.com; arybchenko at solarflare.com
> Subject: Re: [PATCH v8 1/6] ethdev: introduce Rx buffer split
> 
> On 10/16/2020 8:48 AM, Viacheslav Ovsiienko wrote:
> > The DPDK datapath in the transmit direction is very flexible.
> > An application can build a multi-segment packet and manage almost
> > all data aspects - the memory pools the segments are allocated from,
> > the segment lengths, the memory attributes like external buffers,
> > registered for DMA, etc.
> >
> > In the receiving direction, the datapath is much less flexible: an
> > application can only specify the memory pool to configure the
> > receiving queue and nothing more. In order to extend receiving
> > datapath capabilities it is proposed to add a way to provide
> > extended information on how to split the packets being received.
> >
> > The new offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT in device
> > capabilities is introduced so that a PMD can report to the
> > application that it supports splitting Rx packets into configurable
> > segments. Prior to invoking the rte_eth_rx_queue_setup() routine, the
> > application should check the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag.
> >
> > The following structure is introduced to specify the Rx packet segment
> > for RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT offload:
> >
> > struct rte_eth_rxseg_split {
> >      struct rte_mempool *mp; /* memory pools to allocate segment from */
> >      uint16_t length; /* segment maximal data length,
> > 		       	configures "split point" */
> >      uint16_t offset; /* data offset from beginning
> > 		       	of mbuf data buffer */
> >      uint32_t reserved; /* reserved field */
> > };
> >
> > The segment descriptions are added to the rte_eth_rxconf structure:
> >     rx_seg - pointer to the array of segment descriptions, each element
> >               describes the memory pool, maximal data length, initial
> >               data offset from the beginning of data buffer in mbuf.
> > 	     This array allows specifying different settings for
> > 	     each segment individually.
> >     rx_nseg - number of elements in the array
> >
> > If the extended segment descriptions are provided with these new fields
> > the mp parameter of the rte_eth_rx_queue_setup must be specified as
> > NULL to avoid ambiguity.
> >
> > There are two options to specify Rx buffer configuration:
> > - mp is not NULL, rx_conf.rx_seg is NULL, rx_conf.rx_nseg is zero,
> >    it is compatible configuration, follows existing implementation,
> >    provides single pool and no description for segment sizes
> >    and offsets.
> > - mp is NULL, rx_conf.rx_seg is not NULL, rx_conf.rx_nseg is not
> >    zero, it provides the extended configuration, individually for
> >    each segment.
> >
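
The two options above can be sketched as a small check. This is only an illustration under assumptions: the enum and function names are hypothetical and not part of the patch, and the pointers are kept opaque so the snippet does not depend on DPDK headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes, used for illustration only. */
enum rx_cfg_mode {
	RX_CFG_INVALID = -1,   /* ambiguous or incomplete combination */
	RX_CFG_SINGLE_POOL,    /* compatible configuration: mp only */
	RX_CFG_BUFFER_SPLIT    /* extended configuration: rx_seg/rx_nseg only */
};

/*
 * Classify the (mp, rx_seg, rx_nseg) triple according to the two
 * options listed in the commit message. Any other mix is rejected.
 */
static enum rx_cfg_mode
classify_rx_config(const void *mp, const void *rx_seg, uint16_t rx_nseg)
{
	if (mp != NULL && rx_seg == NULL && rx_nseg == 0)
		return RX_CFG_SINGLE_POOL;
	if (mp == NULL && rx_seg != NULL && rx_nseg != 0)
		return RX_CFG_BUFFER_SPLIT;
	return RX_CFG_INVALID;
}
```

Passing both a pool and a segment array falls into the invalid case, which is the ambiguity the NULL requirement is meant to avoid.
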
> > If the Rx queue is configured with new settings the packets being
> > received will be split into multiple segments pushed to the mbufs with
> > specified attributes. The PMD will split the received packets into
> > multiple segments according to the specification in the description
> > array.
> >
> > For example, let's suppose we configured the Rx queue with the
> > following segments:
> >      seg0 - pool0, len0=14B, off0=2
> >      seg1 - pool1, len1=20B, off1=128B
> >      seg2 - pool2, len2=20B, off2=0B
> >      seg3 - pool3, len3=512B, off3=0B
> >
> > The packet 46 bytes long will look like the following:
> >      seg0 - 14B long @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
> >      seg1 - 20B long @ 128 in mbuf from pool1
> >      seg2 - 12B long @ 0 in mbuf from pool2
> >
> > The packet 1500 bytes long will look like the following:
> >      seg0 - 14B @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
> >      seg1 - 20B @ 128 in mbuf from pool1
> >      seg2 - 20B @ 0 in mbuf from pool2
> >      seg3 - 512B @ 0 in mbuf from pool3
> >      seg4 - 512B @ 0 in mbuf from pool3
> >      seg5 - 422B @ 0 in mbuf from pool3
> >
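
The two layouts above can be reproduced with a small standalone model of the split rule. It is a sketch, not PMD code: the struct and function names are hypothetical, pools are represented by plain indices, HEADROOM is assumed to be 128 (standing in for RTE_PKTMBUF_HEADROOM), and segments beyond the last array element are assumed to reuse that element's pool and length with zero offset, which matches the 1500-byte example. A zero length (buffer size deduced from the pool) is not modelled and is rejected.

```c
#include <assert.h>
#include <stdint.h>

#define HEADROOM 128 /* assumed stand-in for RTE_PKTMBUF_HEADROOM */

/* Hypothetical model of one segment description; pool is just an index. */
struct seg_desc {
	int pool;
	uint16_t length; /* maximal data length, the "split point" */
	uint16_t offset; /* data offset from the buffer beginning */
};

/* One resulting segment of a received packet. */
struct seg_out {
	int pool;
	uint16_t data_len;
	uint16_t data_off;
};

/*
 * Fill out[] with the layout of a packet of pkt_len bytes.
 * Returns the number of segments written, or -1 on a zero-length
 * description (not modelled here). Stops early if out[] is full.
 */
static int
split_packet(const struct seg_desc *desc, int n_desc, uint32_t pkt_len,
	     struct seg_out *out, int max_out)
{
	int n = 0;

	assert(n_desc > 0);
	while (pkt_len > 0 && n < max_out) {
		/* Past the end of the array, reuse the last element. */
		const struct seg_desc *d = &desc[n < n_desc ? n : n_desc - 1];
		uint16_t len;

		if (d->length == 0)
			return -1;
		len = pkt_len < d->length ? (uint16_t)pkt_len : d->length;
		out[n].pool = d->pool;
		out[n].data_len = len;
		/* Remaining segments use zero offset. */
		out[n].data_off = n < n_desc ? d->offset : 0;
		/* Only the first mbuf gets the headroom added. */
		if (n == 0)
			out[n].data_off += HEADROOM;
		pkt_len -= len;
		n++;
	}
	return n;
}
```

Calling split_packet() with the four descriptions from the example and a length of 46 yields three segments, and a length of 1500 yields six, with the last one holding 422 bytes from pool3.
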
> > The offload RTE_ETH_RX_OFFLOAD_SCATTER must be present and configured
> > to support new buffer split feature (if rx_nseg is greater than one).
> >
> > The split limitations imposed by the underlying PMD are reported in the
> > newly introduced rte_eth_dev_info->rx_seg_capa field.
> >
> > The new approach would allow splitting the ingress packets into
> > multiple parts pushed to the memory with different attributes.
> > For example, the packet headers can be pushed to the embedded data
> > buffers within mbufs and the application data into the external
> > buffers attached to mbufs allocated from the different memory pools.
> > The memory attributes for the split parts may differ as well - for
> > example the application data may be pushed into the external memory
> > located on the dedicated physical device, say GPU or NVMe. This would
> > improve the DPDK receiving datapath flexibility while preserving
> > compatibility with existing API.
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
> > Acked-by: Ajit Khaparde <ajit.khaparde at broadcom.com>
> > Acked-by: Jerin Jacob <jerinj at marvell.com>
> 
> <...>
> 
> > +/**
> >    * A structure used to configure an RX ring of an Ethernet port.
> >    */
> >   struct rte_eth_rxconf {
> > @@ -977,6 +998,46 @@ struct rte_eth_rxconf {
> >   	uint16_t rx_free_thresh; /**< Drives the freeing of RX descriptors. */
> >   	uint8_t rx_drop_en; /**< Drop packets if no descriptors are available. */
> >   	uint8_t rx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
> > +	uint16_t rx_nseg; /**< Number of descriptions in rx_seg array. */
> > +	/**
> > +	 * Points to the array of segment descriptions. Each array element
> > +	 * describes the properties for each segment in the receiving
> > +	 * buffer according to feature descripting structure.
> > +	 *
> > +	 * The supported capabilities of receiving segmentation is reported
> > +	 * in rte_eth_dev_info ->rx_seg_capa field.
> > +	 *
> > +	 * If RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag is set in offloads field,
> > +	 * the PMD will split the received packets into multiple segments
> > +	 * according to the specification in the description array:
> > +	 *
> > +	 * - the first network buffer will be allocated from the memory pool,
> > +	 *   specified in the first array element, the second buffer, from the
> > +	 *   pool in the second element, and so on.
> > +	 *
> > +	 * - the offsets from the segment description elements specify
> > +	 *   the data offset from the buffer beginning except the first mbuf.
> > +	 *   For this one the offset is added with RTE_PKTMBUF_HEADROOM.
> > +	 *
> > +	 * - the lengths in the elements define the maximal data amount
> > +	 *   being received to each segment. The receiving starts with filling
> > +	 *   up the first mbuf data buffer up to specified length. If
> > +	 *   there are data remaining (packet is longer than buffer in the first
> > +	 *   mbuf) the following data will be pushed to the next segment
> > +	 *   up to its own length, and so on.
> > +	 *
> > +	 * - If the length in the segment description element is zero
> > +	 *   the actual buffer size will be deduced from the appropriate
> > +	 *   memory pool properties.
> > +	 *
> > +	 * - if there is not enough elements to describe the buffer for entire
> > +	 *   packet of maximal length the following parameters will be used
> > +	 *   for the all remaining segments:
> > +	 *     - pool from the last valid element
> > +	 *     - the buffer size from this pool
> > +	 *     - zero offset
> > +	 */
> > +	struct rte_eth_rxseg *rx_seg;
> 
> "struct rte_eth_rxconf" is very commonly used, I think all applications call
> 'rte_eth_rx_queue_setup()', but "buffer split" is not a common usage.
> 
> I am against the "struct rte_eth_rxseg *rx_seg;" field creating this much noise
> in the "struct rte_eth_rxconf" documentation.
> As mentioned before, can you please move the above detailed documentation
> to where "struct rte_eth_rxseg" is defined, and in this struct put a single
> comment for "struct rte_eth_rxseg *rx_seg" ?

Sure, we also had doubts about putting this wordy comment into the rxconf structure.
Now we can move the comment to the rte_eth_rxseg_split declaration.

With best regards, Slava


 

