[dpdk-dev] [PATCH v8 1/6] ethdev: introduce Rx buffer split

Andrew Rybchenko andrew.rybchenko at oktetlabs.ru
Fri Oct 16 11:21:37 CEST 2020


On 10/16/20 12:19 PM, Ferruh Yigit wrote:
> On 10/16/2020 8:48 AM, Viacheslav Ovsiienko wrote:
>> The DPDK datapath in the transmit direction is very flexible.
>> An application can build multi-segment packets and manage
>> almost all data aspects - the memory pools the segments
>> are allocated from, the segment lengths, the memory attributes
>> like external buffers registered for DMA, etc.
>>
>> In the receiving direction, the datapath is much less flexible:
>> an application can only specify the memory pool to configure the
>> receiving queue and nothing more. In order to extend the receiving
>> datapath capabilities it is proposed to add a way to provide
>> extended information on how to split the packets being received.
>>
>> The new offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT in device
>> capabilities is introduced to provide a way for the PMD to report
>> to the application that it supports splitting Rx packets into
>> configurable segments. Prior to invoking the rte_eth_rx_queue_setup()
>> routine the application should check the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT
>> flag.
>>
>> The following structure is introduced to specify the Rx packet
>> segment for RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT offload:
>>
>> struct rte_eth_rxseg_split {
>>
>>      struct rte_mempool *mp; /* memory pool to allocate segment from */
>>      uint16_t length; /* segment maximal data length,
>>                    configures "split point" */
>>      uint16_t offset; /* data offset from beginning
>>                    of mbuf data buffer */
>>      uint32_t reserved; /* reserved field */
>> };
>>
>> The segment descriptions are added to the rte_eth_rxconf structure:
>>     rx_seg - pointer to the array of segment descriptions; each element
>>              describes the memory pool, maximal data length, and initial
>>              data offset from the beginning of the data buffer in the mbuf.
>>              This array allows specifying different settings for each
>>              segment individually.
>>     rx_nseg - number of elements in the array
>>
>> If the extended segment descriptions are provided via these new
>> fields, the mp parameter of rte_eth_rx_queue_setup() must be
>> specified as NULL to avoid ambiguity.
>>
>> There are two options to specify the Rx buffer configuration:
>> - mp is not NULL, rx_conf.rx_seg is NULL, rx_conf.rx_nseg is zero:
>>    this is the compatible configuration, it follows the existing
>>    implementation, providing a single pool and no description of
>>    segment sizes and offsets.
>> - mp is NULL, rx_conf.rx_seg is not NULL, rx_conf.rx_nseg is not
>>    zero: this provides the extended configuration, individually for
>>    each segment.
>>
>> If the Rx queue is configured with the new settings, the packets
>> being received will be split into multiple segments pushed to mbufs
>> with the specified attributes. The PMD will split the received
>> packets into multiple segments according to the specification in
>> the description array.
>>
>> For example, let's suppose we configured the Rx queue with the
>> following segments:
>>      seg0 - pool0, len0=14B, off0=2
>>      seg1 - pool1, len1=20B, off1=128B
>>      seg2 - pool2, len2=20B, off2=0B
>>      seg3 - pool3, len3=512B, off3=0B
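
To make this example configuration concrete, here is a minimal sketch of
how an application might program it with the fields introduced by this
patch. The pool creation, port id, queue id and descriptor count are
assumptions, and the cast of the rte_eth_rxseg_split array to the rx_seg
pointer simply follows the rxconf field shown further down in the patch:

#include <rte_ethdev.h>
#include <rte_lcore.h>

/* Hypothetical helper: pool0..pool3 are mempools created elsewhere,
 * e.g. with rte_pktmbuf_pool_create(). Configures Rx queue 0 of
 * port_id with the four-segment split described above.
 */
static int
setup_split_rxq(uint16_t port_id, struct rte_mempool *pool0,
                struct rte_mempool *pool1, struct rte_mempool *pool2,
                struct rte_mempool *pool3)
{
    struct rte_eth_rxseg_split segs[] = {
        { .mp = pool0, .length = 14,  .offset = 2   }, /* seg0 */
        { .mp = pool1, .length = 20,  .offset = 128 }, /* seg1 */
        { .mp = pool2, .length = 20,  .offset = 0   }, /* seg2 */
        { .mp = pool3, .length = 512, .offset = 0   }, /* seg3 */
    };
    struct rte_eth_dev_info dev_info;
    struct rte_eth_rxconf rx_conf;
    int ret;

    ret = rte_eth_dev_info_get(port_id, &dev_info);
    if (ret != 0)
        return ret;

    rx_conf = dev_info.default_rxconf;
    /* These offloads may also need enabling at rte_eth_dev_configure(). */
    rx_conf.offloads |= RTE_ETH_RX_OFFLOAD_SCATTER |
                        RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT;
    rx_conf.rx_seg = (struct rte_eth_rxseg *)segs;
    rx_conf.rx_nseg = RTE_DIM(segs);

    /* mp is NULL: the per-segment descriptions provide the pools. */
    return rte_eth_rx_queue_setup(port_id, 0, 512, rte_socket_id(),
                                  &rx_conf, NULL);
}
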
>>
>> A packet 46 bytes long will look like the following:
>>      seg0 - 14B long @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
>>      seg1 - 20B long @ 128 in mbuf from pool1
>>      seg2 - 12B long @ 0 in mbuf from pool2
>>
>> A packet 1500 bytes long will look like the following:
>>      seg0 - 14B @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
>>      seg1 - 20B @ 128 in mbuf from pool1
>>      seg2 - 20B @ 0 in mbuf from pool2
>>      seg3 - 512B @ 0 in mbuf from pool3
>>      seg4 - 512B @ 0 in mbuf from pool3
>>      seg5 - 422B @ 0 in mbuf from pool3
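
On the receive side the split packet arrives as an ordinary
multi-segment mbuf chain, so it can be inspected with the standard
mbuf fields. A purely illustrative helper that prints a layout like
the one above:

#include <stdio.h>
#include <rte_mbuf.h>

/* Walk the mbuf chain of a packet received on a buffer-split queue.
 * Each segment is a regular mbuf taken from the pool configured for
 * its position in the split description array.
 */
static void
print_split_layout(const struct rte_mbuf *pkt)
{
    const struct rte_mbuf *seg;
    uint16_t i = 0;

    printf("pkt_len=%u nb_segs=%u\n", pkt->pkt_len, pkt->nb_segs);
    for (seg = pkt; seg != NULL; seg = seg->next, i++)
        printf("  seg%u - %uB at data_off %u in mbuf from pool %s\n",
               i, seg->data_len, seg->data_off, seg->pool->name);
}
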
>>
>> The offload RTE_ETH_RX_OFFLOAD_SCATTER must be present and
>> configured to support the new buffer split feature (if rx_nseg
>> is greater than one).
>>
>> The split limitations imposed by the underlying PMD are reported
>> in the newly introduced rte_eth_dev_info->rx_seg_capa field.
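
Before requesting a split configuration the application can therefore
verify both the offload flag and the reported limits. A sketch, assuming
the rx_seg_capa layout from this series (e.g. a max_nseg field); the
exact field set is defined by the patch itself:

#include <rte_ethdev.h>

/* Returns non-zero if 'port_id' can split packets into 'nseg' segments. */
static int
can_do_buffer_split(uint16_t port_id, uint16_t nseg)
{
    struct rte_eth_dev_info dev_info;

    if (rte_eth_dev_info_get(port_id, &dev_info) != 0)
        return 0;
    if ((dev_info.rx_offload_capa & RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT) == 0)
        return 0;
    return dev_info.rx_seg_capa.max_nseg >= nseg;
}
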
>>
>> The new approach allows splitting the ingress packets into
>> multiple parts pushed to memory with different attributes.
>> For example, the packet headers can be pushed to the embedded
>> data buffers within mbufs and the application data into
>> external buffers attached to mbufs allocated from
>> different memory pools. The memory attributes of the split
>> parts may differ as well - for example, the application data
>> may be pushed into external memory located on a dedicated
>> physical device, say a GPU or NVMe. This improves the
>> flexibility of the DPDK receiving datapath while preserving
>> compatibility with the existing API.
>>
>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
>> Acked-by: Ajit Khaparde <ajit.khaparde at broadcom.com>
>> Acked-by: Jerin Jacob <jerinj at marvell.com>
> 
> <...>
> 
>> +/**
>>    * A structure used to configure an RX ring of an Ethernet port.
>>    */
>>   struct rte_eth_rxconf {
>> @@ -977,6 +998,46 @@ struct rte_eth_rxconf {
>>       uint16_t rx_free_thresh; /**< Drives the freeing of RX descriptors. */
>>       uint8_t rx_drop_en; /**< Drop packets if no descriptors are available. */
>>       uint8_t rx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
>> +    uint16_t rx_nseg; /**< Number of descriptions in rx_seg array. */
>> +    /**
>> +     * Points to the array of segment descriptions. Each array element
>> +     * describes the properties of the corresponding segment of the
>> +     * receiving buffer.
>> +     *
>> +     * The supported capabilities of receive segmentation are reported
>> +     * in the rte_eth_dev_info->rx_seg_capa field.
>> +     *
>> +     * If the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag is set in the
>> +     * offloads field, the PMD will split the received packets into
>> +     * multiple segments according to the specification in the
>> +     * description array:
>> +     *
>> +     * - the first network buffer will be allocated from the memory
>> +     *   pool specified in the first array element, the second buffer
>> +     *   from the pool in the second element, and so on.
>> +     *
>> +     * - the offsets from the segment description elements specify
>> +     *   the data offset from the buffer beginning except for the first
>> +     *   mbuf. For this one the offset is added to RTE_PKTMBUF_HEADROOM.
>> +     *
>> +     * - the lengths in the elements define the maximal amount of data
>> +     *   received into each segment. Receiving starts with filling up
>> +     *   the first mbuf data buffer up to the specified length. If
>> +     *   there is data remaining (the packet is longer than the buffer
>> +     *   in the first mbuf) the following data will be pushed to the
>> +     *   next segment up to its own length, and so on.
>> +     *
>> +     * - if the length in a segment description element is zero,
>> +     *   the actual buffer size will be deduced from the appropriate
>> +     *   memory pool properties.
>> +     *
>> +     * - if there are not enough elements to describe the buffer for
>> +     *   an entire packet of maximal length, the following parameters
>> +     *   will be used for all remaining segments:
>> +     *     - pool from the last valid element
>> +     *     - the buffer size from this pool
>> +     *     - zero offset
>> +     */
>> +    struct rte_eth_rxseg *rx_seg;
> 
> "struct rte_eth_rxconf" is very commonly used, I think all applications
> does the 'rte_eth_rx_queue_setup()', but "buffer split" is not a common
> usage,
> 
> I am against the "struct rte_eth_rxseg *rx_seg;" field creating this
> much noise in the "struct rte_eth_rxconf" documentation.
> As mentioned before, can you please move the above detailed
> documentation to where "struct rte_eth_rxseg" is defined, and in this
> struct put a single comment for "struct rte_eth_rxseg *rx_seg"?

+1

