[dpdk-dev] [RFC] ethdev: introduce Rx buffer split

Ferruh Yigit ferruh.yigit at intel.com
Tue Oct 13 23:59:21 CEST 2020


On 10/12/2020 10:56 AM, Slava Ovsiienko wrote:
> Hi, Andrew
> 
> Thank you for the comments.
> 
> We have two approaches for specifying multiple segments to split Rx packets:
> 1. update the queue configuration structure
> 2. introduce a new rx_queue_setup_ex() routine with extra parameters.
> 
> For [1] my only actual dislike is that we would have multiple places to specify
> the pool - in rx_queue_setup() and in the config structure. So, we would have to
> implement some checking (if the offload flag is set, check that the mp parameter
> is NULL and that the segment description array pointer/size is provided; if the
> offload flag is not set, check that the description array is empty).
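
As a rough illustration, the checking for option [1] could look like the sketch
below in the common part of rte_eth_rx_queue_setup(). The rx_seg/rx_nseg field
names and the DEV_RX_OFFLOAD_BUFFER_SPLIT flag are assumptions taken from this
discussion, not an existing API.

#include <errno.h>
#include <stdbool.h>
#include <rte_ethdev.h>  /* rte_eth_rxconf, rte_mempool; rx_seg/rx_nseg assumed added */

/* Hypothetical check: the segment array and the single pool must not be mixed. */
static int
check_rx_split_conf(const struct rte_eth_rxconf *rx_conf,
                    const struct rte_mempool *mp)
{
        bool split = (rx_conf->offloads & DEV_RX_OFFLOAD_BUFFER_SPLIT) != 0;

        if (split) {
                /* Offload requested: descriptions mandatory, single pool forbidden. */
                if (mp != NULL || rx_conf->rx_seg == NULL || rx_conf->rx_nseg == 0)
                        return -EINVAL;
        } else {
                /* No split: the description array must be empty, pool mandatory. */
                if (rx_conf->rx_seg != NULL || rx_conf->rx_nseg != 0 || mp == NULL)
                        return -EINVAL;
        }
        return 0;
}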
> 
>> @Thomas, @Ferruh: I'd like to hear what other ethdev maintainers think
>> about it.
> 
> Yes, it would be very nice to hear extra opinions. Do we think that providing
> an extra API function is worse than extending the existing structure, introducing
> some conditional ambiguity and complicating the parameter compliance
> check?
> 

I think the decision was already given with the deprecation notice, which says 
``rte_eth_rxconf`` will be updated for this.

With a new API we would also need to create a new dev_ops, and I am not sure 
about creating a new dev_ops for a single PMD.

A PMD that supports this feature would need two dev_ops that are fairly close 
to each other; as Andrew mentioned, this is duplication.

And from the user perspective, two setup functions with overlapping 
functionality can be confusing.

+1 to having a single setup function but updating the config. I can see v5 was 
sent this way, I will check it.
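
For reference, one possible shape of the rxconf-based approach is sketched
below; the struct layout and field names are assumptions, the accepted layout
is whatever the v5 patch defines.

#include <stdint.h>

struct rte_mempool; /* from librte_mempool */

/* Hypothetical per-segment split description (names are assumptions). */
struct rte_eth_rxseg {
        struct rte_mempool *mp; /* pool to allocate this segment from */
        uint16_t length;        /* data length of the segment, in bytes */
        uint16_t offset;        /* data offset from the mbuf buffer start */
};

/* Hypothetical extension of rte_eth_rxconf (only the relevant fields shown). */
struct rte_eth_rxconf {
        /* ... existing fields ... */
        uint64_t offloads;             /* DEV_RX_OFFLOAD_* flags */
        struct rte_eth_rxseg *rx_seg;  /* split description array */
        uint16_t rx_nseg;              /* number of entries in rx_seg */
};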


> Now I'm updating the existing version of the patch based on rx_queue_setup_ex()
> and then could prepare the version based on the configuration structure;
> it is not a problem - the approaches are very similar, we just have to choose
> the most relevant one.
> 
> With best regards, Slava
> 
>> -----Original Message-----
>> From: Andrew Rybchenko <Andrew.Rybchenko at oktetlabs.ru>
>> Sent: Monday, October 12, 2020 11:45
>> To: Slava Ovsiienko <viacheslavo at nvidia.com>; dev at dpdk.org
>> Cc: Thomas Monjalon <thomasm at mellanox.com>;
>> stephen at networkplumber.org; ferruh.yigit at intel.com; Shahaf Shuler
>> <shahafs at nvidia.com>; olivier.matz at 6wind.com; jerinjacobk at gmail.com;
>> maxime.coquelin at redhat.com; david.marchand at redhat.com; Asaf Penso
>> <asafp at nvidia.com>
>> Subject: Re: [dpdk-dev] [RFC] ethdev: introduce Rx buffer split
>>
>> Hi Slava,
>>
>> I'm sorry for the late reply. See my notes below.
>>
>> On 10/1/20 11:54 AM, Slava Ovsiienko wrote:
>>> Hi, Andrew
>>>
>>> Thank you for the comments, please see my replies below.
>>>
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko <arybchenko at solarflare.com>
>>>> Sent: Thursday, September 17, 2020 19:55
>>>> To: Slava Ovsiienko <viacheslavo at nvidia.com>; dev at dpdk.org
>>>> Cc: Thomas Monjalon <thomasm at mellanox.com>;
>>>> stephen at networkplumber.org; ferruh.yigit at intel.com; Shahaf Shuler
>>>> <shahafs at nvidia.com>; olivier.matz at 6wind.com;
>> jerinjacobk at gmail.com;
>>>> maxime.coquelin at redhat.com; david.marchand at redhat.com; Asaf
>> Penso
>>>> <asafp at nvidia.com>
>>>> Subject: Re: [dpdk-dev] [RFC] ethdev: introduce Rx buffer split
>>>>
>>> [snip]
>>>>>
>>>>> For example, let's suppose we configured the Rx queue with the
>>>>> following segments:
>>>>> seg0 - pool0, len0=14B, off0=RTE_PKTMBUF_HEADROOM
>>>>> seg1 - pool1, len1=20B, off1=0B
>>>>> seg2 - pool2, len2=20B, off2=0B
>>>>> seg3 - pool3, len3=512B, off3=0B
>>>>>
>>>>> The packet 46 bytes long will look like the following:
>>>>> seg0 - 14B long @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
>>>>> seg1 - 20B long @ 0 in mbuf from pool1
>>>>> seg2 - 12B long @ 0 in mbuf from pool2
>>>>>
>>>>> The packet 1500 bytes long will look like the following:
>>>>> seg0 - 14B @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
>>>>> seg1 - 20B @ 0 in mbuf from pool1
>>>>> seg2 - 20B @ 0 in mbuf from pool2
>>>>> seg3 - 512B @ 0 in mbuf from pool3
>>>>> seg4 - 512B @ 0 in mbuf from pool3
>>>>> seg5 - 422B @ 0 in mbuf from pool3
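
Expressed with the rte_eth_rxseg description used in this RFC, the queue
configuration above could be written roughly as follows (pool0..pool3 are
mempools created elsewhere, and the exact field names are not final):

struct rte_eth_rxseg seg_desc[] = {
        { .mp = pool0, .length = 14,  .offset = RTE_PKTMBUF_HEADROOM },
        { .mp = pool1, .length = 20,  .offset = 0 },
        { .mp = pool2, .length = 20,  .offset = 0 },
        { .mp = pool3, .length = 512, .offset = 0 }, /* reused for the tail */
};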
>>>>
>>>> The behaviour is logical, but what should happen if the HW can't do it, i.e.
>>>> can't use the last segment many times? Should the configuration be rejected
>>>> if the provided segments are insufficient to fit an MTU-sized packet? How
>>>> should the limitation be reported?
>>>> (I'm still trying to argue that SCATTER and BUFFER_SPLIT should be
>>>> independent).
>>>
>>> BUFFER_SPLIT is rather a way to tune SCATTER. Currently scattering
>>> happens unconditionally on mbuf data buffer boundaries (we have reserved
>>> headroom space in the first mbuf and fill this one to the end of the buffer;
>>> the next mbuf buffers might be filled completely). BUFFER_SPLIT provides
>>> a way to specify the desired points at which to split the packet, not just
>>> blindly follow buffer boundaries. There is a check implemented in the
>>> common part that each split segment fits the mbuf allocated from the
>>> appropriate pool.
>>> The PMD should do an extra check internally whether it supports the requested
>>> split settings; if not, the call will be rejected.
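
A rough sketch of that common-part check, assuming the rte_eth_rxseg layout
discussed in this thread, could be:

#include <errno.h>
#include <rte_mbuf.h>   /* rte_pktmbuf_data_room_size() */

/* Verify each requested segment fits the data room of mbufs from its pool. */
static int
check_split_segments(const struct rte_eth_rxseg *segs, uint16_t nseg)
{
        uint16_t i;

        for (i = 0; i < nseg; i++) {
                uint16_t room = rte_pktmbuf_data_room_size(segs[i].mp);

                if ((uint32_t)segs[i].offset + segs[i].length > room)
                        return -EINVAL; /* segment does not fit the mbuf */
        }
        return 0;
}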
>>>
>>
>> @Thomas, @Ferruh: I'd like to hear what other ethdev maintainers think
>> about it.
>>
>>> [snip]
>>>>
>>>> I dislike the idea to introduce new device operation.
>>>> rte_eth_rxconf has reserved space and BUFFER_SPLIT offload will mean
>>>> that PMD looks at the split configuration location there.
>>>>
>>> We considered the approach of pushing the split settings to the rxconf
>>> structure.
>>>
>>> [http://patches.dpdk.org/patch/75205/]
>>> But it seems there are some issues:
>>>
>>> - the split configuration description requires a variable-length
>>> array (due to variations in the number of segments), so the rte_eth_rxconf
>>> structure would have a variable length (not nice, IMO).
>>>
>>> We could push a pointer to an array of rte_eth_rxseg instead, but we would
>>> lose the single-structure (and contiguous memory) simplicity; this
>>> approach has no advantage over specifying the split configuration
>>> as parameters of setup_ex().
>>>
>>
>> I think it has a huge advantage to avoid an extra device operation.
>>
>>> - it would introduce ambiguity: rte_eth_rx_queue_setup()
>>> specifies a single mbuf pool as a parameter. What should we do with
>>> it? Set it to NULL? Treat it as the first pool? I would prefer to specify
>>> all split segments in a uniform fashion, i.e. as an array of rte_eth_rxseg
>>> structures (which can easily be extended with extra segment
>>> attributes if needed). So, in my opinion, we should remove/replace the
>>> pool parameter in rx_queue_setup (by introducing a new function).
>>>
>>
>> I'm trying to resolve the ambiguity as described above (see BUFFER_SPLIT vs
>> SCATTER): use the pool pointer for the tail segments, with respect to the
>> SCATTER capability.
>>
>>> - specifying a new extended setup routine has the advantage that we
>>> do not have to update any PMD code in the existing implementations of
>>> rte_eth_rx_queue_setup().
>>
>> It is not required anyway, since it is controlled by the new offload flag. If
>> the offload is not supported, the new field is invisible to the PMD (it simply
>> ignores it).
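
In other words, the PMD-side selection could be as simple as the fragment below
(setup_queue_from_segments() and setup_queue_single_pool() are hypothetical
internal helpers, and rx_seg/rx_nseg are the assumed new rxconf fields):

/* Inside a split-capable PMD's rx_queue_setup callback (sketch). */
if (rx_conf->offloads & DEV_RX_OFFLOAD_BUFFER_SPLIT) {
        /* New path: program the queue from the segment description array. */
        ret = setup_queue_from_segments(rxq, rx_conf->rx_seg, rx_conf->rx_nseg);
} else {
        /* Legacy path: single pool; the new fields are simply never read. */
        ret = setup_queue_single_pool(rxq, mp);
}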
>>
>>>
>>> If a PMD supports BUFFER_SPLIT (or another related feature) it just has to
>>> provide rte_eth_rx_queue_setup_ex() and check the DEV_RX_OFFLOAD_BUFFER_SPLIT
>>> (or HEADER_SPLIT, or whatever feature) flag it supports. The common code does
>>> not check the feature flags - that is up to the PMDs themselves. In order to
>>> configure a PMD to perform the arbitrary desired Rx splitting, the application
>>> should check for DEV_RX_OFFLOAD_BUFFER_SPLIT in the port capabilities; if
>>> found, set DEV_RX_OFFLOAD_BUFFER_SPLIT in the configuration and call
>>> rte_eth_rx_queue_setup_ex().
>>> And this approach can be followed for any other split-related feature.
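
An application-side flow along these lines might look like the sketch below;
rte_eth_rx_queue_setup_ex() and its parameter list are part of this proposal,
not an existing API, and seg_desc[] is the split description array from the
earlier example:

struct rte_eth_dev_info dev_info;
struct rte_eth_rxconf rxconf = { .offloads = 0 };
int ret;

rte_eth_dev_info_get(port_id, &dev_info);
if (dev_info.rx_offload_capa & DEV_RX_OFFLOAD_BUFFER_SPLIT) {
        /* Buffer split supported: request it and pass the segment array. */
        rxconf.offloads |= DEV_RX_OFFLOAD_BUFFER_SPLIT;
        ret = rte_eth_rx_queue_setup_ex(port_id, queue_id, nb_rx_desc,
                                        socket_id, &rxconf,
                                        seg_desc, RTE_DIM(seg_desc));
} else {
        /* Buffer split not supported: fall back to the single-pool setup. */
        ret = rte_eth_rx_queue_setup(port_id, queue_id, nb_rx_desc,
                                     socket_id, &rxconf, mp);
}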
>>>
>>> With best regards, Slava
>>>
>>
> 


