[RFC v2 3/7] ethdev: introduce Rx queue based limit watermark
Andrew Rybchenko
andrew.rybchenko at oktetlabs.ru
Wed May 25 16:23:41 CEST 2022
On 5/25/22 16:58, Thomas Monjalon wrote:
> 25/05/2022 14:59, Andrew Rybchenko:
>> On 5/24/22 11:18, Thomas Monjalon wrote:
>>> 24/05/2022 04:50, Spike Du:
>>>> From: Thomas Monjalon <thomas at monjalon.net>
>>>>> 23/05/2022 05:01, Spike Du:
>>>>>> From: Stephen Hemminger <stephen at networkplumber.org>
>>>>>>> Spike Du <spiked at nvidia.com> wrote:
>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>>>> @@ -1249,7 +1249,16 @@ struct rte_eth_rxconf {
>>>>>>>> */
>>>>>>>> union rte_eth_rxseg *rx_seg;
>>>>>>>>
>>>>>>>> - uint64_t reserved_64s[2]; /**< Reserved for future fields */
>>>>>>>> + /**
>>>>>>>> + * Per-queue Rx limit watermark defined as percentage of Rx queue
>>>>>>>> + * size. If Rx queue receives traffic higher than this percentage,
>>>>>>>> + * the event RTE_ETH_EVENT_RX_LWM is triggered.
>>>>>>>> + */
>>>>>>>> + uint8_t lwm;
>>>>>>>> +
>>>>>>>> + uint8_t reserved_bits[3];
>>>>>>>> + uint32_t reserved_32s;
>>>>>>>> + uint64_t reserved_64s;
>>>>>>>
>>>>>>> Ok but, this is an ABI risk about this because reserved stuff was
>>>>>>> never required before.
>>>>>
>>>>> An ABI compatibility issue would be for an application compiled with an old
>>>>> DPDK, and loading a new DPDK at runtime.
>>>>> Let's think what would happen in such a case.
>>>>>
>>>>>>> Whenever is a reserved field is introduced the code (in this case
>>>>>>> rte_ethdev_configure).
>>>>>
>>>>> rte_eth_rx_queue_setup() is called with rx_conf->lwm not initialized.
>>>>> Then the library and drivers may interpret a wrong value.
>>>>>
>>>>>>> Best practice would have been to have the code require all reserved
>>>>>>> fields be
>>>>>>> 0 in earlier releases. In this case an application is like to define
>>>>>>> a watermark of zero; how will your code handle it.
>>>>>>
>>>>>> Having watermark of 0 is desired, which is the default. LWM of 0 means
>>>>>> the Rx Queue's watermark is not monitored, hence no LWM event is
>>>>> generated.
>>>>>
>>>>> The problem is to have a value not initialized.
>>>>> I think the best approach is to not expose the LWM value through this
>>>>> configuration structure.
>>>>> If the need is to get the current value, we should better add a field in the
>>>>> struct rte_eth_rxq_info.
>>>>
>>>> At least from all the dpdk app/example code, rxconf is initialized to 0 then setup
>>>> The Rx queue, if user follows these examples we should not have ABI issue.
>>>> Since many people are concerned about rxconf change, it's ok to remove the LWM
>>>> Field there.
>>>> Yes, I think we can add lwm into rte_eth_rxq_info. If we can set Rx queue's attribute,
>>>> We should have a way to get it.
>>>
>>> Unfortunately we cannot rely on examples for ABI compatibility.
>>> My suggestion of moving the field in rte_eth_rxq_info
>>> is not obvious because it could change the size of the struct.
>>> But thanks to __rte_cache_min_aligned, it is OK.
>>> Running pahole on this struct shows we have 50 bytes free:
>>> /* size: 128, cachelines: 2, members: 6 */
>>> /* padding: 50 */
>>>
>>> The other option would be to get the LWM value with a "get" function.
>>>
>>> What others prefer?
>>
>> If I'm not mistaken the changeset breaks ABI in any case since
>> it adds a new event and changes MAX.
>
> I think we can consider it as not a breakage (a rule should be added).
> Last time we had to update this enum, this was the conclusion:
> from https://git.dpdk.org/dpdk/commit/?id=44bf3c796be3f
> "
> The new event type addition in the enum is flagged as an ABI breakage,
> so an ignore rule is added for these reasons:
> - It is not changing value of existing types (except MAX)
> - The new value is not used by existing API if the event is not
> registered
> In general, it is safe adding new ethdev event types at the end of the
> enum, because of event callback registration mechanism.
> "
I see. Makes sense. Thanks for the information.
>> If so, I'd wait for the
>> next ABI breaking release and do not touch reserved fields.
>
> In any case, rte_eth_rxconf is not a good fit
> because we have a separate function for configuration.
Yes, it is better to avoid two ways to configure the same
thing.
> It should be either in rte_eth_rxq_info or a specific "get" function.
I see no point to introduce specific get function for a single
value. I think that rte_eth_rxq_info is the right way to get
current value.
More information about the dev
mailing list