[dpdk-dev] [RFC] ethdev: add a field for rte_eth_rxq_info

Andrew Rybchenko arybchenko at solarflare.com
Thu Jun 25 11:06:18 CEST 2020


On 6/24/20 9:32 PM, Ferruh Yigit wrote:
> On 6/24/2020 9:52 AM, Ferruh Yigit wrote:
>> On 6/24/2020 4:48 AM, Chengchang Tang wrote:
>>>
>>> On 2020/6/23 17:30, Andrew Rybchenko wrote:
>>>> On 6/23/20 9:48 AM, Chengchang Tang wrote:
>>>>> In common practice, PMD configure the rx_buf_size according to the data
>>>>> room size of the object in mempool. But in fact the final value is related
>>>>> to the specifications of hw, and its values will affect the number of
>>>>> fragments in recieving pkts.
>>>>>
>>>>> At present, we seem to have no way to espose relevant information to upper
>>>>> layer users.
>>>>>
>>>>> Add a field named rx_bufsize in rte_eth_rxq_info to indicate the buffer
>>>>> size used in recieving pkts for hw.
>>>>>
>>>>
>>>> I'm OK with the change in general.
>>>> I'm unsure which name to use: 'rx_buf_size' or 'rx_bursize',
>>>> since I found both 'min_rx_buf_size' and 'min_rx_bufsize' in
>>>> ethdev.
>>>>
>>>> I think it is important to update PMDs which provides the
>>>> information to fill the field in.
>>>
>>> My plan is to divide the subsequent series into two patches,
>>> one to modify rte_eth_rxq_info, and one to add our hns3 PMD
>>> implementation of rxq_info_get. Should i update all the PMDs
>>> that provide this information and test programs such as
>>> testpmd at the same time?
>>
>> Hi Chengchang, Andrew,
>>
>> No objection to the change, but it should be crystal clear what is added. These
>> are for PMD developers to implement and when it is not clear we end up having
>> different implementations and inconsistencies.
>>
>> There is already some confusion for the Rx packet size etc.. my concern is
>> adding more to it, here all we have is "size of RX buffer." comment, I think we
>> need more.
> 
> cc'ed a few more people.
> 
> Back to this favorite topic, how to configure/limit the packet size.
> 
> Can you please help to have a common/correct understanding? I tried to clarify
> as much as I got it, any comment welcome. (I know it is long, please bare with me)
> 
> 
> The related config options I can see,
> 1) rte_eth_conf->rxmode->max_rx_pkt_len
> 2) rte_eth_dev_info->max_rx_pktlen
> 3) DEV_RX_OFFLOAD_JUMBO_FRAME
> 4) rte_eth_dev->data->mtu
> 5) DEV_RX_OFFLOAD_SCATTER
> 6) dev->data->scattered_rx
> 7) rte_eth_dev_info->min_mtu, rte_eth_dev_info->max_mtu
> 
> 'mtu' (4): Both for Tx and Rx. The network layer payload length. Default value
> 'RTE_ETHER_MTU'.
> 
> 'max_rx_pkt_len' (1): Only for Rx, maximum Rx frame length configured by
> application.
> 
> 'max_rx_pktlen' (2): Device reported value on what maximum Rx frame length it
> can receive. Application shouldn't set Rx frame length more than this value.
> 
> 'DEV_RX_OFFLOAD_JUMBO_FRAME' (3): Device Jumbo Frame capability.
> 	When not enabled the Rx frame length is 'MTU' + overhead
> 	When enabled Rx frame length is 'max_rx_pkt_len'
> 
> 'DEV_RX_OFFLOAD_SCATTER' (5): Capability to scatter packet to multiple
> descriptor by device and driver converting this to chained mbuf.
> 
> 'dev->data->scattered_rx' (6): The current status of driver scattered Rx, in
> device data mostly for PMD internal usage.
> 
> 'rte_eth_dev_info->min_mtu' & 'rte_eth_dev_info->max_mtu' (7): minimum and
> maximum MTU values device supported.
> 'max_mtu' == 'max_rx_pkt_len' - L2_OVERHEAD.
> 
> 
> I can see two different limits,
> a) The Rx frame length limit that device can receive from wire. Any packet
> larger than this size will be dropped by device in an early stage.
> b) The Rx buffer length limit that received packets are written to. Device
> shouldn't DMA larger than reserved buffer size.
> 
> If device supports scattered Rx to multiple descriptors, it can be possible to
> configure (a) > (b).
> Otherwise configuration have to be (b) >= (a).
> 
> For example if the mbuf size is 2Kb and the device can receive up to 9000 bytes.
> Options are:
> - If device supports it, large packet will be scattered on multiple mbufs
> - or need to configure device Rx frame length to 2K (mbuf size)
> - or need to allocate mbuf big enough to get largest possible packet (9000)
> 
> 
> 
> Issues I see:
> -------------
> 
> i) Although the code clearly says 'max_rx_pkt_len' is only valid when jumbo
> frames enabled, some drivers are taking it account always.

Ack, that's not good.

> ii) Some drivers enable 'scattered_rx' & 'jumbo frame' implicitly, without
> having 'DEV_RX_OFFLOAD_JUMBO_FRAME' or 'DEV_RX_OFFLOAD_SCATTER' requested by
> application.

Ack that it is a problem especially for scatter.

> iii) Having both 'mtu' & 'max_rx_pkt_len' are confusing, although they are not
> exactly same thing they are related. Difference is MTU applies for Tx too, and
> L2 network layer overhead is not included.
> 'MTU' can be more interested by upper layers, 'max_rx_pkt_len' is more driver
> level information. And driver should know how to convert one to another.

Agree

> iv) 'max_rx_pkt_len' provided as part of 'rte_eth_dev_configure()' and there is
> no API to update it later.
> 'mtu' is not part of 'rte_eth_dev_configure()', it can only be updated later
> with specific API.
> But driver have to keep these two values consistent.

Agree

> v) 'max_rx_pktlen' & 'max_mtu' reports from driver are redundant information.
> Again they are not same thing, but correlated.

Agree

> 
> Suggested changes:
> -----------------
> 
> Overall unify 'max_rx_pkt_len' & 'mtu' as much as possible, at first step:
> 
> i) Make 'max_rx_pkt_len' independent from 'DEV_RX_OFFLOAD_JUMBO_FRAME', so
> 'max_rx_pkt_len' value will be always valid, jumbo frame enabled or not.

It will make handling of the max_rx_pkt_len mandatory in
all network PMDs.

> 
> ii) in '.dev_configure' convert 'max_rx_pkt_len' value to 'mtu' value, this will
> be only point 'max_rx_pkt_len' is used, after that point PMD will always use
> 'mtu' value.

I'm not sure that it is a right direction.
Above you say that 'max_rx_pkt_len' is more driver level and
I agree with it. I guess most drivers operate it finally
(i.e. configure underlying HW in terms of max_rx_pkt_len,
not MTU). So, converted from max_rx_pkt_len to MTU on ethdev
level and covered back from MTU to max_rx_pkt_len in drivers.

> Even don't reflect 'rte_eth_dev_set_mtu()' changes to 'max_rx_pkt_len' anymore.

Not sure that I get it. max_rx_pkt_len is used on dev_configure
only. Is it reported on get somewhere?

> 
> iii) Don't make 'max_rx_pkt_len' a mandatory config option, let it be '0' by
> application, in that case 'rte_eth_dev_configure()' will set
> "'max_rx_pkt_len' = RTE_ETHER_MAX_LEN" if 'DEV_RX_OFFLOAD_JUMBO_FRAME' disabled
> "'max_rx_pkt_len' = 9000 if 'DEV_RX_OFFLOAD_JUMBO_FRAME' enabled

Why 9000? IMHO, if max_rx_pkt_len is 0, just use value derived
from MTU.

> 
> iv) Allow implicit update of 'DEV_RX_OFFLOAD_JUMBO_FRAME' on MTU set, since
> setting a large MTU implies the jumbo frame request. And there is no harm to
> application.

Yes and I'd deprecate DEV_RX_OFFLOAD_JUMBO_FRAME.

> 
> v) Do NOT allow implicit update of 'DEV_RX_OFFLOAD_SCATTER' on MTU set (when Rx
> frame length > Rx buffer length), since application may not be capable of
> parsing chained mbufs. Instead fails the MTU set in that case.
> [This can break some applications, relying on this implicit set.]

Yes, definitely.

> 
> 
> Any comments?
> 
> 
> 
> Additional details:
> -------------------
> 
> Behavior of some drivers:
> 
> What igb & ixgbe does
> - Set Rx frame limit (a) using 'max_rx_pkt_len' (1)
> - Set Rx buffer limit (b) using mbuf data size
> - Enable Scattered Rx (5 & 6) if the Rx frame limit (a) bigger than Rx buffer
> limit (b) (even user not requested for it)
> 
> What i40e does same as above, only differences
> - Return error if jumbo frame enabled and 'max_rx_pkt_len' < RTE_ETHER_MAX_LEN
> 
> sfc:
> - Set Rx frame limit (a)
>   - using 'max_rx_pkt_len' (1) when jumbo frame enabled
>   - using 'mtu' when jumbo frame not enabled.
> - Set Rx buffer limit (b) using mbuf data size
> - If Rx frame limit (a) bigger than Rx buffer limit (b), and user not requested
> 'DEV_RX_OFFLOAD_SCATTER', return error.

Ack

> 
> octeontx2:
> - Set Rx frame limit (a) using 'max_rx_pkt_len' (1). Implicitly enable jumbo
> frame based on 'max_rx_pkt_len'.
> - I can't able find how Rx buffer limit (b) set
> - Enable Scattered Rx (5) if the Rx frame limit (a) bigger than Rx buffer limit
> (b) (even user not requested for it). 'dev->data->scattered_rx' not set at all.
> 
> 
>> Adding a PMD implementation and testpmd updates helps to clarify the
>> intention/usage, so I suggest sending them as a single patch with this one.
>>
>> Updating all PMDs is a bigger ask and sometimes too hard because of lack of
>> knowledge on the internals of other PMDs, although this is causing feature gaps
>> time to time, we are not mandating this to developers, so please update as many
>> PMD as you can, that you are confident, rest should be done by their maintainers.
>>
>>>>
>>>>> Signed-off-by: Chengchang Tang <tangchengchang at huawei.com>
>>>>
>>>> Acked-by: Andrew Rybchenko <arybchenko at solarflare.com>
>>>>
>>>>> ---
>>>>>  lib/librte_ethdev/rte_ethdev.h | 1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
>>>>> index 0f6d053..82b7e98 100644
>>>>> --- a/lib/librte_ethdev/rte_ethdev.h
>>>>> +++ b/lib/librte_ethdev/rte_ethdev.h
>>>>> @@ -1306,6 +1306,7 @@ struct rte_eth_rxq_info {
>>>>>  	struct rte_eth_rxconf conf; /**< queue config parameters. */
>>>>>  	uint8_t scattered_rx;       /**< scattered packets RX supported. */
>>>>>  	uint16_t nb_desc;           /**< configured number of RXDs. */
>>>>> +	uint16_t rx_bufsize;        /**< size of RX buffer. */
>>>>>  } __rte_cache_min_aligned;
>>>>>
>>>>>  /**


More information about the dev mailing list