[dpdk-dev] [PATCH v4 3/5] net/ice: fix max mtu size packets with vlan tag cannot be received by default

Ferruh Yigit ferruh.yigit at intel.com
Mon Oct 19 15:07:56 CEST 2020


On 10/19/2020 11:49 AM, Ananyev, Konstantin wrote:
> 
>>> -----Original Message-----
>>> From: Ferruh Yigit <ferruh.yigit at intel.com>
>>> Sent: Wednesday, October 14, 2020 11:38 PM
>>> To: Zhang, Qi Z <qi.z.zhang at intel.com>; Yang, SteveX
>>> <stevex.yang at intel.com>; Ananyev, Konstantin
>>> <konstantin.ananyev at intel.com>; dev at dpdk.org
>>> Cc: Zhao1, Wei <wei.zhao1 at intel.com>; Guo, Jia <jia.guo at intel.com>; Yang,
>>> Qiming <qiming.yang at intel.com>; Wu, Jingjing <jingjing.wu at intel.com>;
>>> Xing, Beilei <beilei.xing at intel.com>; Stokes, Ian <ian.stokes at intel.com>
>>> Subject: Re: [dpdk-dev] [PATCH v4 3/5] net/ice: fix max mtu size packets
>>> with vlan tag cannot be received by default
>>>
>>> On 9/30/2020 3:32 AM, Zhang, Qi Z wrote:
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Yang, SteveX <stevex.yang at intel.com>
>>>>> Sent: Wednesday, September 30, 2020 9:32 AM
>>>>> To: Zhang, Qi Z <qi.z.zhang at intel.com>; Ananyev, Konstantin
>>>>> <konstantin.ananyev at intel.com>; dev at dpdk.org
>>>>> Cc: Zhao1, Wei <wei.zhao1 at intel.com>; Guo, Jia <jia.guo at intel.com>;
>>>>> Yang, Qiming <qiming.yang at intel.com>; Wu, Jingjing
>>>>> <jingjing.wu at intel.com>; Xing, Beilei <beilei.xing at intel.com>
>>>>> Subject: RE: [PATCH v4 3/5] net/ice: fix max mtu size packets with
>>>>> vlan tag cannot be received by default
>>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Zhang, Qi Z <qi.z.zhang at intel.com>
>>>>>> Sent: Wednesday, September 30, 2020 8:35 AM
>>>>>> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Yang,
>>> SteveX
>>>>>> <stevex.yang at intel.com>; dev at dpdk.org
>>>>>> Cc: Zhao1, Wei <wei.zhao1 at intel.com>; Guo, Jia <jia.guo at intel.com>;
>>>>>> Yang, Qiming <qiming.yang at intel.com>; Wu, Jingjing
>>>>>> <jingjing.wu at intel.com>; Xing, Beilei <beilei.xing at intel.com>
>>>>>> Subject: RE: [PATCH v4 3/5] net/ice: fix max mtu size packets with
>>>>>> vlan tag cannot be received by default
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Ananyev, Konstantin <konstantin.ananyev at intel.com>
>>>>>>> Sent: Wednesday, September 30, 2020 7:02 AM
>>>>>>> To: Zhang, Qi Z <qi.z.zhang at intel.com>; Yang, SteveX
>>>>>>> <stevex.yang at intel.com>; dev at dpdk.org
>>>>>>> Cc: Zhao1, Wei <wei.zhao1 at intel.com>; Guo, Jia <jia.guo at intel.com>;
>>>>>>> Yang, Qiming <qiming.yang at intel.com>; Wu, Jingjing
>>>>>>> <jingjing.wu at intel.com>; Xing, Beilei <beilei.xing at intel.com>
>>>>>>> Subject: RE: [PATCH v4 3/5] net/ice: fix max mtu size packets with
>>>>>>> vlan tag cannot be received by default
>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Yang, SteveX <stevex.yang at intel.com>
>>>>>>>>> Sent: Monday, September 28, 2020 2:56 PM
>>>>>>>>> To: dev at dpdk.org
>>>>>>>>> Cc: Zhao1, Wei <wei.zhao1 at intel.com>; Guo, Jia
>>>>>>>>> <jia.guo at intel.com>; Yang, Qiming <qiming.yang at intel.com>;
>>> Zhang,
>>>>>>>>> Qi Z <qi.z.zhang at intel.com>; Wu, Jingjing
>>>>>>>>> <jingjing.wu at intel.com>; Xing, Beilei <beilei.xing at intel.com>;
>>>>>>>>> Ananyev, Konstantin <konstantin.ananyev at intel.com>; Yang,
>>> SteveX
>>>>>>>>> <stevex.yang at intel.com>
>>>>>>>>> Subject: [PATCH v4 3/5] net/ice: fix max mtu size packets with
>>>>>>>>> vlan tag cannot be received by default
>>>>>>>>>
>>>>>>>>> testpmd will initialize the default max packet length to 1518, which
>>>>>>>>> doesn't include the VLAN tag size in the Ether overhead. Once a
>>>>>>>>> max-MTU-length packet carrying a VLAN tag is sent, the packet length
>>>>>>>>> will exceed 1518, and the packet is dropped directly on the NIC hw side.
>>>>>>>>>
>>>>>>>>> ice can support dual VLAN tags, which need 8 more bytes of max
>>>>>>>>> packet size, so configure the correct max packet size in the
>>>>>>>>> dev_configure ops.
>>>>>>>>>
>>>>>>>>> Fixes: 50cc9d2a6e9d ("net/ice: fix max frame size")
>>>>>>>>>
>>>>>>>>> Signed-off-by: SteveX Yang <stevex.yang at intel.com>
>>>>>>>>> ---
>>>>>>>>>    drivers/net/ice/ice_ethdev.c | 11 +++++++++++
>>>>>>>>>    1 file changed, 11 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
>>>>>>>>> index cfd357b05..6b7098444 100644
>>>>>>>>> --- a/drivers/net/ice/ice_ethdev.c
>>>>>>>>> +++ b/drivers/net/ice/ice_ethdev.c
>>>>>>>>> @@ -3146,6 +3146,7 @@ ice_dev_configure(struct rte_eth_dev *dev)
>>>>>>>>>  	struct ice_adapter *ad =
>>>>>>>>>  		ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
>>>>>>>>>  	struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
>>>>>>>>> +	uint32_t frame_size = dev->data->mtu + ICE_ETH_OVERHEAD;
>>>>>>>>>  	int ret;
>>>>>>>>>
>>>>>>>>>  	/* Initialize to TRUE. If any of Rx queues doesn't meet the
>>>>>>>>> @@ -3157,6 +3158,16 @@ ice_dev_configure(struct rte_eth_dev *dev)
>>>>>>>>>  	if (dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG)
>>>>>>>>>  		dev->data->dev_conf.rxmode.offloads |= DEV_RX_OFFLOAD_RSS_HASH;
>>>>>>>>>
>>>>>>>>> +	/**
>>>>>>>>> +	 * Considering QinQ packet, max frame size should be equal or
>>>>>>>>> +	 * larger than total size of MTU and Ether overhead.
>>>>>>>>> +	 */
>>>>>>>>
>>>>>>>>> +	if (frame_size > dev->data->dev_conf.rxmode.max_rx_pkt_len) {
>>>>>>>>
>>>>>>>>
>>>>>>>> Why do we need this check?
>>>>>>>> Can we just call ice_mtu_set directly?
>>>>>>>
>>>>>>> I think that without that check we can silently overwrite the
>>>>>>> dev_conf.rxmode.max_rx_pkt_len value provided by the user.
>>>>>>
>>>>>> OK, I see
>>>>>>
>>>>>> But I still have one question:
>>>>>> dev->data->mtu is initialized to 1518 as default, but if the
>>>>>> application sets dev_conf.rxmode.max_rx_pkt_len = 1000 in dev_configure,
>>>>>> does that mean we will still set mtu to 1518? Is this expected?
>>>>>>
>>>>>
>>>>> max_rx_pkt_len should be at least larger than the mtu, so we should raise
>>>>> max_rx_pkt_len (e.g. 1518) to hold the expected mtu value (e.g. 1500).
>>>>
>>>> Ok, this describes the problem more generally, and is better for replacing
>>>> the existing code comment and commit log for easier understanding.
>>>> Please send a new version with the reworded text.
>>>>
>>>
>>> I didn't really get this set.
>>>
>>> Application explicitly sets 'max_rx_pkt_len' to '1518', and a frame bigger than
>>> this size is dropped.
>>
>> Sure, dropping oversized data is the normal case.
>>
>>> Isn't this what should be, why we are trying to overwrite user configuration
>>> in PMD to prevent this?
>>>
>>
>> But there is a conflict when the application/user sets mtu & max_rx_pkt_len at the same time.
>> This fix makes a decision for when that conflict occurs.
>> The MTU value comes directly from a user operation (e.g.: port config mtu 0 1500),
>> so max_rx_pkt_len will resize itself to fit the expected MTU value if it is smaller than MTU + Ether overhead.
>>
>>> During eth_dev allocation, mtu is set to the default '1500' by the ethdev layer.
>>> And testpmd sets 'max_rx_pkt_len' by default to '1518'.
>>> I think Qi's concern above is valid: what if the user sets 'max_rx_pkt_len' to '1000'
>>> and means it? The PMD will not honor the user config.
>>
>> I'm not sure, when 'mtu' is set to '1500' and 'max_rx_pkt_len' to '1000', what behavior is expected.
>> If the 'max_rx_pkt_len' value is kept, that means the larger 'mtu' will be invalid.
>>
>>>
>>> Why not simply increase the default 'max_rx_pkt_len' in testpmd?
>>>
>> The default 'max_rx_pkt_len' has been initialized to a generic value (1518) and the default 'mtu' is '1500' in testpmd,
>> but that isn't suitable for NIC drivers whose Ether overhead is larger than 18 (e.g.: ice, i40e) if the 'mtu' value is preferable.
>>
>>> And I guess even better what we need is to tell to the application what the
>>> frame overhead PMD accepts.
>>> So the application can set proper 'max_rx_pkt_len' value per port for a
>>> given/requested MTU value.
>>> @Ian, cc'ed, was complaining about almost the same thing years ago; the PMD
>>> overhead macros and 'max_mtu'/'min_mtu' were added because of that, perhaps
>>> he has a solution now?
> 
>  From my perspective the main problem here:
> We have 2 different variables for nearly the same thing:
> rte_eth_dev_data.mtu and rte_eth_dev_data.dev_conf.max_rx_pkt_len.
> and 2 different API to update them: dev_mtu_set() and dev_configure().

According to the API, 'max_rx_pkt_len' is 'Only used if JUMBO_FRAME enabled'.
Although I am not sure that is practically what is done by all drivers.

> And inside majority of Intel PMDs we don't keep these 2 variables in sync:
> - mtu_set() will update both variables.
> - dev_configure() will update only max_rx_pkt_len, but will keep mtu intact.
> 
> This patch fixes this inconsistency, which I think is a good thing.
> Though yes, it introduces change in behaviour.
> 
> Let say the code:
> rte_eth_dev_set_mtu(port, 1500);
> dev_conf.max_rx_pkt_len = 1000;
> rte_eth_dev_configure(port, 1, 1, &dev_conf);
> 

'rte_eth_dev_configure()' is one of the first APIs called; it is called before 
'rte_eth_dev_set_mtu()'.

When 'rte_eth_dev_configure()' is called, MTU is set to '1500' by default by 
ethdev layer, so it is not user configuration, but 'max_rx_pkt_len' is.

And later, when 'rte_eth_dev_set_mtu()' is called, both MTU and 'max_rx_pkt_len' 
are updated (mostly).


> Before the patch will result:
> mtu==1500, max_rx_pkt_len=1000;  //out of sync looks wrong to me
> 
> After the patch:
> mtu=1500, max_rx_ptk_len=1518; // in sync, change in behaviour.
> 
> If you think we need to preserve current behaviour,
> then I suppose the easiest thing would be to change dev_config() code
> to update mtu value based on max_rx_pkt_len.
> I.E: dev_configure {...; mtu_set(max_rx_pkt_len - OVERHEAD); ...}
> So the code snippet above will result:
> mtu=982,max_rx_pkt_len=1000;
> 

The 'max_rx_pkt_len' has been an annoyance for a long time; what do you think 
about just dropping it?

By default device will be up with default MTU (1500), later 
'rte_eth_dev_set_mtu' can be used to set the MTU, no frame size setting at all.

Will this work?



And for the short term, for the above Intel PMDs, there must be a place where this 
'max_rx_pkt_len' value is taken into account (mostly the 'start()' dev_ops); that 
function can be updated to use 'max_rx_pkt_len' only if JUMBO_FRAME is set, 
otherwise use the 'MTU' value.

Without 'start()' updated, the current logic won't work after stop & start anyway.



>   Konstantin
> 
>>
>>>
>>> And why this same thing can't happen to other PMDs? If this is a problem for
>>> all PMDs, we should solve in other level, not for only some PMDs.
>>>
>> No, all PMDs have the same issue. Another proposal:
>>   - rte_ethdev provides the single place that resizes 'max_rx_pkt_len', in rte_eth_dev_configure();
>>   - provide a uniform API for fetching the NIC's supported Ether overhead size;
>> Is it feasible?
>>
>>>>
>>>>> Generally, the mtu value can be adjusted by the user (e.g.: ip link
>>>>> set ens801f0 mtu 1400); hence, we just adjust max_rx_pkt_len to
>>>>> satisfy the mtu requirement.
>>>>>
>>>>>> Should we just call ice_mtu_set(dev, dev_conf.rxmode.max_rx_pkt_len)
>>>>>> here?
>>>>> ice_mtu_set(dev, mtu) will append the Ether overhead to
>>>>> frame_size/max_rx_pkt_len, so we need to pass the mtu value as the 2nd
>>>>> parameter, not the max_rx_pkt_len.
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> And please remove the above comment, since the Ether overhead is already
>>>>>>>> considered in ice_mtu_set.
>>>>> The Ether overhead is already considered in ice_mtu_set, but it also
>>>>> should be considered in the condition that decides whether ice_mtu_set
>>>>> needs to be invoked.
>>>>> So, this comment should perhaps remain before the if() condition.
>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> +	ret = ice_mtu_set(dev, dev->data->mtu);
>>>>>>>>> +	if (ret != 0)
>>>>>>>>> +		return ret;
>>>>>>>>> +	}
>>>>>>>>> +
>>>>>>>>>  	ret = ice_init_rss(pf);
>>>>>>>>>  	if (ret) {
>>>>>>>>>  		PMD_DRV_LOG(ERR, "Failed to enable rss for PF");
>>>>>>>>> --
>>>>>>>>> 2.17.1
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
> 


