[dpdk-dev] [PATCH v4 3/5] net/ice: fix max mtu size packets with vlan tag cannot be received by default

Ananyev, Konstantin konstantin.ananyev at intel.com
Mon Oct 19 16:28:52 CEST 2020


> 
> > >>>>>>>>>
> > >>>>>>>>> testpmd will initialize default max packet length to 1518 which
> > >>>>>>>>> doesn't include vlan tag size in ether overheader. Once, send the
> > >>>>>>>>> max mtu length packet with vlan tag, the max packet length will
> > >>>>>>>>> exceed 1518 that will cause packets dropped directly from NIC hw
> > >>>>> side.
> > >>>>>>>>>
> > >>>>>>>>> ice can support dual vlan tags that need more 8 bytes for max
> > >>>>>>>>> packet size, so, configures the correct max packet size in
> > >>>>>>>>> dev_config
> > >>>>>> ops.
> > >>>>>>>>>
> > >>>>>>>>> Fixes: 50cc9d2a6e9d ("net/ice: fix max frame size")
> > >>>>>>>>>
> > >>>>>>>>> Signed-off-by: SteveX Yang <stevex.yang at intel.com>
> > >>>>>>>>> ---
> > >>>>>>>>>    drivers/net/ice/ice_ethdev.c | 11 +++++++++++
> > >>>>>>>>>    1 file changed, 11 insertions(+)
> > >>>>>>>>>
> > >>>>>>>>> diff --git a/drivers/net/ice/ice_ethdev.c
> > >>>>>>>>> b/drivers/net/ice/ice_ethdev.c index
> > >>>>>>>>> cfd357b05..6b7098444 100644
> > >>>>>>>>> --- a/drivers/net/ice/ice_ethdev.c
> > >>>>>>>>> +++ b/drivers/net/ice/ice_ethdev.c
> > >>>>>>>>> @@ -3146,6 +3146,7 @@ ice_dev_configure(struct rte_eth_dev
> > >>> *dev)
> > >>>>>>>>> struct ice_adapter *ad =
> > >>>>>>>>> ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
> > >>>>>>>>>    struct ice_pf *pf =
> > >>>>>>>>> ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> > >>>>>>>>> +uint32_t frame_size = dev->data->mtu + ICE_ETH_OVERHEAD;
> > >>>>>>>>>    int ret;
> > >>>>>>>>>
> > >>>>>>>>>    /* Initialize to TRUE. If any of Rx queues doesn't meet the @@
> > >>>>>>>>> -3157,6
> > >>>>>>>>> +3158,16 @@ ice_dev_configure(struct rte_eth_dev *dev)
> > >>>>>>>>>    if (dev->data->dev_conf.rxmode.mq_mode &
> > >>> ETH_MQ_RX_RSS_FLAG)
> > >>>>>>>>> dev->data->dev_conf.rxmode.offloads |=
> > >>>>>> DEV_RX_OFFLOAD_RSS_HASH;
> > >>>>>>>>>
> > >>>>>>>>> +/**
> > >>>>>>>>> + * Considering QinQ packet, max frame size should be equal or
> > >>>>>>>>> + * larger than total size of MTU and Ether overhead.
> > >>>>>>>>> + */
> > >>>>>>>>
> > >>>>>>>>> +if (frame_size > dev->data->dev_conf.rxmode.max_rx_pkt_len) {
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Why we need this check?
> > >>>>>>>> Can we just call ice_mtu_set directly
> > >>>>>>>
> > >>>>>>> I think that without that check we can silently overwrite provided
> > >>>>>>> by user dev_conf.rxmode.max_rx_pkt_len value.
> > >>>>>>
> > >>>>>> OK, I see
> > >>>>>>
> > >>>>>> But still have one question
> > >>>>>> dev->data->mtu is initialized to 1518 as default , but if
> > >>>>>> dev->data->application set
> > >>>>>> dev_conf.rxmode.max_rx_pkt_len = 1000 in dev_configure.
> > >>>>>> does that mean we will still will set mtu to 1518, is this expected?
> > >>>>>>
> > >>>>>
> > >>>>> max_rx_pkt_len should be larger than mtu at least, so we should raise
> > >>>>> the max_rx_pkt_len (e.g.:1518) to hold expected mtu value (e.g.: 1500).
> > >>>>
> > >>>> Ok, this describe the problem more general and better to replace exist
> > >>> code comment and commit log for easy understanding.
> > >>>> Please send a new version for reword
> > >>>>
> > >>>
> > >>> I didn't really get this set.
> > >>>
> > >>> Application explicitly sets 'max_rx_pkt_len' to '1518', and a frame bigger than
> > >>> this size is dropped.
> > >>
> > >> Sure, it is normal case for dropping oversize data.
> > >>
> > >>> Isn't this what should be, why we are trying to overwrite user configuration
> > >>> in PMD to prevent this?
> > >>>
> > >>
> > >> But it is a confliction that application/user sets mtu & max_rx_pkt_len at the same time.
> > >> This fix will make a decision when confliction occurred.
> > >> MTU value will come from user operation (e.g.: port config mtu 0 1500) directly,
> > >> so, the max_rx_pkt_len will resize itself to adapt expected MTU value if its size is smaller than MTU + Ether overhead.
> > >>
> > >>> During eth_dev allocation, mtu set to default '1500', by ethdev layer.
> > >>> And testpmd sets 'max_rx_pkt_len' by default to '1518'.
> > >>> I think Qi's concern above is valid, what is user set 'max_rx_pkt_len' to '1000'
> > >>> and mean it? PMD will not honor the user config.
> > >>
> > >> I'm not sure when set 'mtu' to '1500' and 'max_rx_pkt_len' to '1000', what's the behavior expected?
> > >> If still keep the 'max_rx_pkt_len' value, that means the larger 'mtu' will be invalid.
> > >>
> > >>>
> > >>> Why not simply increase the default 'max_rx_pkt_len' in testpmd?
> > >>>
> > >> The default 'max_rx_pkt_len' has been initialized to generical value (1518) and default 'mtu' is '1500' in testpmd,
> > >> But it isn't suitable to those NIC drivers which Ether overhead is larger than 18. (e.g.: ice, i40e) if 'mtu' value is preferable.
> > >>
> > >>> And I guess even better what we need is to tell to the application what the
> > >>> frame overhead PMD accepts.
> > >>> So the application can set proper 'max_rx_pkt_len' value per port for a
> > >>> given/requested MTU value.
> > >>> @Ian, cc'ed, was complaining almost same thing years ago, these PMD
> > >>> overhead macros and 'max_mtu'/'min_mtu' added because of that, perhaps
> > >>> he has a solution now?
> > >
> > >  From my perspective the main problem here:
> > > We have 2 different variables for nearly the same thing:
> > > rte_eth_dev_data.mtu and rte_eth_dev_data.dev_conf.max_rx_pkt_len.
> > > and 2 different API to update them: dev_mtu_set() and dev_configure().
> >
> > According API 'max_rx_pkt_len' is 'Only used if JUMBO_FRAME enabled'
> > Although not sure that is practically what is done for all drivers.
> 
> I think most of Intel PMDs use it unconditionally.
> 
> >
> > > And inside majority of Intel PMDs we don't keep these 2 variables in sync:
> > > - mtu_set() will update both variables.
> > > - dev_configure() will update only max_rx_pkt_len, but will keep mtu intact.
> > >
> > > This patch fixes this inconsistency, which I think is a good thing.
> > > Though yes, it introduces change in behaviour.
> > >
> > > Let say the code:
> > > rte_eth_dev_set_mtu(port, 1500);
> > > dev_conf.max_rx_pkt_len = 1000;
> > > rte_eth_dev_configure(port, 1, 1, &dev_conf);
> > >
> >
> > 'rte_eth_dev_configure()' is one of the first APIs called, it is called before
> > 'rte_eth_dev_set_mtu().
> 
> Usually yes.
> But you can still do sometimes later: dev_mtu_set(); ...; dev_stop(); dev_configure(); dev_start();
> 
> >
> > When 'rte_eth_dev_configure()' is called, MTU is set to '1500' by default by
> > ethdev layer, so it is not user configuration, but 'max_rx_pkt_len' is.
> 
> See above.
> PMD doesn't know where this MTU value came from (default ethdev value or user specified value)
> and probably it shouldn't care.
> 
> >
> > And later, when 'rte_eth_dev_set_mtu()' is called, but MTU and 'max_rx_pkt_len'
> > are updated (mostly).
> 
> Yes, in mtu_set() we update both.
> But we don't update MTU in dev_configure(), only max_rx_pkt_len.
> That what this patch tries to fix (as I understand it).

To be more precise - it doesn't change MTU value in dev_configure(),
but instead doesn't allow max_rx_pkt_len to become smaller
then MTU + OVERHEAD. 
Probably changing MTU value instead is a better choice.

> >
> >
> > > Before the patch will result:
> > > mtu==1500, max_rx_pkt_len=1000;  //out of sync looks wrong to me
> > >
> > > After the patch:
> > > mtu=1500, max_rx_ptk_len=1518; // in sync, change in behaviour.
> > >
> > > If you think we need to preserve current behaviour,
> > > then I suppose the easiest thing would be to change dev_config() code
> > > to update mtu value based on max_rx_pkt_len.
> > > I.E: dev_configure {...; mtu_set(max_rx_pkt_len - OVERHEAD); ...}
> > > So the code snippet above will result:
> > > mtu=982,max_rx_pkt_len=1000;
> > >
> >
> > The 'max_rx_ptk_len' is annoyance for a long time, what do you think to just
> > drop it?
> >
> > By default device will be up with default MTU (1500), later
> > 'rte_eth_dev_set_mtu' can be used to set the MTU, no frame size setting at all.
> >
> > Will this work?
> 
> I think it might, but that's a big change, probably too risky at that stage...
> 
> 
> >
> >
> > And for short term, for above Intel PMDs, there must be a place this
> > 'max_rx_pkt_len' value taken into account (mostly 'start()' dev_ops), that
> > function can be updated to take 'max_rx_pkt_len' only if JUMBO_FRAME set,
> > otherwise use the 'MTU' value.
> 
> Even if we'll use max_rx_pkt_len only when if JUMBO_FRAME is set,
> I think we still need to keep max_rx_pkt_len and MTU values in sync.
> 
> >
> > Without 'start()' updated the current logic won't work after stop & start anyway.
> >
> >
> > >
> > >
> > >
> > >>
> > >>>
> > >>> And why this same thing can't happen to other PMDs? If this is a problem for
> > >>> all PMDs, we should solve in other level, not for only some PMDs.
> > >>>
> > >> No, all PMDs exist the same issue, another proposal:
> > >>   -  rte_ethdev provides the unique resize 'max_rx_pkt_len' in rte_eth_dev_configure();
> > >>   - provide the uniform API for fetching the NIC's supported Ether Overhead size;
> > >> Is it feasible?
> > >>
> > >>>>
> > >>>>> Generally, the mtu value can be adjustable from user (e.g.: ip link
> > >>>>> set ens801f0 mtu 1400), hence, we just adjust the max_rx_pkt_len to
> > >>>>> satisfy mtu requirement.
> > >>>>>
> > >>>>>> Should we just call ice_mtu_set(dev, dev_conf.rxmode.max_rx_pkt_len)
> > >>>>>> here?
> > >>>>> ice_mtu_set(dev, mtu) will append ether overhead to
> > >>>>> frame_size/max_rx_pkt_len, so we need pass the mtu value as the 2nd
> > >>>>> parameter, or not the max_rx_pkt_len.
> > >>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>>
> > >>>>>>>> And please remove above comment, since ether overhead is already
> > >>>>>>> considered in ice_mtu_set.
> > >>>>> Ether overhead is already considered in ice_mtu_set, but it also
> > >>>>> should be considered as the adjustment condition that if ice_mtu_set
> > >>> need be invoked.
> > >>>>> So, it perhaps should remain this comment before this if() condition.
> > >>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>> +ret = ice_mtu_set(dev, dev->data->mtu); if (ret != 0) return
> > >>>>>>>>> +ret; }
> > >>>>>>>>> +
> > >>>>>>>>>    ret = ice_init_rss(pf);
> > >>>>>>>>>    if (ret) {
> > >>>>>>>>>    PMD_DRV_LOG(ERR, "Failed to enable rss for PF");
> > >>>>>>>>> --
> > >>>>>>>>> 2.17.1
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >



More information about the dev mailing list