[dpdk-dev] [PATCH 1/4] ethdev: fix max Rx packet length

Andrew Rybchenko andrew.rybchenko at oktetlabs.ru
Thu Jul 22 12:15:04 CEST 2021


On 7/22/21 1:12 PM, Ferruh Yigit wrote:
> On 7/22/2021 8:21 AM, Huisong Li wrote:
>>
>> 在 2021/7/21 23:29, Ferruh Yigit 写道:
>>> On 7/19/2021 4:35 AM, Huisong Li wrote:
>>>> Hi, Ferruh
>>>>
>>> Hi Huisong,
>>>
>>> Thanks for the review.
>>>
>>>> 在 2021/7/10 1:29, Ferruh Yigit 写道:
>>>>> There is a confusion on setting max Rx packet length, this patch aims to
>>>>> clarify it.
>>>>>
>>>>> 'rte_eth_dev_configure()' API accepts max Rx packet size via
>>>>> 'uint32_t max_rx_pkt_len' filed of the config struct 'struct
>>>>> rte_eth_conf'.
>>>>>
>>>>> Also 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and result
>>>>> stored into '(struct rte_eth_dev)->data->mtu'.
>>>>>
>>>>> These two APIs are related but they work in a disconnected way, they
>>>>> store the set values in different variables which makes hard to figure
>>>>> out which one to use, also two different related method is confusing for
>>>>> the users.
>>>>>
>>>>> Other issues causing confusion is:
>>>>> * maximum transmission unit (MTU) is payload of the Ethernet frame. And
>>>>>      'max_rx_pkt_len' is the size of the Ethernet frame. Difference is
>>>>>      Ethernet frame overhead, but this may be different from device to
>>>>>      device based on what device supports, like VLAN and QinQ.
>>>>> * 'max_rx_pkt_len' is only valid when application requested jumbo frame,
>>>>>      which adds additional confusion and some APIs and PMDs already
>>>>>      discards this documented behavior.
>>>>> * For the jumbo frame enabled case, 'max_rx_pkt_len' is an mandatory
>>>>>      field, this adds configuration complexity for application.
>>>>>
>>>>> As solution, both APIs gets MTU as parameter, and both saves the result
>>>>> in same variable '(struct rte_eth_dev)->data->mtu'. For this
>>>>> 'max_rx_pkt_len' updated as 'mtu', and it is always valid independent
>>>>> from jumbo frame.
>>>>>
>>>>> For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is user
>>>>> request and it should be used only within configure function and result
>>>>> should be stored to '(struct rte_eth_dev)->data->mtu'. After that point
>>>>> both application and PMD uses MTU from this variable.
>>>>>
>>>>> When application doesn't provide an MTU during 'rte_eth_dev_configure()'
>>>>> default 'RTE_ETHER_MTU' value is used.
>>>>>
>>>>> As additional clarification, MTU is used to configure the device for
>>>>> physical Rx/Tx limitation. Other related issue is size of the buffer to
>>>>> store Rx packets, many PMDs use mbuf data buffer size as Rx buffer size.
>>>>> And compares MTU against Rx buffer size to decide enabling scattered Rx
>>>>> or not, if PMD supports it. If scattered Rx is not supported by device,
>>>>> MTU bigger than Rx buffer size should fail.
>>>>>
>>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit at intel.com>
>>> <...>
>>>
>>>>> diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
>>>>> index e51512560e15..8bccdeddb2f7 100644
>>>>> --- a/drivers/net/hns3/hns3_ethdev.c
>>>>> +++ b/drivers/net/hns3/hns3_ethdev.c
>>>>> @@ -2379,20 +2379,11 @@ hns3_refresh_mtu(struct rte_eth_dev *dev, struct
>>>>> rte_eth_conf *conf)
>>>>>     {
>>>>>         struct hns3_adapter *hns = dev->data->dev_private;
>>>>>         struct hns3_hw *hw = &hns->hw;
>>>>> -    uint32_t max_rx_pkt_len;
>>>>> -    uint16_t mtu;
>>>>> -    int ret;
>>>>> -
>>>>> -    if (!(conf->rxmode.offloads & DEV_RX_OFFLOAD_JUMBO_FRAME))
>>>>> -        return 0;
>>>>> +    uint32_t max_rx_pktlen;
>>>>>     -    /*
>>>>> -     * If jumbo frames are enabled, MTU needs to be refreshed
>>>>> -     * according to the maximum RX packet length.
>>>>> -     */
>>>>> -    max_rx_pkt_len = conf->rxmode.max_rx_pkt_len;
>>>>> -    if (max_rx_pkt_len > HNS3_MAX_FRAME_LEN ||
>>>>> -        max_rx_pkt_len <= HNS3_DEFAULT_FRAME_LEN) {
>>>>> +    max_rx_pktlen = conf->rxmode.mtu + HNS3_ETH_OVERHEAD;
>>>>> +    if (max_rx_pktlen > HNS3_MAX_FRAME_LEN ||
>>>>> +        max_rx_pktlen <= HNS3_DEFAULT_FRAME_LEN) {
>>>>>             hns3_err(hw, "maximum Rx packet length must be greater than %u "
>>>>>                  "and no more than %u when jumbo frame enabled.",
>>>>>                  (uint16_t)HNS3_DEFAULT_FRAME_LEN,
>>>> The preceding check for the maximum frame length was based on the scenario where
>>>> jumbo frames are enabled.
>>>>
>>>> Since there is no offload of jumbo frames in this patchset, the maximum frame
>>>> length does not need to be checked and only ensure conf->rxmode.mtu is valid.
>>>>
>>>> These should be guaranteed by dev_configure() in the framework .
>>>>
>>> Got it, agree that 'HNS3_DEFAULT_FRAME_LEN' check is now wrong, and as you said
>>> these checks are becoming redundant, so I will remove them.
>>>
>>> In that case 'hns3_refresh_mtu()' becomes just wrapper to 'hns3_dev_mtu_set()',
>>> I will remove function too.
>>>
>>> <...>
>> ok
>>>
>>>>> diff --git a/drivers/net/hns3/hns3_ethdev_vf.c
>>>>> b/drivers/net/hns3/hns3_ethdev_vf.c
>>>>> index e582503f529b..ca839fa55fa0 100644
>>>>> --- a/drivers/net/hns3/hns3_ethdev_vf.c
>>>>> +++ b/drivers/net/hns3/hns3_ethdev_vf.c
>>>>> @@ -784,8 +784,7 @@ hns3vf_dev_configure(struct rte_eth_dev *dev)
>>>>>         uint16_t nb_rx_q = dev->data->nb_rx_queues;
>>>>>         uint16_t nb_tx_q = dev->data->nb_tx_queues;
>>>>>         struct rte_eth_rss_conf rss_conf;
>>>>> -    uint32_t max_rx_pkt_len;
>>>>> -    uint16_t mtu;
>>>>> +    uint32_t max_rx_pktlen;
>>>>>         bool gro_en;
>>>>>         int ret;
>>>>>     @@ -825,29 +824,21 @@ hns3vf_dev_configure(struct rte_eth_dev *dev)
>>>>>                 goto cfg_err;
>>>>>         }
>>>>>     -    /*
>>>>> -     * If jumbo frames are enabled, MTU needs to be refreshed
>>>>> -     * according to the maximum RX packet length.
>>>>> -     */
>>>>> -    if (conf->rxmode.offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) {
>>>>> -        max_rx_pkt_len = conf->rxmode.max_rx_pkt_len;
>>>>> -        if (max_rx_pkt_len > HNS3_MAX_FRAME_LEN ||
>>>>> -            max_rx_pkt_len <= HNS3_DEFAULT_FRAME_LEN) {
>>>>> -            hns3_err(hw, "maximum Rx packet length must be greater "
>>>>> -                 "than %u and less than %u when jumbo frame enabled.",
>>>>> -                 (uint16_t)HNS3_DEFAULT_FRAME_LEN,
>>>>> -                 (uint16_t)HNS3_MAX_FRAME_LEN);
>>>>> -            ret = -EINVAL;
>>>>> -            goto cfg_err;
>>>>> -        }
>>>>> -
>>>>> -        mtu = (uint16_t)HNS3_PKTLEN_TO_MTU(max_rx_pkt_len);
>>>>> -        ret = hns3vf_dev_mtu_set(dev, mtu);
>>>>> -        if (ret)
>>>>> -            goto cfg_err;
>>>>> -        dev->data->mtu = mtu;
>>>>> +    max_rx_pktlen = conf->rxmode.mtu + HNS3_ETH_OVERHEAD;
>>>>> +    if (max_rx_pktlen > HNS3_MAX_FRAME_LEN ||
>>>>> +        max_rx_pktlen <= HNS3_DEFAULT_FRAME_LEN) {
>>>>> +        hns3_err(hw, "maximum Rx packet length must be greater "
>>>>> +             "than %u and less than %u when jumbo frame enabled.",
>>>>> +             (uint16_t)HNS3_DEFAULT_FRAME_LEN,
>>>>> +             (uint16_t)HNS3_MAX_FRAME_LEN);
>>>>> +        ret = -EINVAL;
>>>>> +        goto cfg_err;
>>>>>         }
>>>> Please remove this check now, thanks!
>>> ack
>>>
>>> <...>
>>>
>>>>> diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
>>>>> index ce8882a45883..f868e5d906c7 100644
>>>>> --- a/examples/ip_reassembly/main.c
>>>>> +++ b/examples/ip_reassembly/main.c
>>>>> @@ -162,7 +162,7 @@ static struct lcore_queue_conf
>>>>> lcore_queue_conf[RTE_MAX_LCORE];
>>>>>     static struct rte_eth_conf port_conf = {
>>>>>         .rxmode = {
>>>>>             .mq_mode        = ETH_MQ_RX_RSS,
>>>>> -        .max_rx_pkt_len = JUMBO_FRAME_MAX_SIZE,
>>>>> +        .mtu = JUMBO_FRAME_MAX_SIZE,
>>>> It feel likes that the replacement of max_rx_pkt_len with MTU is inappropriate.
>>>>
>>>> Because "max_rx_pkt_len " is the sum of  "mtu" and "overhead_len".
>>> You are right, it is not same thing. I will update it to remove overhead.
>>>
>>> <...>
>>>
>>>>> @@ -1448,49 +1459,45 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t
>>>>> nb_rx_q, uint16_t nb_tx_q,
>>>>>         }
>>>>>           /*
>>>>> -     * If jumbo frames are enabled, check that the maximum RX packet
>>>>> -     * length is supported by the configured device.
>>>>> +     * Check that the maximum RX packet length is supported by the
>>>>> +     * configured device.
>>>>>          */
>>>>> -    if (dev_conf->rxmode.offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) {
>>>>> -        if (dev_conf->rxmode.max_rx_pkt_len > dev_info.max_rx_pktlen) {
>>>>> -            RTE_ETHDEV_LOG(ERR,
>>>>> -                "Ethdev port_id=%u max_rx_pkt_len %u > max valid value %u\n",
>>>>> -                port_id, dev_conf->rxmode.max_rx_pkt_len,
>>>>> -                dev_info.max_rx_pktlen);
>>>>> -            ret = -EINVAL;
>>>>> -            goto rollback;
>>>>> -        } else if (dev_conf->rxmode.max_rx_pkt_len < RTE_ETHER_MIN_LEN) {
>>>>> -            RTE_ETHDEV_LOG(ERR,
>>>>> -                "Ethdev port_id=%u max_rx_pkt_len %u < min valid value %u\n",
>>>>> -                port_id, dev_conf->rxmode.max_rx_pkt_len,
>>>>> -                (unsigned int)RTE_ETHER_MIN_LEN);
>>>>> -            ret = -EINVAL;
>>>>> -            goto rollback;
>>>>> -        }
>>>>> +    if (dev_conf->rxmode.mtu == 0)
>>>>> +        dev->data->dev_conf.rxmode.mtu = RTE_ETHER_MTU;
>>>> Here, it will cause a case that the user configuration is inconsistent with the
>>>> configuration saved in the framework .
>>> What is the framework you mentioned?
>>>
>>> Previously 'max_rx_pkt_len' was mandatory when jumbo frame is configured even
>>> user doesn't really case about it and it was causing additional complexity in
>>> the configuration.
>>> This check is required to use defaults when application doesn't need a specific
>>> value, I believe this is a good usability improvement.
>>> Application who cares about a specific value can set it explicitly and it will
>>> be in sync with application.
>>>
>>>> Is it more reasonable to provide a prompt message?
>>> Not sure about it. We are not changing a user configured value, but using
>>> default value when application doesn't set it, and that kind of log will be
>>> printed by most of the applications, this may cause noise.
>> This is a good reason.
>>>
>>>>> +    max_rx_pktlen = dev->data->dev_conf.rxmode.mtu + overhead_len;
>>>>> +    if (max_rx_pktlen > dev_info.max_rx_pktlen) {
>>>>> +        RTE_ETHDEV_LOG(ERR,
>>>>> +            "Ethdev port_id=%u max_rx_pktlen %u > max valid value %u\n",
>>>>> +            port_id, max_rx_pktlen, dev_info.max_rx_pktlen);
>>>>> +        ret = -EINVAL;
>>>>> +        goto rollback;
>>>>> +    } else if (max_rx_pktlen < RTE_ETHER_MIN_LEN) {
>>>>> +        RTE_ETHDEV_LOG(ERR,
>>>>> +            "Ethdev port_id=%u max_rx_pktlen %u < min valid value %u\n",
>>>>> +            port_id, max_rx_pktlen, RTE_ETHER_MIN_LEN);
>>>>> +        ret = -EINVAL;
>>>>> +        goto rollback;
>>>>> +    }
>>>> Above "max_rx_pktlen < RTE_ETHER_MIN_LEN " case will be  inconsistent with
>>>> dev_set_mtu() API.
>>>>
>>>> The reasons are as follows:
>>>>
>>>> The value of RTE_ETHER_MIN_LEN is 64. If "overhead_len" is 26 caculated by
>>>> eth_dev_get_overhead_len(), it means
>>>>
>>>> that dev->data->dev_conf.rxmode.mtu equal to 38 is reasonable.
>>>>
>>>> But, in  dev_set_mtu() API, the check for mtu is:
>>>>
>>>> @@ -3643,12 +3644,27 @@ rte_eth_dev_set_mtu(uint16_t port_id, uint16_t mtu)
>>>>              if (mtu < dev_info.min_mtu || mtu > dev_info.max_mtu)
>>>>                return -EINVAL;
>>>>
>>>> It should be noted that dev_info.min_mtu is RTE_ETHER_MIN_MTU (68).
>>>>
>>> Agree on the inconsistency.
>>>
>>> RTE_ETHER_MIN_MTU is 68, that is min MTU for IPv4
>>> RTE_ETHER_MIN_LEN is 64, and min MTU for Ethernet frame
>>>
>>> Although we are talking about MTU, we are mainly concerned about Ethernet frame
>>> payload, not IPv4.
>>>
>>> I suggest only using RTE_ETHER_MIN_LEN.
>>> Since this inconsistency was already there before this patch, I will update it
>>> in seperate patch instead of fixing in this one.
>>> .
>>
>> Got it. Since the MTU value depends on the type of transmission link. Why does
>> we define
>>
>> a minimum MTU?
>>
> 
> I don't think we care about type of transmission in this level, I assume we
> define min MTU mainly for the HW limitation and configuration. That is why it
> makes sense to me to use Ethernet frame lenght limitation (not IPv4 one).

+1

>> True, we don't have to break the current restrictions in this patch. But it is
>> an indirect
>>
>> check on the MTU. Now that "mtu" in Rxmode is an entry for configuring MTU for
>> driver,
>>
>> I prefer to keep the same MTU check in ethdev layer. If there's a better way to
>> handle it,
>>
>> perhaps it would be more appropriate to do it in this patchset.
>>
>> I'd like to know how you're going to adjust。
>>
> 
> I am planning to move the MTU checks into common function and use it for both
> 'rte_eth_dev_configure()' & 'rte_eth_dev_set_mtu()' in a seperate patch. Please
> check v2.
> 



More information about the dev mailing list