[dpdk-dev] [PATCH 1/4] ethdev: fix max Rx packet length

Ferruh Yigit ferruh.yigit at intel.com
Thu Jul 22 12:12:10 CEST 2021


On 7/22/2021 8:21 AM, Huisong Li wrote:
> 
> 在 2021/7/21 23:29, Ferruh Yigit 写道:
>> On 7/19/2021 4:35 AM, Huisong Li wrote:
>>> Hi, Ferruh
>>>
>> Hi Huisong,
>>
>> Thanks for the review.
>>
>>> 在 2021/7/10 1:29, Ferruh Yigit 写道:
>>>> There is a confusion on setting max Rx packet length, this patch aims to
>>>> clarify it.
>>>>
>>>> 'rte_eth_dev_configure()' API accepts max Rx packet size via
>>>> 'uint32_t max_rx_pkt_len' filed of the config struct 'struct
>>>> rte_eth_conf'.
>>>>
>>>> Also 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and result
>>>> stored into '(struct rte_eth_dev)->data->mtu'.
>>>>
>>>> These two APIs are related but they work in a disconnected way, they
>>>> store the set values in different variables which makes hard to figure
>>>> out which one to use, also two different related method is confusing for
>>>> the users.
>>>>
>>>> Other issues causing confusion is:
>>>> * maximum transmission unit (MTU) is payload of the Ethernet frame. And
>>>>     'max_rx_pkt_len' is the size of the Ethernet frame. Difference is
>>>>     Ethernet frame overhead, but this may be different from device to
>>>>     device based on what device supports, like VLAN and QinQ.
>>>> * 'max_rx_pkt_len' is only valid when application requested jumbo frame,
>>>>     which adds additional confusion and some APIs and PMDs already
>>>>     discards this documented behavior.
>>>> * For the jumbo frame enabled case, 'max_rx_pkt_len' is an mandatory
>>>>     field, this adds configuration complexity for application.
>>>>
>>>> As solution, both APIs gets MTU as parameter, and both saves the result
>>>> in same variable '(struct rte_eth_dev)->data->mtu'. For this
>>>> 'max_rx_pkt_len' updated as 'mtu', and it is always valid independent
>>>> from jumbo frame.
>>>>
>>>> For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is user
>>>> request and it should be used only within configure function and result
>>>> should be stored to '(struct rte_eth_dev)->data->mtu'. After that point
>>>> both application and PMD uses MTU from this variable.
>>>>
>>>> When application doesn't provide an MTU during 'rte_eth_dev_configure()'
>>>> default 'RTE_ETHER_MTU' value is used.
>>>>
>>>> As additional clarification, MTU is used to configure the device for
>>>> physical Rx/Tx limitation. Other related issue is size of the buffer to
>>>> store Rx packets, many PMDs use mbuf data buffer size as Rx buffer size.
>>>> And compares MTU against Rx buffer size to decide enabling scattered Rx
>>>> or not, if PMD supports it. If scattered Rx is not supported by device,
>>>> MTU bigger than Rx buffer size should fail.
>>>>
>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit at intel.com>
>> <...>
>>
>>>> diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
>>>> index e51512560e15..8bccdeddb2f7 100644
>>>> --- a/drivers/net/hns3/hns3_ethdev.c
>>>> +++ b/drivers/net/hns3/hns3_ethdev.c
>>>> @@ -2379,20 +2379,11 @@ hns3_refresh_mtu(struct rte_eth_dev *dev, struct
>>>> rte_eth_conf *conf)
>>>>    {
>>>>        struct hns3_adapter *hns = dev->data->dev_private;
>>>>        struct hns3_hw *hw = &hns->hw;
>>>> -    uint32_t max_rx_pkt_len;
>>>> -    uint16_t mtu;
>>>> -    int ret;
>>>> -
>>>> -    if (!(conf->rxmode.offloads & DEV_RX_OFFLOAD_JUMBO_FRAME))
>>>> -        return 0;
>>>> +    uint32_t max_rx_pktlen;
>>>>    -    /*
>>>> -     * If jumbo frames are enabled, MTU needs to be refreshed
>>>> -     * according to the maximum RX packet length.
>>>> -     */
>>>> -    max_rx_pkt_len = conf->rxmode.max_rx_pkt_len;
>>>> -    if (max_rx_pkt_len > HNS3_MAX_FRAME_LEN ||
>>>> -        max_rx_pkt_len <= HNS3_DEFAULT_FRAME_LEN) {
>>>> +    max_rx_pktlen = conf->rxmode.mtu + HNS3_ETH_OVERHEAD;
>>>> +    if (max_rx_pktlen > HNS3_MAX_FRAME_LEN ||
>>>> +        max_rx_pktlen <= HNS3_DEFAULT_FRAME_LEN) {
>>>>            hns3_err(hw, "maximum Rx packet length must be greater than %u "
>>>>                 "and no more than %u when jumbo frame enabled.",
>>>>                 (uint16_t)HNS3_DEFAULT_FRAME_LEN,
>>> The preceding check for the maximum frame length was based on the scenario where
>>> jumbo frames are enabled.
>>>
>>> Since there is no offload of jumbo frames in this patchset, the maximum frame
>>> length does not need to be checked and only ensure conf->rxmode.mtu is valid.
>>>
>>> These should be guaranteed by dev_configure() in the framework .
>>>
>> Got it, agree that 'HNS3_DEFAULT_FRAME_LEN' check is now wrong, and as you said
>> these checks are becoming redundant, so I will remove them.
>>
>> In that case 'hns3_refresh_mtu()' becomes just wrapper to 'hns3_dev_mtu_set()',
>> I will remove function too.
>>
>> <...>
> ok
>>
>>>> diff --git a/drivers/net/hns3/hns3_ethdev_vf.c
>>>> b/drivers/net/hns3/hns3_ethdev_vf.c
>>>> index e582503f529b..ca839fa55fa0 100644
>>>> --- a/drivers/net/hns3/hns3_ethdev_vf.c
>>>> +++ b/drivers/net/hns3/hns3_ethdev_vf.c
>>>> @@ -784,8 +784,7 @@ hns3vf_dev_configure(struct rte_eth_dev *dev)
>>>>        uint16_t nb_rx_q = dev->data->nb_rx_queues;
>>>>        uint16_t nb_tx_q = dev->data->nb_tx_queues;
>>>>        struct rte_eth_rss_conf rss_conf;
>>>> -    uint32_t max_rx_pkt_len;
>>>> -    uint16_t mtu;
>>>> +    uint32_t max_rx_pktlen;
>>>>        bool gro_en;
>>>>        int ret;
>>>>    @@ -825,29 +824,21 @@ hns3vf_dev_configure(struct rte_eth_dev *dev)
>>>>                goto cfg_err;
>>>>        }
>>>>    -    /*
>>>> -     * If jumbo frames are enabled, MTU needs to be refreshed
>>>> -     * according to the maximum RX packet length.
>>>> -     */
>>>> -    if (conf->rxmode.offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) {
>>>> -        max_rx_pkt_len = conf->rxmode.max_rx_pkt_len;
>>>> -        if (max_rx_pkt_len > HNS3_MAX_FRAME_LEN ||
>>>> -            max_rx_pkt_len <= HNS3_DEFAULT_FRAME_LEN) {
>>>> -            hns3_err(hw, "maximum Rx packet length must be greater "
>>>> -                 "than %u and less than %u when jumbo frame enabled.",
>>>> -                 (uint16_t)HNS3_DEFAULT_FRAME_LEN,
>>>> -                 (uint16_t)HNS3_MAX_FRAME_LEN);
>>>> -            ret = -EINVAL;
>>>> -            goto cfg_err;
>>>> -        }
>>>> -
>>>> -        mtu = (uint16_t)HNS3_PKTLEN_TO_MTU(max_rx_pkt_len);
>>>> -        ret = hns3vf_dev_mtu_set(dev, mtu);
>>>> -        if (ret)
>>>> -            goto cfg_err;
>>>> -        dev->data->mtu = mtu;
>>>> +    max_rx_pktlen = conf->rxmode.mtu + HNS3_ETH_OVERHEAD;
>>>> +    if (max_rx_pktlen > HNS3_MAX_FRAME_LEN ||
>>>> +        max_rx_pktlen <= HNS3_DEFAULT_FRAME_LEN) {
>>>> +        hns3_err(hw, "maximum Rx packet length must be greater "
>>>> +             "than %u and less than %u when jumbo frame enabled.",
>>>> +             (uint16_t)HNS3_DEFAULT_FRAME_LEN,
>>>> +             (uint16_t)HNS3_MAX_FRAME_LEN);
>>>> +        ret = -EINVAL;
>>>> +        goto cfg_err;
>>>>        }
>>> Please remove this check now, thanks!
>> ack
>>
>> <...>
>>
>>>> diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
>>>> index ce8882a45883..f868e5d906c7 100644
>>>> --- a/examples/ip_reassembly/main.c
>>>> +++ b/examples/ip_reassembly/main.c
>>>> @@ -162,7 +162,7 @@ static struct lcore_queue_conf
>>>> lcore_queue_conf[RTE_MAX_LCORE];
>>>>    static struct rte_eth_conf port_conf = {
>>>>        .rxmode = {
>>>>            .mq_mode        = ETH_MQ_RX_RSS,
>>>> -        .max_rx_pkt_len = JUMBO_FRAME_MAX_SIZE,
>>>> +        .mtu = JUMBO_FRAME_MAX_SIZE,
>>> It feel likes that the replacement of max_rx_pkt_len with MTU is inappropriate.
>>>
>>> Because "max_rx_pkt_len " is the sum of  "mtu" and "overhead_len".
>> You are right, it is not same thing. I will update it to remove overhead.
>>
>> <...>
>>
>>>> @@ -1448,49 +1459,45 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t
>>>> nb_rx_q, uint16_t nb_tx_q,
>>>>        }
>>>>          /*
>>>> -     * If jumbo frames are enabled, check that the maximum RX packet
>>>> -     * length is supported by the configured device.
>>>> +     * Check that the maximum RX packet length is supported by the
>>>> +     * configured device.
>>>>         */
>>>> -    if (dev_conf->rxmode.offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) {
>>>> -        if (dev_conf->rxmode.max_rx_pkt_len > dev_info.max_rx_pktlen) {
>>>> -            RTE_ETHDEV_LOG(ERR,
>>>> -                "Ethdev port_id=%u max_rx_pkt_len %u > max valid value %u\n",
>>>> -                port_id, dev_conf->rxmode.max_rx_pkt_len,
>>>> -                dev_info.max_rx_pktlen);
>>>> -            ret = -EINVAL;
>>>> -            goto rollback;
>>>> -        } else if (dev_conf->rxmode.max_rx_pkt_len < RTE_ETHER_MIN_LEN) {
>>>> -            RTE_ETHDEV_LOG(ERR,
>>>> -                "Ethdev port_id=%u max_rx_pkt_len %u < min valid value %u\n",
>>>> -                port_id, dev_conf->rxmode.max_rx_pkt_len,
>>>> -                (unsigned int)RTE_ETHER_MIN_LEN);
>>>> -            ret = -EINVAL;
>>>> -            goto rollback;
>>>> -        }
>>>> +    if (dev_conf->rxmode.mtu == 0)
>>>> +        dev->data->dev_conf.rxmode.mtu = RTE_ETHER_MTU;
>>> Here, it will cause a case that the user configuration is inconsistent with the
>>> configuration saved in the framework .
>> What is the framework you mentioned?
>>
>> Previously 'max_rx_pkt_len' was mandatory when jumbo frame is configured even
>> user doesn't really case about it and it was causing additional complexity in
>> the configuration.
>> This check is required to use defaults when application doesn't need a specific
>> value, I believe this is a good usability improvement.
>> Application who cares about a specific value can set it explicitly and it will
>> be in sync with application.
>>
>>> Is it more reasonable to provide a prompt message?
>> Not sure about it. We are not changing a user configured value, but using
>> default value when application doesn't set it, and that kind of log will be
>> printed by most of the applications, this may cause noise.
> This is a good reason.
>>
>>>> +    max_rx_pktlen = dev->data->dev_conf.rxmode.mtu + overhead_len;
>>>> +    if (max_rx_pktlen > dev_info.max_rx_pktlen) {
>>>> +        RTE_ETHDEV_LOG(ERR,
>>>> +            "Ethdev port_id=%u max_rx_pktlen %u > max valid value %u\n",
>>>> +            port_id, max_rx_pktlen, dev_info.max_rx_pktlen);
>>>> +        ret = -EINVAL;
>>>> +        goto rollback;
>>>> +    } else if (max_rx_pktlen < RTE_ETHER_MIN_LEN) {
>>>> +        RTE_ETHDEV_LOG(ERR,
>>>> +            "Ethdev port_id=%u max_rx_pktlen %u < min valid value %u\n",
>>>> +            port_id, max_rx_pktlen, RTE_ETHER_MIN_LEN);
>>>> +        ret = -EINVAL;
>>>> +        goto rollback;
>>>> +    }
>>> Above "max_rx_pktlen < RTE_ETHER_MIN_LEN " case will be  inconsistent with
>>> dev_set_mtu() API.
>>>
>>> The reasons are as follows:
>>>
>>> The value of RTE_ETHER_MIN_LEN is 64. If "overhead_len" is 26 caculated by
>>> eth_dev_get_overhead_len(), it means
>>>
>>> that dev->data->dev_conf.rxmode.mtu equal to 38 is reasonable.
>>>
>>> But, in  dev_set_mtu() API, the check for mtu is:
>>>
>>> @@ -3643,12 +3644,27 @@ rte_eth_dev_set_mtu(uint16_t port_id, uint16_t mtu)
>>>             if (mtu < dev_info.min_mtu || mtu > dev_info.max_mtu)
>>>               return -EINVAL;
>>>
>>> It should be noted that dev_info.min_mtu is RTE_ETHER_MIN_MTU (68).
>>>
>> Agree on the inconsistency.
>>
>> RTE_ETHER_MIN_MTU is 68, that is min MTU for IPv4
>> RTE_ETHER_MIN_LEN is 64, and min MTU for Ethernet frame
>>
>> Although we are talking about MTU, we are mainly concerned about Ethernet frame
>> payload, not IPv4.
>>
>> I suggest only using RTE_ETHER_MIN_LEN.
>> Since this inconsistency was already there before this patch, I will update it
>> in seperate patch instead of fixing in this one.
>> .
> 
> Got it. Since the MTU value depends on the type of transmission link. Why does
> we define
> 
> a minimum MTU?
> 

I don't think we care about type of transmission in this level, I assume we
define min MTU mainly for the HW limitation and configuration. That is why it
makes sense to me to use Ethernet frame lenght limitation (not IPv4 one).

> True, we don't have to break the current restrictions in this patch. But it is
> an indirect
> 
> check on the MTU. Now that "mtu" in Rxmode is an entry for configuring MTU for
> driver,
> 
> I prefer to keep the same MTU check in ethdev layer. If there's a better way to
> handle it,
> 
> perhaps it would be more appropriate to do it in this patchset.
> 
> I'd like to know how you're going to adjust。
> 

I am planning to move the MTU checks into common function and use it for both
'rte_eth_dev_configure()' & 'rte_eth_dev_set_mtu()' in a seperate patch. Please
check v2.



More information about the dev mailing list