[dpdk-dev] [PATCH v4 1/6] ethdev: fix max Rx packet length
    Ferruh Yigit 
    ferruh.yigit at intel.com
       
    Mon Oct 11 23:59:34 CEST 2021
    
    
  
On 10/10/2021 7:30 AM, Matan Azrad wrote:
> 
> Hi Ferruh
> 
> From: Ferruh Yigit
>> There is confusion about setting the max Rx packet length; this patch aims to
>> clarify it.
>>
>> 'rte_eth_dev_configure()' API accepts max Rx packet size via
>> 'uint32_t max_rx_pkt_len' field of the config struct 'struct
>> rte_eth_conf'.
>>
>> The 'rte_eth_dev_set_mtu()' API can also be used to set the MTU, and the
>> result is stored in '(struct rte_eth_dev)->data->mtu'.
>>
>> These two APIs are related but work in a disconnected way: they store the
>> set values in different variables, which makes it hard to figure out which
>> one to use. Having two different methods for related functionality is also
>> confusing for users.
>>
>> Other issues causing confusion are:
>> * The maximum transmission unit (MTU) is the payload of the Ethernet frame,
>>    and 'max_rx_pkt_len' is the size of the Ethernet frame. The difference is
>>    the Ethernet frame overhead, and this overhead may differ from device to
>>    device based on what the device supports, like VLAN and QinQ.
>> * 'max_rx_pkt_len' is only valid when the application requested jumbo frame,
>>    which adds confusion, and some APIs and PMDs already disregard this
>>    documented behavior.
>> * For the jumbo frame enabled case, 'max_rx_pkt_len' is a mandatory
>>    field, which adds configuration complexity for the application.
>>
>> As a solution, both APIs take the MTU as a parameter, and both save the
>> result in the same variable, '(struct rte_eth_dev)->data->mtu'. For this,
>> 'max_rx_pkt_len' is changed to 'mtu', and it is always valid, independent
>> of jumbo frame.
>>
>> For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is the user
>> request; it should be used only within the configure function, and the
>> result should be stored in '(struct rte_eth_dev)->data->mtu'. After that
>> point, both the application and the PMD use the MTU from this variable.
>>
>> When the application doesn't provide an MTU during 'rte_eth_dev_configure()',
>> the default 'RTE_ETHER_MTU' value is used.
>>
>> Additional clarification is done on the scattered Rx configuration, in
>> relation to the MTU and the Rx buffer size.
>> The MTU is used to configure the device's physical Rx/Tx size limitation;
>> the Rx buffer is where Rx packets are stored, and many PMDs use the mbuf
>> data buffer size as the Rx buffer size.
>> PMDs compare the MTU against the Rx buffer size to decide whether to enable
>> scattered Rx. If scattered Rx is not supported by the device, an MTU bigger
>> than the Rx buffer size should fail.
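
To make the MTU / frame length / Rx buffer relation above concrete, here is a
minimal sketch, not taken from any PMD. The macros are local stand-ins that
mirror the usual DPDK values (14-byte Ethernet header, 4-byte CRC, 128-byte
default headroom), and the helper names 'frame_len_from_mtu()' and
'needs_scatter()' are hypothetical. It assumes a fixed header + CRC overhead,
as the mlx4 hunk below does; a device supporting VLAN or QinQ may use a larger
overhead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for RTE_ETHER_HDR_LEN, RTE_ETHER_CRC_LEN and
 * RTE_PKTMBUF_HEADROOM (assumed values, the headroom is build-time
 * configurable in DPDK). */
#define ETHER_HDR_LEN    14u
#define ETHER_CRC_LEN    4u
#define PKTMBUF_HEADROOM 128u

/* Hypothetical helper: Ethernet frame length for a given MTU, assuming
 * only header + CRC overhead (no VLAN/QinQ tags). */
static uint32_t
frame_len_from_mtu(uint16_t mtu)
{
	return (uint32_t)mtu + ETHER_HDR_LEN + ETHER_CRC_LEN;
}

/* Hypothetical helper: true when a max-size frame does not fit into the
 * usable part of a single mbuf, so scattered Rx would be needed. */
static bool
needs_scatter(uint16_t mtu, uint32_t mb_len)
{
	/* Same precondition as the MLX4_ASSERT() in the mlx4 hunk. */
	assert(mb_len >= PKTMBUF_HEADROOM);
	return frame_len_from_mtu(mtu) > mb_len - PKTMBUF_HEADROOM;
}
```

With these assumed values, an MTU of 1500 gives an 1518-byte frame that fits
a 2176-byte mbuf (2048 usable bytes), while an MTU of 9000 does not.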
> 
> Should it be compared also against max_lro_pkt_size for the SCATTER enabling by the PMD?
> 
I kept the LRO-related code the same. The Rx packet length change patch has
already become complex, so LRO-related changes can be done later instead of
making this set more confusing.
It would be great if you and Dekel could work on it, as you introduced
'max_lro_pkt_size' in ethdev.
> What do you think about enabling SCATTER by the API instead of making the comparison in each PMD?
> 
I am not sure we can do that. As far as I can see, there is no enforcement of
the Rx buffer size; the PMDs select it.
>> Signed-off-by: Ferruh Yigit <ferruh.yigit at intel.com>
> 
> <snip>
> 
> Please see more below regarding SCATTER.
>   
>> diff --git a/drivers/net/mlx4/mlx4_rxq.c b/drivers/net/mlx4/mlx4_rxq.c
>> index 978cbb8201ea..4a5cfd22aa71 100644
>> --- a/drivers/net/mlx4/mlx4_rxq.c
>> +++ b/drivers/net/mlx4/mlx4_rxq.c
>> @@ -753,6 +753,7 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev,
>> uint16_t idx, uint16_t desc,
>>          int ret;
>>          uint32_t crc_present;
>>          uint64_t offloads;
>> +       uint32_t max_rx_pktlen;
>>
>>          offloads = conf->offloads | dev->data->dev_conf.rxmode.offloads;
>>
>> @@ -828,13 +829,11 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev,
>> uint16_t idx, uint16_t desc,
>>          };
>>          /* Enable scattered packets support for this queue if necessary. */
>>          MLX4_ASSERT(mb_len >= RTE_PKTMBUF_HEADROOM);
>> -       if (dev->data->dev_conf.rxmode.max_rx_pkt_len <=
>> -           (mb_len - RTE_PKTMBUF_HEADROOM)) {
>> +       max_rx_pktlen = dev->data->mtu + RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN;
>> +       if (max_rx_pktlen <= (mb_len - RTE_PKTMBUF_HEADROOM)) {
>>                  ;
>>          } else if (offloads & DEV_RX_OFFLOAD_SCATTER) {
>> -               uint32_t size =
>> -                       RTE_PKTMBUF_HEADROOM +
>> -                       dev->data->dev_conf.rxmode.max_rx_pkt_len;
>> +               uint32_t size = RTE_PKTMBUF_HEADROOM + max_rx_pktlen;
>>                  uint32_t sges_n;
>>
>>                  /*
>> @@ -846,21 +845,19 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev,
>> uint16_t idx, uint16_t desc,
>>                  /* Make sure sges_n did not overflow. */
>>                  size = mb_len * (1 << rxq->sges_n);
>>                  size -= RTE_PKTMBUF_HEADROOM;
>> -               if (size < dev->data->dev_conf.rxmode.max_rx_pkt_len) {
>> +               if (size < max_rx_pktlen) {
>>                          rte_errno = EOVERFLOW;
>>                          ERROR("%p: too many SGEs (%u) needed to handle"
>>                                " requested maximum packet size %u",
>>                                (void *)dev,
>> -                             1 << sges_n,
>> -                             dev->data->dev_conf.rxmode.max_rx_pkt_len);
>> +                             1 << sges_n, max_rx_pktlen);
>>                          goto error;
>>                  }
>>          } else {
>>                  WARN("%p: the requested maximum Rx packet size (%u) is"
>>                       " larger than a single mbuf (%u) and scattered"
>>                       " mode has not been requested",
>> -                    (void *)dev,
>> -                    dev->data->dev_conf.rxmode.max_rx_pkt_len,
>> +                    (void *)dev, max_rx_pktlen,
>>                       mb_len - RTE_PKTMBUF_HEADROOM);
>>          }
> 
> If, by definition, SCATTER should be enabled implicitly by the PMD according to the comparison you wrote above, maybe this check for SCATTER offload is not needed.
> 
This behavior is not documented and not clear; some PMDs enable scattered Rx
implicitly, some don't.
It looks like we need a clarification patch for scattered Rx too.
For this patch I added scatter-related info to the commit log to clarify the
reasoning behind the change. PMD behavior is not changed.
> Also, the SCATTER offload documentation could state precisely which parameters are used for the comparison, and that the offload is a capability only, with no need to configure it.
> 
We have the same question for a few other offloads: should we take the user
configuration strictly and fail, or should we adjust the config to the
requested values?
For example, if the PMD supports scattered Rx and the requested MTU is bigger
than the Rx buffer size, should the PMD enable scattered Rx itself or fail?
We should first clarify this, and later fix the documentation and drivers in
a separate patch.
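
The two options in that question can be written down as a small decision
function. This is only a sketch of the open question, not what ethdev or any
PMD does today; 'enum scatter_policy' and 'resolve_scatter()' are hypothetical
names, and the "device does not support scatter, so fail" branch follows the
commit log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policies for the question under discussion. */
enum scatter_policy {
	POLICY_STRICT,   /* honor user config as-is, fail if it cannot work */
	POLICY_IMPLICIT, /* enable scattered Rx when the frame does not fit */
};

/* Hypothetical helper: decide whether scattered Rx ends up enabled.
 * Returns 0 and sets *scatter_on on success, -1 when the configuration
 * must be rejected. */
static int
resolve_scatter(enum scatter_policy policy, bool dev_supports,
		bool user_requested, uint32_t frame_len,
		uint32_t rx_buf_size, bool *scatter_on)
{
	assert(scatter_on != NULL);
	*scatter_on = false;

	/* Frame fits in one Rx buffer: scatter is not needed. */
	if (frame_len <= rx_buf_size)
		return 0;
	/* Frame does not fit and the device cannot scatter: always fail. */
	if (!dev_supports)
		return -1;
	/* User asked for the offload, or the PMD may enable it itself. */
	if (user_requested || policy == POLICY_IMPLICIT) {
		*scatter_on = true;
		return 0;
	}
	/* Strict policy and the offload was not requested. */
	return -1;
}
```

The only case where the two policies differ is a frame larger than the Rx
buffer on a scatter-capable device when the application did not request the
offload, which is the case that needs to be clarified.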
> Also, for the case of multi RX mempools configuration, it can be implicitly understood by the PMDs to enable SCATTER and no need to check that in PMD/API.
> 
Yes, multi Rx mempools is something else to take into account for the scattered
Rx config.
> What do you think?
> 
>>          DEBUG("%p: maximum number of segments per packet: %u",
>> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
>> index abd8ce798986..6f4f351222d3 100644
>> --- a/drivers/net/mlx5/mlx5_rxq.c
>> +++ b/drivers/net/mlx5/mlx5_rxq.c
>> @@ -1330,10 +1330,11 @@ mlx5_rxq_new(struct rte_eth_dev *dev,
>> uint16_t idx, uint16_t desc,
>>          uint64_t offloads = conf->offloads |
>>                             dev->data->dev_conf.rxmode.offloads;
>>          unsigned int lro_on_queue = !!(offloads & DEV_RX_OFFLOAD_TCP_LRO);
>> -       unsigned int max_rx_pkt_len = lro_on_queue ?
>> +       unsigned int max_rx_pktlen = lro_on_queue ?
>>                          dev->data->dev_conf.rxmode.max_lro_pkt_size :
>> -                       dev->data->dev_conf.rxmode.max_rx_pkt_len;
>> -       unsigned int non_scatter_min_mbuf_size = max_rx_pkt_len +
>> +                       dev->data->mtu + (unsigned int)RTE_ETHER_HDR_LEN +
>> +                               RTE_ETHER_CRC_LEN;
>> +       unsigned int non_scatter_min_mbuf_size = max_rx_pktlen +
>>                                                          RTE_PKTMBUF_HEADROOM;
>>          unsigned int max_lro_size = 0;
>>          unsigned int first_mb_free_size = mb_len - RTE_PKTMBUF_HEADROOM;
>> @@ -1372,7 +1373,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t
>> idx, uint16_t desc,
>>           * needed to handle max size packets, replace zero length
>>           * with the buffer length from the pool.
>>           */
>> -       tail_len = max_rx_pkt_len;
>> +       tail_len = max_rx_pktlen;
>>          do {
>>                  struct mlx5_eth_rxseg *hw_seg =
>>                                          &tmpl->rxq.rxseg[tmpl->rxq.rxseg_n];
>> @@ -1410,7 +1411,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t
>> idx, uint16_t desc,
>>                                  "port %u too many SGEs (%u) needed to handle"
>>                                  " requested maximum packet size %u, the maximum"
>>                                  " supported are %u", dev->data->port_id,
>> -                               tmpl->rxq.rxseg_n, max_rx_pkt_len,
>> +                               tmpl->rxq.rxseg_n, max_rx_pktlen,
>>                                  MLX5_MAX_RXQ_NSEG);
>>                          rte_errno = ENOTSUP;
>>                          goto error;
>> @@ -1435,7 +1436,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t
>> idx, uint16_t desc,
>>                  DRV_LOG(ERR, "port %u Rx queue %u: Scatter offload is not"
>>                          " configured and no enough mbuf space(%u) to contain "
>>                          "the maximum RX packet length(%u) with head-room(%u)",
>> -                       dev->data->port_id, idx, mb_len, max_rx_pkt_len,
>> +                       dev->data->port_id, idx, mb_len, max_rx_pktlen,
>>                          RTE_PKTMBUF_HEADROOM);
>>                  rte_errno = ENOSPC;
>>                  goto error;
> 
> Also, here for the SCATTER check. Here, it is even an error.
> 
>> @@ -1454,7 +1455,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t
>> idx, uint16_t desc,
>>           * following conditions are met:
>>           *  - MPRQ is enabled.
>>           *  - The number of descs is more than the number of strides.
>> -        *  - max_rx_pkt_len plus overhead is less than the max size
>> +        *  - max_rx_pktlen plus overhead is less than the max size
>>           *    of a stride or mprq_stride_size is specified by a user.
>>           *    Need to make sure that there are enough strides to encap
>>           *    the maximum packet size in case mprq_stride_size is set.
>> @@ -1478,7 +1479,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t
>> idx, uint16_t desc,
>>                                  !!(offloads & DEV_RX_OFFLOAD_SCATTER);
>>                  tmpl->rxq.mprq_max_memcpy_len = RTE_MIN(first_mb_free_size,
>>                                  config->mprq.max_memcpy_len);
>> -               max_lro_size = RTE_MIN(max_rx_pkt_len,
>> +               max_lro_size = RTE_MIN(max_rx_pktlen,
>>                                         (1u << tmpl->rxq.strd_num_n) *
>>                                         (1u << tmpl->rxq.strd_sz_n));
>>                  DRV_LOG(DEBUG,
>> @@ -1487,9 +1488,9 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t
>> idx, uint16_t desc,
>>                          dev->data->port_id, idx,
>>                          tmpl->rxq.strd_num_n, tmpl->rxq.strd_sz_n);
>>          } else if (tmpl->rxq.rxseg_n == 1) {
>> -               MLX5_ASSERT(max_rx_pkt_len <= first_mb_free_size);
>> +               MLX5_ASSERT(max_rx_pktlen <= first_mb_free_size);
>>                  tmpl->rxq.sges_n = 0;
>> -               max_lro_size = max_rx_pkt_len;
>> +               max_lro_size = max_rx_pktlen;
>>          } else if (offloads & DEV_RX_OFFLOAD_SCATTER) {
>>                  unsigned int sges_n;
>>
>> @@ -1511,13 +1512,13 @@ mlx5_rxq_new(struct rte_eth_dev *dev,
>> uint16_t idx, uint16_t desc,
>>                                  "port %u too many SGEs (%u) needed to handle"
>>                                  " requested maximum packet size %u, the maximum"
>>                                  " supported are %u", dev->data->port_id,
>> -                               1 << sges_n, max_rx_pkt_len,
>> +                               1 << sges_n, max_rx_pktlen,
>>                                  1u << MLX5_MAX_LOG_RQ_SEGS);
>>                          rte_errno = ENOTSUP;
>>                          goto error;
>>                  }
>>                  tmpl->rxq.sges_n = sges_n;
>> -               max_lro_size = max_rx_pkt_len;
>> +               max_lro_size = max_rx_pktlen;
>>          }
>>          if (config->mprq.enabled && !mlx5_rxq_mprq_enabled(&tmpl->rxq))
>>                  DRV_LOG(WARNING,
> 
> <snip>
> 
    
    