[dpdk-dev] [PATCH v2 2/2] net/mlx5: enforce Tx num of segments limitation

Nélio Laranjeiro nelio.laranjeiro at 6wind.com
Mon Sep 4 16:57:04 CEST 2017


On Wed, Aug 30, 2017 at 10:07:08AM +0300, Shahaf Shuler wrote:
> Mellanox NICs have a limitation on the number of mbuf segments a multi
> segment mbuf can have. The max number depends on the Tx offloads requested.
> 
> The current code does not enforce such limitation, which might cause
> malformed work requests to be written to the device.
> 
> This commit adds verification for the number of mbuf segments posted
> to the device. In case of overflow the packet will not be sent.
> Debug prints were added to help the application identify the cause of
> such a case.
> 
> In addition, update the NIC documentation with the limitation.
> Considering device limitation is 63 data segments in a work request, the
> maximum number of segments in an mbuf was calculated taking TSO as the worst
> case:
> 
> max_nb_segs = 63 - (control_segment + ethernet segment +
> 		    TSO headers inline + inline segment +
> 		    extra inline to align to cacheline)
> 
> Cc: stable at dpdk.org
> 
> Signed-off-by: Shahaf Shuler <shahafs at mellanox.com>
> ---
> This patch should be applied only after the series:
> http://dpdk.org/dev/patchwork/patch/27367/
> 
> on v2:
>  - remove parenthesis around MLX5_MAX_DS.
>  - add limitation to nic guide.
>  - update commit message.
>  - fix typo.
> ---
>  doc/guides/nics/mlx5.rst             |  2 ++
>  drivers/net/mlx5/mlx5_defs.h         |  3 ++-
>  drivers/net/mlx5/mlx5_prm.h          |  3 +++
>  drivers/net/mlx5/mlx5_rxtx.c         | 30 +++++++++++++++++++++++++++---
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.c |  8 ++++++++
>  drivers/net/mlx5/mlx5_txq.c          | 27 +++++++++++++++++++++++++++
>  6 files changed, 69 insertions(+), 4 deletions(-)
> 
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index f4cb18bca..d8244de97 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -124,6 +124,8 @@ Limitations
>  
>    Will match any ipv4 packet (VLAN included).
>  
> +- A multi segment mbuf must have less than 50 segments. That means mbuf->nb_segs < 50.
> +
>  Configuration
>  -------------
>  
> diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
> index a76bc6f65..3de0e5d81 100644
> --- a/drivers/net/mlx5/mlx5_defs.h
> +++ b/drivers/net/mlx5/mlx5_defs.h
> @@ -100,7 +100,8 @@
>  
>  /*
>   * Maximum size of burst for vectorized Tx. This is related to the maximum size
> - * of Enhaned MPW (eMPW) WQE as vectorized Tx is supported with eMPW.
> + * of Enhanced MPW (eMPW) WQE as vectorized Tx is supported with eMPW.
> + * Careful when changing, large value can cause wqe DS to overlap.
>   */
>  #define MLX5_VPMD_TX_MAX_BURST        32U
>  
> diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
> index 608072f7e..bc2b72333 100644
> --- a/drivers/net/mlx5/mlx5_prm.h
> +++ b/drivers/net/mlx5/mlx5_prm.h
> @@ -154,6 +154,9 @@
>  /* Default mark value used when none is provided. */
>  #define MLX5_FLOW_MARK_DEFAULT 0xffffff
>  
> +/* Maximum number of DS in WQE. */
> +#define MLX5_MAX_DS 63
> +
>  /* Subset of struct mlx5_wqe_eth_seg. */
>  struct mlx5_wqe_eth_seg_small {
>  	uint32_t rsvd0;
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index fe9e7eac0..d7aa382f9 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -657,6 +657,15 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
>  		else
>  			j += sg;
>  next_pkt:
> +		if (ds > MLX5_MAX_DS) {
> +#ifndef NDEBUG
> +			WARN("Cannot send packet %p with %d segments "
> +			     "wqe.ds = %d, wqe.inline_sz = %d",
> +			     (void *)pkts, (*pkts)->nb_segs, ds,
> +			     pkt_inline_sz);
> +#endif

I will say no to this series of warnings in the datapath. I understand the
need for such information, but it will considerably reduce throughput in
some cases where the issue is not related to this.
The datapath must also be optimised in debug mode.

My suggestion would be to change the burst API to return an int with the
correct error values.
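
As a rough illustration only, such a burst function could look something like
the sketch below. Everything in it is hypothetical: sketch_tx_burst(),
struct sketch_pkt and SKETCH_MAX_DS are stand-ins invented for this example
(the limit mirrors MLX5_MAX_DS from the patch), and the choice of -E2BIG as
the error value is an assumption, not an existing DPDK convention.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration, not DPDK definitions. */
#define SKETCH_MAX_DS 63 /* mirrors MLX5_MAX_DS from the patch */

struct sketch_pkt {
	uint16_t ds; /* data segments the WQE for this packet would need */
};

/*
 * Return the number of packets accepted (>= 0). If the very first packet
 * exceeds the DS limit, nothing was accepted, so return -E2BIG instead of
 * 0 to tell the caller why. No logging happens in the loop.
 */
int
sketch_tx_burst(const struct sketch_pkt *pkts, uint16_t pkts_n)
{
	uint16_t i;

	for (i = 0; i < pkts_n; ++i) {
		if (pkts[i].ds > SKETCH_MAX_DS)
			return i ? (int)i : -E2BIG;
	}
	return (int)pkts_n;
}
```

With this shape the caller can tell "queue full, retry later" apart from
"this packet can never be sent" without any print in the datapath.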

For now, oerrors can be incremented in such a situation to provide more
information to the user.
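
A minimal sketch of that idea, under stated assumptions: the names below
(sketch_txq_stats, sketch_mbuf, SKETCH_DS_LIMIT, sketch_tx_burst_count) are
made up for the example and are not the mlx5 PMD structures, and stopping the
burst at the first oversized packet is a choice made for this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration, not the mlx5 PMD types. */
#define SKETCH_DS_LIMIT 63 /* mirrors MLX5_MAX_DS from the patch */

struct sketch_txq_stats {
	uint64_t opackets; /* packets successfully queued */
	uint64_t oerrors;  /* packets rejected by the driver */
};

struct sketch_mbuf {
	uint16_t ds; /* data segments the WQE for this packet would need */
};

/*
 * Queue packets until one exceeds the DS limit. The offending packet is
 * only counted in oerrors; no message is printed in the datapath.
 * Returns the number of packets queued, like the current burst API.
 */
uint16_t
sketch_tx_burst_count(struct sketch_txq_stats *stats,
		      const struct sketch_mbuf *pkts, uint16_t pkts_n)
{
	uint16_t i;

	for (i = 0; i < pkts_n; ++i) {
		if (pkts[i].ds > SKETCH_DS_LIMIT) {
			++stats->oerrors;
			break;
		}
	}
	stats->opackets += i;
	return i;
}
```

The application then sees a short return count together with a growing
oerrors counter in the port statistics, at the cost of a single increment
on the error path.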

Thanks, 

-- 
Nélio Laranjeiro
6WIND

