[dpdk-dev] [PATCH v4 2/8] net/mlx5: add Tx datapath related devargs

Yongseok Koh yskoh at mellanox.com
Mon Jul 22 07:32:46 CEST 2019


> On Jul 21, 2019, at 7:24 AM, Viacheslav Ovsiienko <viacheslavo at mellanox.com> wrote:
> 
> This patch introduces new mlx5 PMD devarg options:
> 
> - txq_inline_min - specifies the minimal amount of data to be inlined into
>  a WQE during Tx operations. NICs may require this minimal data amount
>  to operate correctly. The exact value may depend on the NIC operation mode,
>  requested offloads, etc.
> 
> - txq_inline_max - specifies the maximal packet length to be completely
>  inlined into the WQE Ethernet Segment for the ordinary SEND method. If a
>  packet is larger than the specified value, the packet data is not copied by
>  the driver at all and the data buffer is referenced with a pointer. If the
>  packet length is less than or equal, all packet data is copied into the WQE.
> 
> - txq_inline_mpw - specifies the maximal packet length to be completely
>  inlined into the WQE for the Enhanced MPW method.
> 
> Driver documentation is also updated.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo at mellanox.com>
> ---

Acked-by: Yongseok Koh <yskoh at mellanox.com>

> doc/guides/nics/mlx5.rst               | 155 +++++++++++++++++++++++----------
> doc/guides/rel_notes/release_19_08.rst |   2 +
> drivers/net/mlx5/mlx5.c                |  29 +++++-
> drivers/net/mlx5/mlx5.h                |   4 +
> 4 files changed, 140 insertions(+), 50 deletions(-)
> 
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index 5cf1e76..7e87344 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -351,24 +351,102 @@ Run-time configuration
> - ``txq_inline`` parameter [int]
> 
>   Amount of data to be inlined during TX operations. This parameter is
> -  deprecated and ignored, kept for compatibility issue.
> +  deprecated and converted to the new parameter ``txq_inline_max``, providing
> +  partial compatibility.
> 
> - ``txqs_min_inline`` parameter [int]
> 
> -  Enable inline send only when the number of TX queues is greater or equal
> +  Enable inline data send only when the number of TX queues is greater or equal
>   to this value.
> 
> -  This option should be used in combination with ``txq_inline`` above.
> -
> -  On ConnectX-4, ConnectX-4 LX, ConnectX-5, ConnectX-6 and BlueField without
> -  Enhanced MPW:
> -
> -        - Disabled by default.
> -        - In case ``txq_inline`` is set recommendation is 4.
> -
> -  On ConnectX-5, ConnectX-6 and BlueField with Enhanced MPW:
> -
> -        - Set to 8 by default.
> +  This option should be used in combination with ``txq_inline_max`` and
> +  ``txq_inline_mpw`` below and does not affect the ``txq_inline_min`` setting.
> +
> +  If this option is not specified, the default value 16 is used for BlueField
> +  and 8 for other platforms.
> +
> +  Data inlining consumes CPU cycles, so this option is intended to enable
> +  inlining automatically when there are enough Tx queues, which means there
> +  are enough CPU cores, PCI bandwidth is more likely to become the bottleneck,
> +  and the CPU is not supposed to be the bottleneck anymore.
> +
> +  Copying data into the WQE improves latency and can improve PPS performance
> +  when PCI back pressure is detected, and may be useful for scenarios involving
> +  heavy traffic on many queues.
> +
> +  Because additional software logic is necessary to handle this mode, this
> +  option should be used with care, as it may lower performance when back
> +  pressure is not expected.
> +
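
[Illustration, not part of the patch: a rough sketch in C of the queue-count
gate described above. The function and names are hypothetical, not the actual
driver code.]

    #include <stdbool.h>

    /* Defaults quoted from the text above: 16 for BlueField, 8 otherwise. */
    #define TXQS_MIN_INLINE_BLUEFIELD 16
    #define TXQS_MIN_INLINE_DEFAULT    8

    /* Data inlining is engaged only when the port has enough Tx queues. */
    static bool
    tx_inline_engaged(unsigned int nb_txqs, int txqs_min_inline, bool bluefield)
    {
            if (txqs_min_inline < 0) /* devarg not specified, take the default */
                    txqs_min_inline = bluefield ? TXQS_MIN_INLINE_BLUEFIELD :
                                                  TXQS_MIN_INLINE_DEFAULT;
            return nb_txqs >= (unsigned int)txqs_min_inline;
    }
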
> +- ``txq_inline_min`` parameter [int]
> +
> +  Minimal amount of data to be inlined into WQE during Tx operations. NICs
> +  may require this minimal data amount to operate correctly. The exact value
> +  may depend on NIC operation mode, requested offloads, etc.
> +
> +  If the ``txq_inline_min`` key is present, the specified value (which may be
> +  aligned by the driver in order not to exceed the limits and to provide better
> +  descriptor space utilization) is used by the driver, and it is guaranteed that
> +  the requested amount of data bytes is inlined into the WQE regardless of other
> +  inline settings. This key may also update the ``txq_inline_max`` value (default
> +  or explicitly specified in devargs) to reserve space for the inline data.
> +
> +  If the ``txq_inline_min`` key is not present, the value may be queried by the
> +  driver from the NIC via DevX if this feature is available. If DevX is not
> +  enabled/supported, the value 18 (assuming an L2 header including VLAN) is set
> +  for ConnectX-4, the value 58 (assuming L2-L4 headers, required by configurations
> +  over E-Switch) is set for ConnectX-4 Lx, and 0 is set by default for ConnectX-5
> +  and newer NICs. If a packet is shorter than the ``txq_inline_min`` value, the
> +  entire packet is inlined.
> +
> +  For ConnectX-4 and ConnectX-4 Lx NICs the driver does not allow setting
> +  this value below 18 (minimal L2 header, including VLAN).
> +
> +  Please note that this minimal data inlining disengages the eMPW feature
> +  (Enhanced Multi-Packet Write), because the latter does not support partial
> +  packet inlining. This is not very critical since minimal data inlining is
> +  mostly required by ConnectX-4 and ConnectX-4 Lx, which do not support eMPW.
> +
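
[Illustration, not part of the patch: the per-NIC fallback values listed above,
condensed into a hypothetical helper; the enum and the function are made up.]

    /* Hypothetical NIC families, only to mirror the defaults quoted above. */
    enum nic_family { NIC_CONNECTX4, NIC_CONNECTX4LX, NIC_CONNECTX5_OR_NEWER };

    /* Fallback txq_inline_min when the devarg is absent and DevX cannot report it. */
    static int
    default_txq_inline_min(enum nic_family nic)
    {
            switch (nic) {
            case NIC_CONNECTX4:
                    return 18; /* L2 header including VLAN */
            case NIC_CONNECTX4LX:
                    return 58; /* L2-L4 headers, needed for E-Switch configurations */
            default:
                    return 0;  /* ConnectX-5 and newer need no minimal inlining */
            }
    }
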
> +- ``txq_inline_max`` parameter [int]
> +
> +  Specifies the maximal packet length to be completely inlined into the WQE
> +  Ethernet Segment for the ordinary SEND method. If a packet is larger than the
> +  specified value, the packet data is not copied by the driver at all and the
> +  data buffer is referenced with a pointer. If the packet length is less than or
> +  equal, all packet data is copied into the WQE. This may improve PCI bandwidth
> +  utilization for short packets significantly but requires extra CPU cycles.
> +
> +  The data inline feature is controlled by the number of Tx queues: if the number
> +  of Tx queues is larger than the ``txqs_min_inline`` key parameter, the inline
> +  feature is engaged; if there are not enough Tx queues (which means there are not
> +  enough CPU cores and CPU resources are scarce), data inlining is not performed
> +  by the driver. Setting ``txqs_min_inline`` to zero always enables data inlining.
> +
> +  The default ``txq_inline_max`` value is 290. The specified value may be adjusted
> +  by the driver in order not to exceed the limit (930 bytes) and to provide better
> +  WQE space filling without gaps; the adjustment is reflected in the debug log.
> +
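
[Illustration, not part of the patch: a rough sketch of the per-packet choice
described above for the ordinary SEND method; the same shape applies to eMPW
with txq_inline_mpw. Names are hypothetical, the thresholds follow the text.]

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Packets up to txq_inline_max bytes are copied into the WQE Ethernet
     * Segment; longer packets are referenced by pointer only, with no copy.
     * Inlining is considered only when enough Tx queues are configured
     * (see txqs_min_inline above).
     */
    static bool
    send_inline_packet(uint32_t pkt_len, int txq_inline_max, bool inline_engaged)
    {
            if (!inline_engaged)
                    return false;
            return pkt_len <= (uint32_t)txq_inline_max;
    }
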
> +- ``txq_inline_mpw`` parameter [int]
> +
> +  Specifies the maximal packet length to be completely inlined into the WQE for
> +  the Enhanced MPW method. If a packet is larger than the specified value, the
> +  packet data is not copied, and the data buffer is referenced with a pointer. If
> +  the packet length is less than or equal, all packet data is copied into the WQE.
> +  This may improve PCI bandwidth utilization for short packets significantly but
> +  requires extra CPU cycles.
> +
> +  The data inline feature is controlled by the number of Tx queues: if the number
> +  of Tx queues is larger than the ``txqs_min_inline`` key parameter, the inline
> +  feature is engaged; if there are not enough Tx queues (which means there are not
> +  enough CPU cores and CPU resources are scarce), data inlining is not performed
> +  by the driver. Setting ``txqs_min_inline`` to zero always enables data inlining.
> +
> +  The default ``txq_inline_mpw`` value is 188. The specified value may be adjusted
> +  by the driver in order not to exceed the limit (930 bytes) and to provide better
> +  WQE space filling without gaps; the adjustment is reflected in the debug log.
> +  Since multiple packets may be included in the same WQE with the Enhanced
> +  Multi-Packet Write method and the overall WQE size is limited, it is not
> +  recommended to specify large values for ``txq_inline_mpw``.
> 
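
[Illustration, not part of the patch: one way these keys could be passed as
device arguments from an application; the PCI address and the values are
placeholders, adjust them for the actual setup.]

    #include <rte_eal.h>

    int
    main(int argc, char **argv)
    {
            /* Whitelist the port and hand the Tx inline devargs to the PMD. */
            char *eal_argv[] = {
                    argv[0], "-w",
                    "0000:03:00.0,txqs_min_inline=4,txq_inline_min=0,"
                    "txq_inline_max=256,txq_inline_mpw=128",
            };
            int eal_argc = (int)(sizeof(eal_argv) / sizeof(eal_argv[0]));

            (void)argc;
            return rte_eal_init(eal_argc, eal_argv) < 0 ? -1 : 0;
    }

The same device argument string can be given to testpmd through the EAL -w
option.
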
> - ``txqs_max_vec`` parameter [int]
> 
> @@ -376,47 +454,34 @@ Run-time configuration
>   equal to this value. This parameter is deprecated and ignored, kept
>   for compatibility issue to not prevent driver from probing.
> 
> -- ``txq_mpw_en`` parameter [int]
> -
> -  A nonzero value enables multi-packet send (MPS) for ConnectX-4 Lx and
> -  enhanced multi-packet send (Enhanced MPS) for ConnectX-5, ConnectX-6 and BlueField.
> -  MPS allows the TX burst function to pack up multiple packets in a
> -  single descriptor session in order to save PCI bandwidth and improve
> -  performance at the cost of a slightly higher CPU usage. When
> -  ``txq_inline`` is set along with ``txq_mpw_en``, TX burst function tries
> -  to copy entire packet data on to TX descriptor instead of including
> -  pointer of packet only if there is enough room remained in the
> -  descriptor. ``txq_inline`` sets per-descriptor space for either pointers
> -  or inlined packets. In addition, Enhanced MPS supports hybrid mode -
> -  mixing inlined packets and pointers in the same descriptor.
> -
> -  This option cannot be used with certain offloads such as ``DEV_TX_OFFLOAD_TCP_TSO,
> -  DEV_TX_OFFLOAD_VXLAN_TNL_TSO, DEV_TX_OFFLOAD_GRE_TNL_TSO, DEV_TX_OFFLOAD_VLAN_INSERT``.
> -  When those offloads are requested the MPS send function will not be used.
> -
> -  It is currently only supported on the ConnectX-4 Lx, ConnectX-5, ConnectX-6 and BlueField
> -  families of adapters.
> -  On ConnectX-4 Lx the MPW is considered un-secure hence disabled by default.
> -  Users which enable the MPW should be aware that application which provides incorrect
> -  mbuf descriptors in the Tx burst can lead to serious errors in the host including, on some cases,
> -  NIC to get stuck.
> -  On ConnectX-5, ConnectX-6 and BlueField the MPW is secure and enabled by default.
> -
> - ``txq_mpw_hdr_dseg_en`` parameter [int]
> 
>   A nonzero value enables including two pointers in the first block of TX
>   descriptor. The parameter is deprecated and ignored, kept for compatibility
>   issue.
> 
> -  Effective only when Enhanced MPS is supported. Disabled by default.
> -
> - ``txq_max_inline_len`` parameter [int]
> 
>   Maximum size of packet to be inlined. This limits the size of packet to
>   be inlined. If the size of a packet is larger than configured value, the
>   packet isn't inlined even though there's enough space remained in the
>   descriptor. Instead, the packet is included with pointer. This parameter
> -  is deprecated.
> +  is deprecated and converted directly to ``txq_inline_mpw``, providing full
> +  compatibility. Valid only if the eMPW feature is engaged.
> +
> +- ``txq_mpw_en`` parameter [int]
> +
> +  A nonzero value enables Enhanced Multi-Packet Write (eMPW) for ConnectX-5,
> +  ConnectX-6 and BlueField. eMPW allows the TX burst function to pack multiple
> +  packets into a single descriptor session in order to save PCI bandwidth and
> +  improve performance at the cost of slightly higher CPU usage. When
> +  ``txq_inline_mpw`` is set along with ``txq_mpw_en``, the TX burst function
> +  copies entire packet data onto the TX descriptor instead of only a pointer.
> +
> +  The Enhanced Multi-Packet Write feature is enabled by default if the NIC
> +  supports it and can be disabled by explicitly specifying 0 for the
> +  ``txq_mpw_en`` option. Also, if minimal data inlining is requested by a
> +  non-zero ``txq_inline_min`` option or reported by the NIC, eMPW is disengaged.
> 
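
[Illustration, not part of the patch: the conditions above for eMPW being
engaged, condensed into one hypothetical predicate.]

    #include <stdbool.h>

    /*
     * eMPW is used only when the NIC supports it, it is not explicitly
     * disabled with txq_mpw_en=0, and no minimal inline data is required
     * (txq_inline_min neither set non-zero in devargs nor reported by the NIC).
     */
    static bool
    empw_engaged(bool nic_supports_empw, int txq_mpw_en, int txq_inline_min)
    {
            return nic_supports_empw && txq_mpw_en != 0 && txq_inline_min <= 0;
    }
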
> - ``tx_vec_en`` parameter [int]
> 
> @@ -424,12 +489,6 @@ Run-time configuration
>   NICs if the number of global Tx queues on the port is less than
>   ``txqs_max_vec``. The parameter is deprecated and ignored.
> 
> -  This option cannot be used with certain offloads such as ``DEV_TX_OFFLOAD_TCP_TSO,
> -  DEV_TX_OFFLOAD_VXLAN_TNL_TSO, DEV_TX_OFFLOAD_GRE_TNL_TSO, DEV_TX_OFFLOAD_VLAN_INSERT``.
> -  When those offloads are requested the MPS send function will not be used.
> -
> -  Enabled by default on ConnectX-5, ConnectX-6 and BlueField.
> -
> - ``rx_vec_en`` parameter [int]
> 
>   A nonzero value enables Rx vector if the port is not configured in
> diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
> index 1bf9eb8..6c382cb 100644
> --- a/doc/guides/rel_notes/release_19_08.rst
> +++ b/doc/guides/rel_notes/release_19_08.rst
> @@ -116,6 +116,8 @@ New Features
>   * Added support for IP-in-IP tunnel.
>   * Accelerate flows with count action creation and destroy.
>   * Accelerate flows counter query.
> +  * Improved Tx datapath performance with enabled HW offloads.
> +
> 
> * **Updated Solarflare network PMD.**
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index d4f0eb2..bbf2583 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -72,6 +72,15 @@
> /* Device parameter to configure inline send. Deprecated, ignored.*/
> #define MLX5_TXQ_INLINE "txq_inline"
> 
> +/* Device parameter to limit packet size to inline with ordinary SEND. */
> +#define MLX5_TXQ_INLINE_MAX "txq_inline_max"
> +
> +/* Device parameter to configure minimal data size to inline. */
> +#define MLX5_TXQ_INLINE_MIN "txq_inline_min"
> +
> +/* Device parameter to limit packet size to inline with Enhanced MPW. */
> +#define MLX5_TXQ_INLINE_MPW "txq_inline_mpw"
> +
> /*
>  * Device parameter to configure the number of TX queues threshold for
>  * enabling inline send.
> @@ -1006,7 +1015,15 @@ struct mlx5_dev_spawn_data {
> 	} else if (strcmp(MLX5_RXQS_MIN_MPRQ, key) == 0) {
> 		config->mprq.min_rxqs_num = tmp;
> 	} else if (strcmp(MLX5_TXQ_INLINE, key) == 0) {
> -		DRV_LOG(WARNING, "%s: deprecated parameter, ignored", key);
> +		DRV_LOG(WARNING, "%s: deprecated parameter,"
> +				 " converted to txq_inline_max", key);
> +		config->txq_inline_max = tmp;
> +	} else if (strcmp(MLX5_TXQ_INLINE_MAX, key) == 0) {
> +		config->txq_inline_max = tmp;
> +	} else if (strcmp(MLX5_TXQ_INLINE_MIN, key) == 0) {
> +		config->txq_inline_min = tmp;
> +	} else if (strcmp(MLX5_TXQ_INLINE_MPW, key) == 0) {
> +		config->txq_inline_mpw = tmp;
> 	} else if (strcmp(MLX5_TXQS_MIN_INLINE, key) == 0) {
> 		config->txqs_inline = tmp;
> 	} else if (strcmp(MLX5_TXQS_MAX_VEC, key) == 0) {
> @@ -1016,7 +1033,9 @@ struct mlx5_dev_spawn_data {
> 	} else if (strcmp(MLX5_TXQ_MPW_HDR_DSEG_EN, key) == 0) {
> 		DRV_LOG(WARNING, "%s: deprecated parameter, ignored", key);
> 	} else if (strcmp(MLX5_TXQ_MAX_INLINE_LEN, key) == 0) {
> -		DRV_LOG(WARNING, "%s: deprecated parameter, ignored", key);
> +		DRV_LOG(WARNING, "%s: deprecated parameter,"
> +				 " converted to txq_inline_mpw", key);
> +		config->txq_inline_mpw = tmp;
> 	} else if (strcmp(MLX5_TX_VEC_EN, key) == 0) {
> 		DRV_LOG(WARNING, "%s: deprecated parameter, ignored", key);
> 	} else if (strcmp(MLX5_RX_VEC_EN, key) == 0) {
> @@ -1064,6 +1083,9 @@ struct mlx5_dev_spawn_data {
> 		MLX5_RX_MPRQ_MAX_MEMCPY_LEN,
> 		MLX5_RXQS_MIN_MPRQ,
> 		MLX5_TXQ_INLINE,
> +		MLX5_TXQ_INLINE_MIN,
> +		MLX5_TXQ_INLINE_MAX,
> +		MLX5_TXQ_INLINE_MPW,
> 		MLX5_TXQS_MIN_INLINE,
> 		MLX5_TXQS_MAX_VEC,
> 		MLX5_TXQ_MPW_EN,
> @@ -2026,6 +2048,9 @@ struct mlx5_dev_spawn_data {
> 		.hw_padding = 0,
> 		.mps = MLX5_ARG_UNSET,
> 		.rx_vec_en = 1,
> +		.txq_inline_max = MLX5_ARG_UNSET,
> +		.txq_inline_min = MLX5_ARG_UNSET,
> +		.txq_inline_mpw = MLX5_ARG_UNSET,
> 		.txqs_inline = MLX5_ARG_UNSET,
> 		.vf_nl_en = 1,
> 		.mr_ext_memseg_en = 1,
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index 354f6bc..86f005d 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -198,6 +198,7 @@ struct mlx5_dev_config {
> 	unsigned int cqe_comp:1; /* CQE compression is enabled. */
> 	unsigned int cqe_pad:1; /* CQE padding is enabled. */
> 	unsigned int tso:1; /* Whether TSO is supported. */
> +	unsigned int tx_inline:1; /* Engage TX data inlining. */
> 	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
> 	unsigned int mr_ext_memseg_en:1;
> 	/* Whether memseg should be extended for MR creation. */
> @@ -223,6 +224,9 @@ struct mlx5_dev_config {
> 	unsigned int ind_table_max_size; /* Maximum indirection table size. */
> 	unsigned int max_dump_files_num; /* Maximum dump files per queue. */
> 	int txqs_inline; /* Queue number threshold for inlining. */
> +	int txq_inline_min; /* Minimal amount of data bytes to inline. */
> +	int txq_inline_max; /* Max packet size for inlining with SEND. */
> +	int txq_inline_mpw; /* Max packet size for inlining with eMPW. */
> 	struct mlx5_hca_attr hca_attr; /* HCA attributes. */
> };
> 
> -- 
> 1.8.3.1
> 


