[dpdk-dev] [PATCH v2] doc: fix update release notes for Mellanox drivers

Shahaf Shuler shahafs at mellanox.com
Mon May 13 07:26:22 CEST 2019


Hi Ori,

See some comments below. 

Sunday, May 12, 2019 11:33 PM, Ori Kam:
> Subject: [dpdk-dev] [PATCH v2] doc: fix update release notes for Mellanox
> drivers
> 
> This patch adds some missing features to Mellanox drivers release notes.
> It also updates the mlx5/mlx4 documentations.
> 
> Fixes: d85b204b5dba ("doc: update release notes for Mellanox drivers")
> Cc: yskoh at mellanox.com
> 
> Signed-off-by: Ori Kam <orika at mellanox.com>
> 
> ---
> 
> v2:
>  * fix checkpatch warning.
> 
> ---
>  doc/guides/nics/mlx4.rst               |   2 +-
>  doc/guides/nics/mlx5.rst               | 150 ++++++++++++++++++++++-----------
>  doc/guides/rel_notes/release_19_05.rst |  10 ++-
>  3 files changed, 109 insertions(+), 53 deletions(-)
> 
> diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst index
> f6d7a16..5c6bbde 100644
> --- a/doc/guides/nics/mlx4.rst
> +++ b/doc/guides/nics/mlx4.rst
> @@ -253,7 +253,7 @@ thanks to these environment variables:
>  Mellanox OFED as a fallback
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> -- `Mellanox OFED`_ version: **4.4, 4.5**.
> +- `Mellanox OFED`_ version: **4.4, 4.5, 4.6**.
>  - firmware version: **2.42.5000** and above.
> 
>  .. _`Mellanox OFED`:
> http://www.mellanox.com/page/products_dyn?product_family=26&mtag=li
> nux_sw_drivers
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index
> 325e9f6..5cf7744 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -7,7 +7,7 @@ MLX5 poll mode driver
> 
>  The MLX5 poll mode driver library (**librte_pmd_mlx5**) provides support
> for **Mellanox ConnectX-4**, **Mellanox ConnectX-4 Lx** , **Mellanox -
> ConnectX-5**, **Mellanox ConnectX-6** and **Mellanox Bluefield**
> families
> +ConnectX-5**, **Mellanox ConnectX-6** and **Mellanox BlueField**
> +families
>  of 10/25/40/50/100/200 Gb/s adapters as well as their virtual functions (VF)
> in SR-IOV context.
> 
> @@ -62,8 +62,8 @@ Features
>  - RX VLAN stripping.
>  - TX VLAN insertion.
>  - RX CRC stripping configuration.
> -- Promiscuous mode.
> -- Multicast promiscuous mode.
> +- Promiscuous mode on PF and VF.
> +- Multicast promiscuous mode on PF and VF.
>  - Hardware checksum offloads.
>  - Flow director (RTE_FDIR_MODE_PERFECT,
> RTE_FDIR_MODE_PERFECT_MAC_VLAN and
>    RTE_ETH_FDIR_REJECT).
> @@ -78,6 +78,11 @@ Features
>  - Rx HW timestamp.
>  - Tunnel types: VXLAN, L3 VXLAN, VXLAN-GPE, GRE, MPLSoGRE, MPLSoUDP.
>  - Tunnel HW offloads: packet type, inner/outer RSS, IP and UDP checksum
> verification.
> +- Nic HW offloads: encapsulation (vxlan, gre, mplsoudp, mplsogre), NAT,
> +routing, TTL
> +  increment/decrement, count, drop, mark. For details please see
> :ref:`Supported hardware offloads using rte_flow API`.
> +- Flow insertion rate of more than a million flows per second. When using
> Direct Rules.

Replace '.' with ',' .

> +- Support groups.

Support for multiple rte_flow groups.

> +- Support raw encapsulation offload.

Can't this be folded into the encapsulation item above?

> 
>  Limitations
>  -----------
> @@ -112,8 +117,6 @@ Limitations
>    is set to multi-packet send or Enhanced multi-packet send. Otherwise it
> must have
>    less than 50 segments.
> 
> -- Count action for RTE flow is **only supported in Mellanox OFED**.
> -
>  - Flows with a VXLAN Network Identifier equal (or ends to be equal)
>    to 0 are not supported.
> 
> @@ -124,13 +127,6 @@ Limitations
>  - VF: flow rules created on VF devices can only match traffic targeted at the
>    configured MAC addresses (see ``rte_eth_dev_mac_addr_add()``).
> 
> -.. note::
> -
> -   MAC addresses not already present in the bridge table of the associated
> -   kernel network device will be added and cleaned up by the PMD when
> closing
> -   the device. In case of ungraceful program termination, some entries may
> -   remain present and should be removed manually by other means.
> -

This is still correct. Why was it removed?

>  - When Multi-Packet Rx queue is configured (``mprq_en``), a Rx packet can
> be
>    externally attached to a user-provided mbuf with having
> EXT_ATTACHED_MBUF in
>    ol_flags. As the mempool for the external buffer is managed by PMD, all
> the @@ -147,30 +143,18 @@ Limitations
>    To receive IPv6 Multicast messages on VM, explicitly set the relevant
>    MAC address using rte_eth_dev_mac_addr_add() API.
> 
> -- E-Switch VXLAN tunnel is not supported together with outer VLAN.
> -
> -- E-Switch Flows with VNI pattern must include the VXLAN decapsulation
> action.
> -
> -- E-Switch VXLAN decapsulation Flow:
> +- E-Switch decapsulation Flow:
> 
>    - can be applied to PF port only.
>    - must specify VF port action (packet redirection from PF to VF).
> -  - must specify tunnel outer UDP local (destination) port, wildcards not
> allowed.
> -  - must specify tunnel outer VNI, wildcards not allowed.
> -  - must specify tunnel outer local (destination)  IPv4 or IPv6 address,
> wildcards not allowed.
> -  - optionally may specify tunnel outer remote (source) IPv4 or IPv6,
> wildcards or group IPs allowed.
>    - optionally may specify tunnel inner source and destination MAC
> addresses.
> 
> -- E-Switch VXLAN encapsulation Flow:
> +- E-Switch  encapsulation Flow:
> 
>    - can be applied to VF ports only.
>    - must specify PF port action (packet redirection from VF to PF).
> -  - must specify the VXLAN item with tunnel outer parameters.
> -  - must specify the tunnel outer VNI in the VXLAN item.
> -  - must specify the tunnel outer remote (destination) UDP port in the
> VXLAN item.
> -  - must specify the tunnel outer local (source) IPv4 or IPv6 in the , this
> address will locally (with scope link) assigned to the outer network interface,
> wildcards not allowed.
> -  - must specify the tunnel outer remote (destination) IPv4 or IPv6 in the
> VXLAN item, group IPs allowed.
> -  - must specify the tunnel outer destination MAC address in the VXLAN
> item, this address will be used to create neigh rule.
> +
> +- Groups, E-Switch steering, fast insertion rate are supported only with
> Mellanox OFED 4.6.2 and above.

I think the right place for such limitations is the table below (Supported hardware offloads using rte_flow API). 

> 
>  Statistics
>  ----------
> @@ -227,7 +211,7 @@ These options can be modified in the ``.config`` file.
> 
>  .. note::
> 
> -   For Bluefield, target should be set to ``arm64-bluefield-linux-gcc``. This
> +   For BlueField, target should be set to
> + ``arm64-bluefield-linux-gcc``. This
>     will enable ``CONFIG_RTE_LIBRTE_MLX5_PMD`` and set
> ``RTE_CACHE_LINE_SIZE`` to
>     64. Default armv8a configuration of make build and meson build set it to
> 128
>     then brings performance degradation.
> @@ -277,8 +261,8 @@ Run-time configuration
> 
>    Supported on:
> 
> -  - x86_64 with ConnectX-4, ConnectX-4 LX, ConnectX-5, ConnectX-6 and
> Bluefield.
> -  - POWER8 and ARMv8 with ConnectX-4 LX, ConnectX-5, ConnectX-6 and
> Bluefield.
> +  - x86_64 with ConnectX-4, ConnectX-4 LX, ConnectX-5, ConnectX-6 and
> BlueField.
> +  - POWER9 and ARMv8 with ConnectX-4 LX, ConnectX-5, ConnectX-6 and
> BlueField.
> 
>  - ``rxq_cqe_pad_en`` parameter [int]
> 
> @@ -296,7 +280,7 @@ Run-time configuration
> 
>    Supported on:
> 
> -  - CPU having 128B cacheline with ConnectX-5 and Bluefield.
> +  - CPU having 128B cacheline with ConnectX-5 and BlueField.
> 
>  - ``rxq_pkt_pad_en`` parameter [int]
> 
> @@ -308,8 +292,8 @@ Run-time configuration
> 
>    Supported on:
> 
> -  - x86_64 with ConnectX-4, ConnectX-4 LX, ConnectX-5, ConnectX-6 and
> Bluefield.
> -  - POWER8 and ARMv8 with ConnectX-4 LX, ConnectX-5, ConnectX-6 and
> Bluefield.
> +  - x86_64 with ConnectX-4, ConnectX-4 LX, ConnectX-5, ConnectX-6 and
> BlueField.
> +  - POWER8 and ARMv8 with ConnectX-4 LX, ConnectX-5, ConnectX-6 and
> BlueField.
> 
>  - ``mprq_en`` parameter [int]
> 
> @@ -375,13 +359,13 @@ Run-time configuration
> 
>    This option should be used in combination with ``txq_inline`` above.
> 
> -  On ConnectX-4, ConnectX-4 LX, ConnectX-5, ConnectX-6 and Bluefield
> without
> +  On ConnectX-4, ConnectX-4 LX, ConnectX-5, ConnectX-6 and BlueField
> + without
>    Enhanced MPW:
> 
>          - Disabled by default.
>          - In case ``txq_inline`` is set recommendation is 4.
> 
> -  On ConnectX-5, ConnectX-6 and Bluefield with Enhanced MPW:
> +  On ConnectX-5, ConnectX-6 and BlueField with Enhanced MPW:
> 
>          - Set to 8 by default.
> 
> @@ -395,14 +379,14 @@ Run-time configuration
>          - Set to 8 by default on ARMv8.
>          - Set to 4 by default otherwise.
> 
> -  On Bluefield
> +  On BlueField
> 
>          - Set to 16 by default.
> 
>  - ``txq_mpw_en`` parameter [int]
> 
>    A nonzero value enables multi-packet send (MPS) for ConnectX-4 Lx and
> -  enhanced multi-packet send (Enhanced MPS) for ConnectX-5, ConnectX-6
> and Bluefield.
> +  enhanced multi-packet send (Enhanced MPS) for ConnectX-5, ConnectX-6
> and BlueField.
>    MPS allows the TX burst function to pack up multiple packets in a
>    single descriptor session in order to save PCI bandwidth and improve
>    performance at the cost of a slightly higher CPU usage. When @@ -417,13
> +401,13 @@ Run-time configuration
>    DEV_TX_OFFLOAD_VXLAN_TNL_TSO, DEV_TX_OFFLOAD_GRE_TNL_TSO,
> DEV_TX_OFFLOAD_VLAN_INSERT``.
>    When those offloads are requested the MPS send function will not be
> used.
> 
> -  It is currently only supported on the ConnectX-4 Lx, ConnectX-5, ConnectX-
> 6 and Bluefield
> +  It is currently only supported on the ConnectX-4 Lx, ConnectX-5,
> + ConnectX-6 and BlueField
>    families of adapters.
>    On ConnectX-4 Lx the MPW is considered un-secure hence disabled by
> default.
>    Users which enable the MPW should be aware that application which
> provides incorrect
>    mbuf descriptors in the Tx burst can lead to serious errors in the host
> including, on some cases,
>    NIC to get stuck.
> -  On ConnectX-5, ConnectX-6 and Bluefield the MPW is secure and enabled
> by default.
> +  On ConnectX-5, ConnectX-6 and BlueField the MPW is secure and enabled
> by default.
> 
>  - ``txq_mpw_hdr_dseg_en`` parameter [int]
> 
> @@ -443,14 +427,14 @@ Run-time configuration
> 
>  - ``tx_vec_en`` parameter [int]
> 
> -  A nonzero value enables Tx vector on ConnectX-5, ConnectX-6 and
> Bluefield NICs if the number of
> +  A nonzero value enables Tx vector on ConnectX-5, ConnectX-6 and
> + BlueField NICs if the number of
>    global Tx queues on the port is less than ``txqs_max_vec``.
> 
>    This option cannot be used with certain offloads such as
> ``DEV_TX_OFFLOAD_TCP_TSO,
>    DEV_TX_OFFLOAD_VXLAN_TNL_TSO, DEV_TX_OFFLOAD_GRE_TNL_TSO,
> DEV_TX_OFFLOAD_VLAN_INSERT``.
>    When those offloads are requested the MPS send function will not be
> used.
> 
> -  Enabled by default on ConnectX-5, ConnectX-6 and Bluefield.
> +  Enabled by default on ConnectX-5, ConnectX-6 and BlueField.
> 
>  - ``rx_vec_en`` parameter [int]
> 
> @@ -480,10 +464,15 @@ Run-time configuration
> 
>    A nonzero value enables the DV flow steering assuming it is supported
>    by the driver.
> -  The DV flow steering is not supported on switchdev mode.
> 
>    Disabled by default.
> 
> +- ``dv_esw_en`` parameter [int]
> +
> +  A nonzero value enables E-Switch using Direct Rules.
> +
> +  Enabled by default if supported.
> +
>  - ``mr_ext_memseg_en`` parameter [int]
> 
>    A nonzero value enables extending memseg when registering DMA
> memory. If @@ -545,7 +534,7 @@ DPDK and must be installed separately:
>  - **libmlx5**
> 
>    Low-level user space driver library for Mellanox
> -  ConnectX-4/ConnectX-5/ConnectX-6/Bluefield devices, it is automatically
> loaded
> +  ConnectX-4/ConnectX-5/ConnectX-6/BlueField devices, it is
> + automatically loaded
>    by libibverbs.
> 
>    This library basically implements send/receive calls to the hardware @@ -
> 567,7 +556,7 @@ DPDK and must be installed separately:
>    their devices:
> 
>    - mlx5_core: hardware driver managing Mellanox
> -    ConnectX-4/ConnectX-5/ConnectX-6/Bluefield devices and related
> Ethernet kernel
> +    ConnectX-4/ConnectX-5/ConnectX-6/BlueField devices and related
> + Ethernet kernel
>      network devices.
>    - mlx5_ib: InifiniBand device driver.
>    - ib_uverbs: user space driver for Verbs (entry point for libibverbs).
> @@ -575,7 +564,7 @@ DPDK and must be installed separately:
>  - **Firmware update**
> 
>    Mellanox OFED/EN releases include firmware updates for
> -  ConnectX-4/ConnectX-5/ConnectX-6/Bluefield adapters.
> +  ConnectX-4/ConnectX-5/ConnectX-6/BlueField adapters.
> 
>    Because each release provides new features, these updates must be
> applied to
>    match the kernel modules and libraries they come with.
> @@ -595,7 +584,7 @@ releases.
>  RDMA Core with Linux Kernel
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> -- Minimal kernel version : v4.14 or the most recent 4.14-rc (see `Linux
> installation documentation`_)
> +- Minimal kernel version : v4.14 or the most recent 5.1.0-rc (see
> +`Linux installation documentation`_)
>  - Minimal rdma-core version: v15+ commit 0c5f5765213a ("Merge pull
> request #227 from yishaih/tm")
>    (see `RDMA Core installation documentation`_)
>  - When building for i686 use:
> @@ -622,7 +611,8 @@ thanks to these environment variables:
>  Mellanox OFED/EN
>  ^^^^^^^^^^^^^^^^
> 
> -- Mellanox OFED version: **4.4, 4.5** / Mellanox EN version: **4.5**
> +- Mellanox OFED version: **4.5, 4.6** /
> +  Mellanox EN version: **4.5, 4.6**
>  - firmware version:
> 
>    - ConnectX-4: **12.21.1000** and above.
> @@ -630,7 +620,7 @@ Mellanox OFED/EN
>    - ConnectX-5: **16.21.1000** and above.
>    - ConnectX-5 Ex: **16.21.1000** and above.
>    - ConnectX-6: **20.99.5374** and above.
> -  - Bluefield: **18.99.3950** and above.
> +  - BlueField: **18.25.1010** and above.
> 
>  While these libraries and kernel modules are available on OpenFabrics
> Alliance's `website <https://www.openfabrics.org/>`__ and provided by
> package @@ -766,6 +756,34 @@ Quick Start Guide on OFED/EN  6. Compile
> DPDK and you are ready to go. See instructions on
>     :ref:`Development Kit Build System <Development_Kit_Build_System>`
> 
> +Enable E-Switch mode
> +--------------------

This is how to enable switchdev mode, right? Not E-Switch mode.
Please add a few words on what switchdev mode is and why it is needed.

> +
> +1. Enable SRIOV mode:
> +
> +  .. code-block:: console
> +
> +        mlxconfig -d <mst device> set SRIOV_EN=true
> +
> +2. Configure the number of VFs:

This is the max number of VFs.

> +
> +  .. code-block:: console
> +
> +        mlxconfig -d <mst device> set NUM_OF_VFS=<num of vfs>
> +        echo <num of vfs > /sys/class/net/<net device>/device/sriov_numvfs

The FW needs to be reset after this.

Also - where do you create the actual VFs?

> +
> +3. Unbind the device (can be rebind after the switchdev mode):
> +
> +  .. code-block:: console
> +
> +        echo -n "<device pci address>" > /sys/bus/pci/drivers/mlx5_core/unbind
> +
> +4. Enable switchdev mode:
> +
> +  .. code-block:: console
> +
> +        echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
> +
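
To make the comments on the steps above concrete, here is a rough sketch of how the full sequence could read, with the firmware reset and the VF creation spelled out. It is a dry run that only prints the commands. The device names, the PCI address and the use of ``mlxfwreset`` for the reset are my assumptions, not something taken from the patch; a reboot would do as well.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for review instead of executing them.
# All four arguments are placeholders chosen for illustration.
switchdev_steps() {
    mst_dev=$1   # MST device, as used in step 1
    netdev=$2    # kernel network device of the PF
    pci_addr=$3  # PCI address of the device to unbind in step 3
    num_vfs=$4   # number of VFs

    # Steps 1-2: enable SR-IOV and set the maximum number of VFs in firmware.
    printf '%s\n' "mlxconfig -d $mst_dev set SRIOV_EN=true NUM_OF_VFS=$num_vfs"
    # Assumption: reset the firmware with mlxfwreset; a reboot is an alternative.
    printf '%s\n' "mlxfwreset -d $mst_dev reset"
    # Create the actual VFs through sysfs.
    printf '%s\n' "echo $num_vfs > /sys/class/net/$netdev/device/sriov_numvfs"
    # Step 3: unbind the device from mlx5_core.
    printf '%s\n' "echo -n $pci_addr > /sys/bus/pci/drivers/mlx5_core/unbind"
    # Step 4: move the E-Switch to switchdev mode.
    printf '%s\n' "echo switchdev > /sys/class/net/$netdev/compat/devlink/mode"
}

# Example invocation with made-up values.
switchdev_steps /dev/mst/mt4119_pciconf0 ens1f0 0000:03:00.2 2
```

Printing instead of executing keeps the order of operations easy to review before anything touches the firmware or sysfs.
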
>  Performance tuning
>  ------------------
> 
> @@ -842,6 +860,38 @@ Performance tuning
>     - Configure per-lcore cache when creating Mempools for packet buffer.
>     - Refrain from dynamically allocating/freeing memory in run-time.
> 
> +Supported hardware offloads using rte_flow API
> +----------------------------------------------
> +
> +.. _Supported hardware offloads using rte_flow API:
> +
> +.. table:: Supported hardware offloads using rte_flow API
> +
> +   +---------------+-----------+-------------+-----------------+----------------+
> +   | Offload       | E-Switch  | NIC PF      | VF representor  | PF representor |
> +   |               | (SRIOV)   | (non SRIOV) | (SRIOV)         | (SRIOV)        |
> +   +===============+===========+=============+=================+================+
> +   | Encapsulation |     V     |     V       |        V        |       V*       |
> +   +---------------+-----------+-------------+-----------------+----------------+
> +   | Multi table   |     V     |     V       |        V        |       V*       |
> +   | support       |           |             |                 |                |
> +   +---------------+-----------+-------------+-----------------+----------------+
> +   | NAT           |     V     |     V       |        V        |       V*       |
> +   +---------------+-----------+-------------+-----------------+----------------+
> +   | Routing       |     V     |     V       |        V        |       V*       |
> +   +---------------+-----------+-------------+-----------------+----------------+
> +   | TTL inc/dec   |     V     |     V       |        V        |       V*       |
> +   +---------------+-----------+-------------+-----------------+----------------+
> +   | Count         |     V     |     V       |        V        |       V*       |
> +   +---------------+-----------+-------------+-----------------+----------------+
> +   | Drop          |     V     |     V       |        V        |       V*       |
> +   +---------------+-----------+-------------+-----------------+----------------+
> +   | Mark          |    N/A    |     V       |        V        |       V*       |
> +   +---------------+-----------+-------------+-----------------+----------------+
> +
> +| V ConnectX-5
> +| V* ConnectX-5 & BlueField
> +

What is the difference between E-Switch (SRIOV), VF representor (SRIOV) and PF representor (SRIOV)? In switchdev mode every rule is for either a VF representor or the uplink representor.

The matrix is a bit more complex than what you show here.
IMO the matrix has 2 columns (E-Switch and NIC) and 2 or 3 dimensions (minimal NIC that supports it, minimal DPDK that supports it, and possibly minimal OFED/upstream kernel that supports it).
The columns need to be RTE_FLOW actions (let's leave the RTE_FLOW items aside for a moment), because NAT, for example, may require different actions on different implementations.


>  Notes for testpmd
>  -----------------
> 
> @@ -863,7 +913,7 @@ Usage example
>  -------------
> 
>  This section demonstrates how to launch **testpmd** with Mellanox -
> ConnectX-4/ConnectX-5/ConnectX-6/Bluefield devices managed by
> librte_pmd_mlx5.
> +ConnectX-4/ConnectX-5/ConnectX-6/BlueField devices managed by
> librte_pmd_mlx5.
> 
>  #. Load the kernel modules:
> 
> diff --git a/doc/guides/rel_notes/release_19_05.rst
> b/doc/guides/rel_notes/release_19_05.rst
> index 4e0eed5..489273b 100644
> --- a/doc/guides/rel_notes/release_19_05.rst
> +++ b/doc/guides/rel_notes/release_19_05.rst
> @@ -148,10 +148,16 @@ New Features
>     * Added support for multiport InfiniBand device.
>     * Added control of excessive memory pinning by kernel.
>     * Added support of DMA memory registration by secondary process.
> -   * Added Direct Rule support in Direct Verbs flow driver.
>     * Added support of per-process device registers, reserving identical VA
> space
>       is not needed anymore.
> -   * Added E-Switch support in Direct Verbs flow driver.
> +   * Added Direct Rule support for Nic steering, in Direct Verbs driver.
> +   * Added Direct Rule support for E-Switch steering, in Direct Verbs driver.

People who read the release notes don't necessarily read the entire mlx5 manual. It is not clear what a Direct Rule is or why it is good.
I would drop this part.


> +   * Added support for jump action for both E-Switch and Nic when using
> +     Direct Rules.

How can a user know whether Direct Rules are in use or not? I would just stick with:
"Support rte_flow jump action"

> +   * Added Group Support for Nic steering when using Direct Rules.

Again:
"Support multiple rte_flow groups"
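
What the user actually sees is groups and the jump action, roughly as in the testpmd rules below. This is only an illustrative sketch: the file name, port 0 and queue index 1 are made up, and whether a given setup accepts these rules depends on the driver, firmware and OFED/kernel in use.

```shell
#!/bin/sh
# Write two illustrative testpmd flow rules to a command file.
# flows.txt, port 0 and queue index 1 are placeholders for illustration.
flows=flows.txt
cat > "$flows" <<'EOF'
flow create 0 group 0 ingress pattern eth / end actions jump group 1 / end
flow create 0 group 1 ingress pattern eth / ipv4 / end actions queue index 1 / end
EOF

# Show the rules that were written.
cat "$flows"
```

The file can then be passed to testpmd with ``--cmdline-file=flows.txt``. The first rule sends all Ethernet traffic from group 0 to group 1, and the second rule in group 1 steers IPv4 packets to a queue.
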

> +   * Support millions of offloaded flow rules.
> +   * Improved flow insertion rate to millions of flows per second when using
> +     Direct Rules and group larger than 0.

I would say:
* Flow engine redesign to support large-scale deployments. This includes:
1. Support for millions of rte_flow rules.
2. Fast flow insertion and deletion, up to 1M flow updates per second.


> 
>  * **Renamed avf to iavf.**
> 
> --
> 1.8.3.1


