[dpdk-dev] [PATCH v2] net/mlx4: workaround to verbs wrong error return

Matan Azrad matan at mellanox.com
Wed Aug 2 21:00:50 CEST 2017


Current mlx4 OFED version has bug which returns error to
ibv destroy functions when the device was plugged out, in
spite of the resources were destroyed correctly.

Hence, failsafe PMD was aborted, only in debug mode, when
it tries to remove the device in plug-out process.

The workaround added option to replace all claim_zero
assertions with debugging messages, by the way, this option
affects non ibv destroy assertions.

DPDK 18.02 release should work with Mellanox OFED-4.2 which will
include the verbs fix to this bug, then, this patch can
be removed.

Signed-off-by: Matan Azrad <matan at mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil at 6wind.com>
---
 config/common_base        | 1 +
 doc/guides/nics/mlx4.rst  | 8 ++++++++
 drivers/net/mlx4/Makefile | 4 ++++
 drivers/net/mlx4/mlx4.h   | 6 ++++++
 4 files changed, 19 insertions(+)

This v2 is not perfect but satisfy.

diff --git a/config/common_base b/config/common_base
index 7805605..5e97a08 100644
--- a/config/common_base
+++ b/config/common_base
@@ -213,6 +213,7 @@ CONFIG_RTE_LIBRTE_FM10K_INC_VECTOR=y
 #
 CONFIG_RTE_LIBRTE_MLX4_PMD=n
 CONFIG_RTE_LIBRTE_MLX4_DEBUG=n
+CONFIG_RTE_LIBRTE_MLX4_DEBUG_BROKEN_VERBS=n
 CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N=4
 CONFIG_RTE_LIBRTE_MLX4_MAX_INLINE=0
 CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=8
diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst
index d5bf2b3..f8885b2 100644
--- a/doc/guides/nics/mlx4.rst
+++ b/doc/guides/nics/mlx4.rst
@@ -119,6 +119,14 @@ These options can be modified in the ``.config`` file.
   adds additional run-time checks and debugging messages at the cost of
   lower performance.
 
+- ``CONFIG_RTE_LIBRTE_MLX4_DEBUG_BROKEN_VERBS`` (default **n**)
+
+  Mellanox OFED versions earlier than 4.2 may return false errors from
+  Verbs object destruction APIs after the device is plugged out.
+  Enabling this option replaces assertion checks that cause the program
+  to abort with harmless debugging messages as a workaround.
+  Relevant only when CONFIG_RTE_LIBRTE_MLX4_DEBUG is enabled.
+
 - ``CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N`` (default **4**)
 
   Number of scatter/gather elements (SGEs) per work request (WR). Lowering
diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index 755c8a4..c045bd7 100644
--- a/drivers/net/mlx4/Makefile
+++ b/drivers/net/mlx4/Makefile
@@ -84,6 +84,10 @@ ifdef CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS
 CFLAGS += -DMLX4_PMD_SOFT_COUNTERS=$(CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS)
 endif
 
+ifeq ($(CONFIG_RTE_LIBRTE_MLX4_DEBUG_BROKEN_VERBS),y)
+CFLAGS += -DMLX4_PMD_DEBUG_BROKEN_VERBS
+endif
+
 include $(RTE_SDK)/mk/rte.lib.mk
 
 # Generate and clean-up mlx4_autoconf.h.
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index a2e0ae7..c0ade4f 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -182,7 +182,13 @@ enum {
 		(DEBUG__(__VA_ARGS__), 0)	\
 	})[0])
 #define DEBUG(...) DEBUG_(__VA_ARGS__, '\n')
+#ifndef MLX4_PMD_DEBUG_BROKEN_VERBS
 #define claim_zero(...) assert((__VA_ARGS__) == 0)
+#else /* MLX4_PMD_DEBUG_BROKEN_VERBS */
+#define claim_zero(...) \
+	(void)(((__VA_ARGS__) == 0) || \
+		DEBUG("Assertion `(" # __VA_ARGS__ ") == 0' failed (IGNORED)."))
+#endif /* MLX4_PMD_DEBUG_BROKEN_VERBS */
 #define claim_nonzero(...) assert((__VA_ARGS__) != 0)
 #define claim_positive(...) assert((__VA_ARGS__) >= 0)
 #else /* NDEBUG */
-- 
1.8.3.1



More information about the dev mailing list