[dpdk-stable] patch 'net/mlx5: fix instruction hotspot on replenishing Rx buffer' has been queued to LTS release 17.11.7

Yongseok Koh yskoh at mellanox.com
Tue Jul 23 02:59:29 CEST 2019


Hi,

FYI, your patch has been queued to LTS release 17.11.7

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections by 07/27/19, so please
shout if anyone objects.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch, which shows whether any rebasing was needed to
apply it to the stable branch. If the rebase required code changes (i.e. not
only metadata diffs), please double-check that it was done correctly.

Thanks.

Yongseok

---
From 0ef4e25f6173b692ca138ab0aba42d80a8645e28 Mon Sep 17 00:00:00 2001
From: Yongseok Koh <yskoh at mellanox.com>
Date: Mon, 14 Jan 2019 13:16:22 -0800
Subject: [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer

[ backported from upstream commit 9c55c6bd86156d17df93bf947dc620222ee9f7e4 ]

On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr doesn't need to
be accessed, as it is static and easily calculated from the mbuf address.
Accessing the mbuf content causes an unnecessary load stall. Non-x86
processors (mostly RISC, such as ARM and Power) are more vulnerable to load
stalls; for x86, reducing the number of instructions seems to matter most.

Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")

Signed-off-by: Yongseok Koh <yskoh at mellanox.com>
Acked-by: Shahaf Shuler <shahafs at mellanox.com>
---
 drivers/net/mlx5/mlx5_rxtx_vec.h | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index 750559b8d1..e7367b74d8 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -116,7 +116,22 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
 		return;
 	}
 	for (i = 0; i < n; ++i) {
-		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
+		void *buf_addr;
+
+		/*
+		 * Load the virtual address for Rx WQE. non-x86 processors
+		 * (mostly RISC such as ARM and Power) are more vulnerable to
+		 * load stall. For x86, reducing the number of instructions
+		 * seems to matter most.
+		 */
+#ifdef RTE_ARCH_X86_64
+		buf_addr = elts[i]->buf_addr;
+#else
+		buf_addr = (char *)elts[i] + sizeof(struct rte_mbuf) +
+			   rte_pktmbuf_priv_size(rxq->mp);
+		assert(buf_addr == elts[i]->buf_addr);
+#endif
+		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
 					      RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKEY in WQEs. */
 		if (unlikely(!IS_SINGLE_MR(rxq->mr_ctrl.bh_n)))
-- 
2.21.0

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2019-07-22 17:55:06.640266135 -0700
+++ 0002-net-mlx5-fix-instruction-hotspot-on-replenishing-Rx-.patch	2019-07-22 17:55:05.698471000 -0700
@@ -1,37 +1,35 @@
-From 9c55c6bd86156d17df93bf947dc620222ee9f7e4 Mon Sep 17 00:00:00 2001
+From 0ef4e25f6173b692ca138ab0aba42d80a8645e28 Mon Sep 17 00:00:00 2001
 From: Yongseok Koh <yskoh at mellanox.com>
-Date: Mon, 25 Mar 2019 12:13:10 -0700
-Subject: [PATCH] net/mlx5: revert mbuf address calculation for x86
+Date: Mon, 14 Jan 2019 13:16:22 -0800
+Subject: [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer
 
-When replenishing mbufs on Rx, buffer address (mbuf->buf_addr) should be
-loaded. non-x86 processors (mostly RISC such as ARM and Power) are more
-vulnerable to load stall. For x86, reducing the number of instructions
-seems to matter most.
-
-For x86, this is simply a load but for other architectures, it is
-calculated from the address of mbuf structure by rte_mbuf_buf_addr()
-without having to load the first cacheline of the mbuf.
+[ backported from upstream commit 9c55c6bd86156d17df93bf947dc620222ee9f7e4 ]
 
-Fixes: 12d468a62bc1 ("net/mlx5: fix instruction hotspot on replenishing Rx buffer")
-Cc: stable at dpdk.org
+On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr doesn't need to
+be accessed, as it is static and easily calculated from the mbuf address.
+Accessing the mbuf content causes an unnecessary load stall. Non-x86
+processors (mostly RISC, such as ARM and Power) are more vulnerable to load
+stalls; for x86, reducing the number of instructions seems to matter most.
+
+Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
 
 Signed-off-by: Yongseok Koh <yskoh at mellanox.com>
 Acked-by: Shahaf Shuler <shahafs at mellanox.com>
 ---
- drivers/net/mlx5/mlx5_rxtx_vec.h | 14 +++++++++++++-
- 1 file changed, 13 insertions(+), 1 deletion(-)
+ drivers/net/mlx5/mlx5_rxtx_vec.h | 17 ++++++++++++++++-
+ 1 file changed, 16 insertions(+), 1 deletion(-)
 
 diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
-index 5df8e291e6..4220b08dd2 100644
+index 750559b8d1..e7367b74d8 100644
 --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
 +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
-@@ -102,9 +102,21 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
+@@ -116,7 +116,22 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
  		return;
  	}
  	for (i = 0; i < n; ++i) {
--		void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
+-		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
 +		void *buf_addr;
- 
++
 +		/*
 +		 * Load the virtual address for Rx WQE. non-x86 processors
 +		 * (mostly RISC such as ARM and Power) are more vulnerable to
@@ -40,14 +38,15 @@
 +		 */
 +#ifdef RTE_ARCH_X86_64
 +		buf_addr = elts[i]->buf_addr;
-+		assert(buf_addr == rte_mbuf_buf_addr(elts[i], rxq->mp));
 +#else
-+		buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
- 		assert(buf_addr == elts[i]->buf_addr);
++		buf_addr = (char *)elts[i] + sizeof(struct rte_mbuf) +
++			   rte_pktmbuf_priv_size(rxq->mp);
++		assert(buf_addr == elts[i]->buf_addr);
 +#endif
- 		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
++		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
  					      RTE_PKTMBUF_HEADROOM);
- 		/* If there's only one MR, no need to replace LKey in WQE. */
+ 		/* If there's only one MR, no need to replace LKEY in WQEs. */
+ 		if (unlikely(!IS_SINGLE_MR(rxq->mr_ctrl.bh_n)))
 -- 
 2.21.0
 

