patch 'net/i40e: fix AVX-512 pointer copy on 32-bit' has been queued to stable release 23.11.3
Xueming Li
xuemingl at nvidia.com
Mon Nov 11 07:27:55 CET 2024
Hi,
FYI, your patch has been queued to stable release 23.11.3
Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 11/30/24. So please
shout if anyone has objections.
Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(i.e. not only metadata diffs), please double check that the rebase was
correctly done.
Queued patches are on a temporary branch at:
https://git.dpdk.org/dpdk-stable/log/?h=23.11-staging
This queued commit can be viewed at:
https://git.dpdk.org/dpdk-stable/commit/?h=23.11-staging&id=ff90a3bb8523c29d5e02b6ff2c8e79345ba177be
Thanks.
Xueming Li <xuemingl at nvidia.com>
---
From ff90a3bb8523c29d5e02b6ff2c8e79345ba177be Mon Sep 17 00:00:00 2001
From: Bruce Richardson <bruce.richardson at intel.com>
Date: Fri, 6 Sep 2024 15:11:24 +0100
Subject: [PATCH] net/i40e: fix AVX-512 pointer copy on 32-bit
Cc: Xueming Li <xuemingl at nvidia.com>
[ upstream commit 2d040df2437a025ef6d2ecf72de96d5c9fe97439 ]
The size of a pointer on 32-bit is only 4 rather than 8 bytes, so
copying 32 pointers only requires half the number of AVX-512 load store
operations.
Fixes: 5171b4ee6b6b ("net/i40e: optimize Tx by using AVX512")
Signed-off-by: Bruce Richardson <bruce.richardson at intel.com>
Acked-by: Ian Stokes <ian.stokes at intel.com>
---
drivers/net/i40e/i40e_rxtx_vec_avx512.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx512.c b/drivers/net/i40e/i40e_rxtx_vec_avx512.c
index f3050cd06c..62fce19dc4 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_avx512.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_avx512.c
@@ -799,6 +799,7 @@ i40e_tx_free_bufs_avx512(struct i40e_tx_queue *txq)
 		uint32_t copied = 0;
 		/* n is multiple of 32 */
 		while (copied < n) {
+#ifdef RTE_ARCH_64
 			const __m512i a = _mm512_load_si512(&txep[copied]);
 			const __m512i b = _mm512_load_si512(&txep[copied + 8]);
 			const __m512i c = _mm512_load_si512(&txep[copied + 16]);
@@ -808,6 +809,12 @@ i40e_tx_free_bufs_avx512(struct i40e_tx_queue *txq)
 			_mm512_storeu_si512(&cache_objs[copied + 8], b);
 			_mm512_storeu_si512(&cache_objs[copied + 16], c);
 			_mm512_storeu_si512(&cache_objs[copied + 24], d);
+#else
+			const __m512i a = _mm512_load_si512(&txep[copied]);
+			const __m512i b = _mm512_load_si512(&txep[copied + 16]);
+			_mm512_storeu_si512(&cache_objs[copied], a);
+			_mm512_storeu_si512(&cache_objs[copied + 16], b);
+#endif
 			copied += 32;
 		}
 		cache->len += n;
--
2.34.1
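
For readers checking the arithmetic in the commit message, here is a minimal
portable sketch, not the driver code. It assumes a 64-byte vector width (one
AVX-512 register) and the batch of 32 pointers used by the loop in the patch.
The names VEC_BYTES, BATCH_PTRS, ptrs_per_vec, vecs_per_batch and
copy_ptr_batches are hypothetical, and memcpy stands in for the
_mm512_load_si512/_mm512_storeu_si512 pair so the sketch builds without
AVX-512 support.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES  64u  /* assumed width of one AVX-512 register, in bytes */
#define BATCH_PTRS 32u  /* pointers handled per loop iteration in the patch */

/* Pointers that fit in one 64-byte vector: 8 if pointers are 8 bytes,
 * 16 if they are 4 bytes. */
static inline size_t ptrs_per_vec(size_t ptr_size)
{
	return VEC_BYTES / ptr_size;
}

/* Load/store pairs needed to move one batch of 32 pointers:
 * 4 with 8-byte pointers, 2 with 4-byte pointers. */
static inline size_t vecs_per_batch(size_t ptr_size)
{
	return (BATCH_PTRS * ptr_size) / VEC_BYTES;
}

/* Hypothetical scalar stand-in for the vector loop: copies n pointers
 * (n must be a multiple of 32) in 64-byte chunks, stepping by however
 * many pointers one chunk holds on the build target. */
static inline void copy_ptr_batches(void **dst, void *const *src, size_t n)
{
	const size_t step = ptrs_per_vec(sizeof(void *));

	assert(n % BATCH_PTRS == 0);
	for (size_t copied = 0; copied < n; copied += BATCH_PTRS)
		for (size_t off = 0; off < BATCH_PTRS; off += step)
			memcpy(&dst[copied + off], &src[copied + off], VEC_BYTES);
}
```

With 8-byte pointers the inner loop runs at offsets 0, 8, 16 and 24, matching
the RTE_ARCH_64 branch; with 4-byte pointers it runs at offsets 0 and 16,
matching the #else branch.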
---
Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- - 2024-11-11 14:23:08.441488072 +0800
+++ 0069-net-i40e-fix-AVX-512-pointer-copy-on-32-bit.patch 2024-11-11 14:23:05.172192839 +0800
@@ -1 +1 @@
-From 2d040df2437a025ef6d2ecf72de96d5c9fe97439 Mon Sep 17 00:00:00 2001
+From ff90a3bb8523c29d5e02b6ff2c8e79345ba177be Mon Sep 17 00:00:00 2001
@@ -4,0 +5,3 @@
+Cc: Xueming Li <xuemingl at nvidia.com>
+
+[ upstream commit 2d040df2437a025ef6d2ecf72de96d5c9fe97439 ]
@@ -11 +13,0 @@
-Cc: stable at dpdk.org
@@ -20 +22 @@
-index 0238b03f8a..3b2750221b 100644
+index f3050cd06c..62fce19dc4 100644