[PATCH v2 3/3] app/testpmd: interleave SSE SIMD

Vipin Varghese vipin.varghese at amd.com
Wed Aug 21 16:38:57 CEST 2024


Interleaving the SSE SIMD load, shuffle, and store operations
improves the overall mac-swap Mpps for both RX and TX.
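
For reference, the reordering can be sketched outside of testpmd as
below. This is a minimal, standalone illustration and not the code in
macswap_sse.h: the names macswap4_batched, macswap4_interleaved,
macswap4_self_check, swap_mask and SSSE3_FN are hypothetical, each
header buffer is assumed to be at least 16 bytes and not to overlap
another, and the target attribute assumes GCC or Clang on x86.

```c
#include <stdint.h>
#include <string.h>
#include <tmmintrin.h> /* SSSE3: _mm_shuffle_epi8 */

/* Assumption: GCC or Clang on x86. Enables SSSE3 per function so the
 * sketch builds without -mssse3. */
#define SSSE3_FN __attribute__((target("ssse3")))

/* Byte shuffle that swaps dst MAC (bytes 0-5) with src MAC (bytes 6-11)
 * and leaves bytes 12-15 in place. */
SSSE3_FN static __m128i
swap_mask(void)
{
	return _mm_set_epi8(15, 14, 13, 12, 5, 4, 3, 2, 1, 0,
			    11, 10, 9, 8, 7, 6);
}

/* Batched order: all loads, then all shuffles, then all stores. */
SSSE3_FN static void
macswap4_batched(uint8_t *hdr[4])
{
	const __m128i msk = swap_mask();
	__m128i a[4];
	int k;

	for (k = 0; k < 4; k++)
		a[k] = _mm_loadu_si128((const __m128i *)hdr[k]);
	for (k = 0; k < 4; k++)
		a[k] = _mm_shuffle_epi8(a[k], msk);
	for (k = 0; k < 4; k++)
		_mm_storeu_si128((__m128i *)hdr[k], a[k]);
}

/* Interleaved order: the shuffle and store of earlier headers are
 * issued between the loads of later headers. */
SSSE3_FN static void
macswap4_interleaved(uint8_t *hdr[4])
{
	const __m128i msk = swap_mask();
	__m128i a0, a1, a2, a3;

	a0 = _mm_loadu_si128((const __m128i *)hdr[0]);
	a1 = _mm_loadu_si128((const __m128i *)hdr[1]);
	a0 = _mm_shuffle_epi8(a0, msk);

	a2 = _mm_loadu_si128((const __m128i *)hdr[2]);
	a1 = _mm_shuffle_epi8(a1, msk);
	_mm_storeu_si128((__m128i *)hdr[0], a0);

	a3 = _mm_loadu_si128((const __m128i *)hdr[3]);
	a2 = _mm_shuffle_epi8(a2, msk);
	_mm_storeu_si128((__m128i *)hdr[1], a1);

	a3 = _mm_shuffle_epi8(a3, msk);
	_mm_storeu_si128((__m128i *)hdr[2], a2);
	_mm_storeu_si128((__m128i *)hdr[3], a3);
}

/* Returns 1 if both orders give identical, correctly swapped headers. */
static int
macswap4_self_check(void)
{
	uint8_t a[4][16], b[4][16];
	uint8_t *pa[4], *pb[4];
	int p, j;

	for (p = 0; p < 4; p++) {
		for (j = 0; j < 16; j++)
			a[p][j] = b[p][j] = (uint8_t)(p * 16 + j);
		pa[p] = a[p];
		pb[p] = b[p];
	}

	macswap4_batched(pa);
	macswap4_interleaved(pb);

	if (memcmp(a, b, sizeof(a)) != 0)
		return 0;

	for (p = 0; p < 4; p++) {
		for (j = 0; j < 16; j++) {
			/* Source index each output byte should come from. */
			int src = j < 6 ? j + 6 : (j < 12 ? j - 6 : j);

			if (a[p][j] != (uint8_t)(p * 16 + src))
				return 0;
		}
	}
	return 1;
}
```

Both orders produce the same bytes; only the instruction schedule
differs. The compiler and the out-of-order core may reorder these
operations anyway, so any gain has to be confirmed by measurement on
the target platform.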

Test Result:
 * Platform: AMD EPYC 9554 @ 3.1GHz, no boost
 * Test scenarios: testpmd 64B IO vs MAC-SWAP
 * NIC: Broadcom P2100: loopback 2*100Gbps

 <mode : Mpps Ingress: Mpps Egress>
 ------------------------------------------------
  - MAC-SWAP original: 45.75 : 43.8
  - MAC-SWAP register mod: 45.73 : 44.83
  - MAC-SWAP register+ofl mod: 46.36 : 44.79
  - MAC-SWAP register+ofl+interleave mod: 46.0 : 45.1

Signed-off-by: Vipin Varghese <vipin.varghese at amd.com>
---
 app/test-pmd/macswap_sse.h | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/app/test-pmd/macswap_sse.h b/app/test-pmd/macswap_sse.h
index 67ff7fdfbb..1f547388b7 100644
--- a/app/test-pmd/macswap_sse.h
+++ b/app/test-pmd/macswap_sse.h
@@ -52,23 +52,25 @@ do_macswap(struct rte_mbuf *pkts[], uint16_t nb,
 		addr1 = _mm_loadu_si128((__m128i *)eth_hdr[1]);
 		mbuf_field_set(mb[1], ol_flags);
 
+		addr0 = _mm_shuffle_epi8(addr0, shfl_msk);
+
 		mb[2] = pkts[i++];
 		eth_hdr[2] = rte_pktmbuf_mtod(mb[2], struct rte_ether_hdr *);
 		addr2 = _mm_loadu_si128((__m128i *)eth_hdr[2]);
 		mbuf_field_set(mb[2], ol_flags);
 
+		addr1 = _mm_shuffle_epi8(addr1, shfl_msk);
+		_mm_storeu_si128((__m128i *)eth_hdr[0], addr0);
+
 		mb[3] = pkts[i++];
 		eth_hdr[3] = rte_pktmbuf_mtod(mb[3], struct rte_ether_hdr *);
 		addr3 = _mm_loadu_si128((__m128i *)eth_hdr[3]);
 		mbuf_field_set(mb[3], ol_flags);
 
-		addr0 = _mm_shuffle_epi8(addr0, shfl_msk);
-		addr1 = _mm_shuffle_epi8(addr1, shfl_msk);
 		addr2 = _mm_shuffle_epi8(addr2, shfl_msk);
-		addr3 = _mm_shuffle_epi8(addr3, shfl_msk);
-
-		_mm_storeu_si128((__m128i *)eth_hdr[0], addr0);
 		_mm_storeu_si128((__m128i *)eth_hdr[1], addr1);
+
+		addr3 = _mm_shuffle_epi8(addr3, shfl_msk);
 		_mm_storeu_si128((__m128i *)eth_hdr[2], addr2);
 		_mm_storeu_si128((__m128i *)eth_hdr[3], addr3);
 
-- 
2.34.1


