[dpdk-dev] [PATCH] examples/l3fwd: fix NEON instructions

Guduri Prathyusha gprathyusha at caviumnetworks.com
Sun Oct 29 08:48:07 CET 2017


To group consecutive packets with the same destination port in bursts
of 4, the NEON vectors dp1 and dp2 must be computed such that, for
dst_port[] = {a,b,c,d,e,f,g,h,i,...}, dp1 contains <a,b,c,d> and dp2
contains <b,c,d,e> in the first iteration, and dp1 contains <e,f,g,h>
and dp2 contains <f,g,h,i> in the next iteration. In the last
iteration, dp2 should be <w,x,y,y>.

The existing code, however, incorrectly calculates dp1 as <d,e,f,g>
from the second iteration onward, which in turn leads to an incorrect
dp2 of <d,e,f,f> in the last iteration.

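This patch fixes the vextq_u16() calls that compute dp1 and dp2: dp1 is
now derived from dp2 inside the loop, and both extractions shift in
zeros instead of rotating dp1 into itself.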
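
The lane arithmetic above can be checked off-target with a scalar model.
The sketch below is illustrative only and is not part of the patch: the
u16x8 type and the model_* and dp*_ helpers are hypothetical stand-ins
for uint16x8_t, vextq_u16(), vld1q_u16() and the two patched statements,
and it assumes FWDSTEP is 4 and that dp1 and dp2 are loaded from
dst_port[0] and dst_port[1] before the first update.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a NEON uint16x8_t (8 lanes). */
typedef struct { uint16_t lane[8]; } u16x8;

/* Models vextq_u16(a, b, n): lanes n..7 of a, then lanes 0..n-1 of b. */
static u16x8 model_vextq_u16(u16x8 a, u16x8 b, int n)
{
	u16x8 r;
	int i;

	assert(n >= 0 && n <= 7);
	for (i = 0; i < 8; i++)
		r.lane[i] = (i + n < 8) ? a.lane[i + n] : b.lane[i + n - 8];
	return r;
}

/* Models vld1q_u16(p): loads 8 consecutive 16-bit values. */
static u16x8 model_vld1q_u16(const uint16_t *p)
{
	u16x8 r;
	int i;

	for (i = 0; i < 8; i++)
		r.lane[i] = p[i];
	return r;
}

/* Old in-loop update: rotates dp1 by FWDSTEP - 1 = 3 lanes. */
static u16x8 dp1_update_old(u16x8 dp1)
{
	return model_vextq_u16(dp1, dp1, 3);
}

/* Patched in-loop update: shifts dp2 by 3 lanes, filling with zeros. */
static u16x8 dp1_update_new(u16x8 dp2)
{
	const u16x8 zero = { {0} };

	return model_vextq_u16(dp2, zero, 3);
}

/* Patched tail: dp1 shifted by one lane, then lane 3 copies lane 2. */
static u16x8 dp2_tail_new(u16x8 dp1)
{
	const u16x8 zero = { {0} };
	u16x8 dp2 = model_vextq_u16(dp1, zero, 1);

	dp2.lane[3] = dp2.lane[2];
	return dp2;
}
```

With dst_port[] = {a,...,i}, dp1 = <a,...,h> and dp2 = <b,...,i>, the
old update leaves <d,e,f,g> in the low four lanes of dp1, while the
patched update leaves <e,f,g,h>. The patched tail then produces
<f,g,h,h>, which matches the <w,x,y,y> pattern described above.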

Fixes: 569b290cdb36 ("examples/l3fwd: add NEON implementation")

Signed-off-by: Guduri Prathyusha <gprathyusha at caviumnetworks.com>
---
 examples/l3fwd/l3fwd_neon.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/l3fwd/l3fwd_neon.h b/examples/l3fwd/l3fwd_neon.h
index 42d50d3c2..1eace4e03 100644
--- a/examples/l3fwd/l3fwd_neon.h
+++ b/examples/l3fwd/l3fwd_neon.h
@@ -192,13 +192,13 @@ send_packets_multi(struct lcore_conf *qconf, struct rte_mbuf **pkts_burst,
 			 * dp1:
 			 * <d[j], d[j+1], d[j+2], d[j+3], ... >
 			 */
-			dp1 = vextq_u16(dp1, dp1, FWDSTEP - 1);
+			dp1 = vextq_u16(dp2, vdupq_n_u16(0), FWDSTEP - 1);
 		}

 		/*
 		 * dp2: <d[j-3], d[j-2], d[j-1], d[j-1], ... >
 		 */
-		dp2 = vextq_u16(dp1, dp1, 1);
+		dp2 = vextq_u16(dp1, vdupq_n_u16(0), 1);
 		dp2 = vsetq_lane_u16(vgetq_lane_u16(dp2, 2), dp2, 3);
 		lp  = port_groupx4(&pnum[j - FWDSTEP], lp, dp1, dp2);

--
2.14.1


