[dpdk-dev] [PATCH 1/2] net/i40e: desc loading is unnecessarily ordered for aarch64

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Thu Aug 29 00:09:54 CEST 2019


Thanks Gavin, a few comments are inline.

> -----Original Message-----
> From: Gavin Hu <gavin.hu at arm.com>
> Sent: Tuesday, August 13, 2019 5:44 AM
> To: dev at dpdk.org
> Cc: nd <nd at arm.com>; thomas at monjalon.net; jerinj at marvell.com;
> pbhagavatula at marvell.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; qi.z.zhang at intel.com;
> bruce.richardson at intel.com; stable at dpdk.org
> Subject: [PATCH 1/2] net/i40e: desc loading is unnecessarily ordered for
> aarch64
> 
> For x86, the descriptors need to be loaded in order, so in between two
> descriptor loads, there is a compiler barrier in place.
IMO, we can skip the above, as this change applies only to Arm platforms. Instead, capture this in a comment in the code that explains why ordering of the loads is not required. This will help others reading the code.

> [1] For aarch64, a
> patch [2] is in place to survive discontinuous DD bits, so the barriers can be
> removed to take full advantage of out-of-order execution.
> 
> 50% performance gain in the RFC2544 NDR test was measured on ThunderX2.
> 12.50% performance gain in the RFC2544 NDR test was measured on Ampere
> eMAG80 platform.
> 
> [1] http://inbox.dpdk.org/users/039ED4275CED7440929022BC67E7061153D71548@SHSMSX105.ccr.corp.intel.com/
> [2] https://mails.dpdk.org/archives/stable/2017-October/003324.html
> 
> Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
> Cc: stable at dpdk.org
> 
> Signed-off-by: Gavin Hu <gavin.hu at arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
> Reviewed-by: Steve Capper <steve.capper at arm.com>
> ---
>  drivers/net/i40e/i40e_rxtx_vec_neon.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> index 83572ef..5555e9b 100644
> --- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> +++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> @@ -285,7 +285,6 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  		/* Read desc statuses backwards to avoid race condition */
>  		/* A.1 load 4 pkts desc */
>  		descs[3] =  vld1q_u64((uint64_t *)(rxdp + 3));
> -		rte_rmb();
> 
>  		/* B.2 copy 2 mbuf point into rx_pkts  */
>  		vst1q_u64((uint64_t *)&rx_pkts[pos], mbp1);
> --
> 2.7.4
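
To illustrate why the loads need not be ordered once discontinuous DD bits are tolerated, here is a minimal sketch in portable C. It is not the driver's code: the helper name count_contiguous_dd, the lane layout (one 16-bit lane per descriptor, DD flag in bit 0 of each lane) and the use of the GCC/Clang builtin __builtin_ctzll are all assumptions made for illustration. The idea is that only the contiguous run of completed descriptors starting at the first one is counted, so a later descriptor that appears done before an earlier one is simply ignored until the next poll.

```c
#include <stdint.h>

/* Hypothetical layout: four 16-bit lanes, lane i belongs to descriptor i,
 * and bit 0 of each lane is that descriptor's DD (descriptor done) flag. */
#define DD_LANE_MASK 0x0001000100010001ULL
#define LANE_BITS    16

/* Hypothetical helper: return how many descriptors, starting from
 * descriptor 0, have DD set without a gap. */
static unsigned int
count_contiguous_dd(uint64_t staterr)
{
	/* A set bit in 'missing' marks a descriptor that is not done yet. */
	uint64_t missing = ~staterr & DD_LANE_MASK;

	/* __builtin_ctzll is undefined for 0, so handle "all done" first. */
	if (missing == 0)
		return 4;

	/* The lowest not-done lane bounds the contiguous prefix. */
	return (unsigned int)__builtin_ctzll(missing) / LANE_BITS;
}
```

For example, if descriptors 0 and 2 show DD set but descriptor 1 does not, the helper returns 1, and descriptors 1 and 2 are picked up by a later call once descriptor 1 is observed as done.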

