[dpdk-dev] [PATCH 2/2] net/i40e: remove compiler barrier for aarch64

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Thu Aug 29 00:48:31 CEST 2019


> 
> As the packet length extraction code was simplified, the ordering is no
> longer necessary. [1]
IMO, there is no relationship between the compiler barrier and [1], at least on Arm platforms. I suggest we just say 'there is no reason for the compiler barrier'.
I think this compiler barrier is not required on x86/PPC either.
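
To illustrate the point, here is a minimal sketch, not the driver code. It assumes a GCC/Clang-style toolchain; compiler_barrier(), sum_with_barrier(), sum_without_barrier() and check_equivalence() are hypothetical names made up for this example. The macro mirrors the usual empty-asm-with-memory-clobber form of a compiler barrier: it emits no instruction and only prevents the compiler from moving memory accesses across it. When nothing else writes the data between the loads, both variants compute the same result, and the barrier only limits how the compiler may schedule them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a compiler barrier: an empty asm statement
 * with a "memory" clobber. It generates no instruction and gives no
 * ordering guarantee at the CPU level; it only constrains the compiler. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical helper: loads two fields, places a barrier, then loads
 * two more. The compiler may not hoist the later loads above the barrier. */
static uint32_t sum_with_barrier(const uint32_t d[4])
{
	uint32_t a = d[0];
	uint32_t b = d[1];

	compiler_barrier();

	uint32_t c = d[2];
	uint32_t e = d[3];

	return a + b + c + e;
}

/* Same computation without the barrier: the compiler is free to
 * schedule all four loads in whatever order it finds best. */
static uint32_t sum_without_barrier(const uint32_t d[4])
{
	return d[0] + d[1] + d[2] + d[3];
}

/* With no concurrent writer to d[], both variants return the same value. */
static void check_equivalence(void)
{
	const uint32_t d[4] = { 1, 2, 3, 4 };

	assert(sum_with_barrier(d) == sum_without_barrier(d));
}
```

Calling check_equivalence() from a test program passes silently. The only difference between the two helpers is the scheduling freedom left to the compiler, which is consistent with a barrier that serves no ordering purpose costing some performance.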

> 
> 2% performance gain was measured on Marvell ThunderX2.
> 4.3% performance gain was measured on Ampere eMAG80.
> 
> [1] http://mails.dpdk.org/archives/dev/2016-April/037529.html
> 
> Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
> Cc: stable at dpdk.org
> 
> Signed-off-by: Gavin Hu <gavin.hu at arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
> Reviewed-by: Steve Capper <steve.capper at arm.com>
> ---
>  drivers/net/i40e/i40e_rxtx_vec_neon.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> index 5555e9b..864eb9a 100644
> --- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> +++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> @@ -307,9 +307,6 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  			rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
>  		}
> 
> -		/* avoid compiler reorder optimization */
> -		rte_compiler_barrier();
> -
>  		/* pkt 3,4 shift the pktlen field to be 16-bit aligned*/
>  		uint32x4_t len3 = vshlq_u32(vreinterpretq_u32_u64(descs[3]),
>  					    len_shl);
> --
> 2.7.4


