[dpdk-dev] [RFC] eal: adjust barriers for IO on Armv8-a

Ruifeng Wang Ruifeng.Wang at arm.com
Tue May 12 08:18:43 CEST 2020


> -----Original Message-----
> From: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> Sent: Tuesday, May 12, 2020 2:07 AM
> To: dev at dpdk.org; jerinj at marvell.com; hemant.agrawal at nxp.com; Ajit
> Khaparde (ajit.khaparde at broadcom.com) <ajit.khaparde at broadcom.com>;
> igorch at amazon.com; thomas at monjalon.net; viacheslavo at mellanox.com;
> arybchenko at solarflare.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>
> Cc: Ruifeng Wang <Ruifeng.Wang at arm.com>; nd <nd at arm.com>
> Subject: [RFC] eal: adjust barriers for IO on Armv8-a
> 
> Change the barrier APIs for IO to reflect that Armv8-a has an
> other-multi-copy atomic memory model.
> 
> The Armv8-a memory model has been strengthened to require other-multi-copy
> atomicity. This property requires memory accesses from an observer to
> become visible to all other observers simultaneously [3]. This means:
> 
> a) A write arriving at an endpoint shared between multiple CPUs is
>    visible to all CPUs
> b) A write that is visible to all CPUs is also visible to all other
>    observers in the shareability domain
> 
> This allows the cheaper DMB instructions to be used in place of DSB for
> devices that are visible to all CPUs (i.e. devices that DPDK caters to).
> 
> Please refer to [1], [2] and [3] for more information.
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=22ec71615d824f4f11d38d0e55a88d8956b7e45f
> [2] https://www.youtube.com/watch?v=i6DayghhA8Q
> [3] https://www.cl.cam.ac.uk/~pes20/armv8-mca/
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> ---
>  lib/librte_eal/arm/include/rte_atomic_64.h | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eal/arm/include/rte_atomic_64.h b/lib/librte_eal/arm/include/rte_atomic_64.h
> index 7b7099cdc..e406411bb 100644
> --- a/lib/librte_eal/arm/include/rte_atomic_64.h
> +++ b/lib/librte_eal/arm/include/rte_atomic_64.h
> @@ -19,11 +19,11 @@ extern "C" {
>  #include <rte_compat.h>
>  #include <rte_debug.h>
> 
> -#define rte_mb() asm volatile("dsb sy" : : : "memory")
> +#define rte_mb() asm volatile("dmb osh" : : : "memory")
> 
> -#define rte_wmb() asm volatile("dsb st" : : : "memory")
> +#define rte_wmb() asm volatile("dmb oshst" : : : "memory")
> 
> -#define rte_rmb() asm volatile("dsb ld" : : : "memory")
> +#define rte_rmb() asm volatile("dmb oshld" : : : "memory")
> 
>  #define rte_smp_mb() asm volatile("dmb ish" : : : "memory")
> 
> @@ -37,9 +37,9 @@ extern "C" {
> 
>  #define rte_io_rmb() rte_rmb()
> 
> -#define rte_cio_wmb() asm volatile("dmb oshst" : : : "memory")
> +#define rte_cio_wmb() rte_wmb()
> 
> -#define rte_cio_rmb() asm volatile("dmb oshld" : : : "memory")
> +#define rte_cio_rmb() rte_rmb()
> 
>  /*------------------------ 128 bit atomic operations -------------------------*/
> 
> --
> 2.17.1

This change showed a performance gain of about 7% in the testpmd single-core NDR test.
Tested-by: Ruifeng Wang <ruifeng.wang at arm.com>
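
For context, here is a minimal sketch of the kind of datapath sequence where
these barriers sit: a descriptor is filled in and then a doorbell is written,
and on completion a done flag is checked before the rest of the descriptor is
read. This is not DPDK code. Every sketch_* name, the descriptor layout and
the doorbell field are hypothetical, and the compiler-fence fallback for
non-AArch64 builds is an assumption added only so the sketch compiles
elsewhere. On AArch64 the two macros use the DMB encodings that the patch
assigns to rte_wmb() and rte_rmb().

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for rte_wmb()/rte_rmb() as defined by the patch.
 * The non-AArch64 branch is an assumption so the sketch builds anywhere. */
#if defined(__aarch64__)
#define sketch_wmb() __asm__ volatile("dmb oshst" : : : "memory")
#define sketch_rmb() __asm__ volatile("dmb oshld" : : : "memory")
#else
#define sketch_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#define sketch_rmb() __atomic_thread_fence(__ATOMIC_ACQUIRE)
#endif

/* Hypothetical descriptor; a real PMD defines its own layout. */
struct sketch_desc {
	uint64_t addr;
	uint32_t len;
	uint32_t done;	/* set by the device on completion */
};

/* Hypothetical device state; doorbell stands in for an MMIO register. */
struct sketch_dev {
	struct sketch_desc ring[4];
	volatile uint32_t doorbell;
};

/* Producer side: the descriptor stores must be ordered before the
 * doorbell store that tells the device to fetch the descriptor. */
void sketch_post(struct sketch_dev *dev, uint32_t idx,
		 uint64_t addr, uint32_t len)
{
	struct sketch_desc *d = &dev->ring[idx];

	d->addr = addr;
	d->len = len;
	d->done = 0;
	sketch_wmb();			/* descriptor before doorbell */
	dev->doorbell = idx + 1;
}

/* Consumer side: the done flag load must be ordered before the loads
 * of the remaining descriptor fields. Returns 1 if completed. */
int sketch_poll(struct sketch_dev *dev, uint32_t idx, uint32_t *len)
{
	volatile struct sketch_desc *d = &dev->ring[idx];

	if (!d->done)
		return 0;
	sketch_rmb();			/* done flag before payload fields */
	*len = d->len;
	return 1;
}
```

With the patch applied, both barriers in a sequence like this become DMB
instead of DSB, which is where a per-packet saving would come from. Calling
sketch_post() followed by sketch_poll() on a zero-initialised struct
sketch_dev exercises both paths once the done flag is set by hand.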


