[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture

Chao CH Zhu bjzhuc at cn.ibm.com
Thu Oct 16 05:14:02 CEST 2014


Konstantin,

In my understanding, compiler barrier is a kind of software barrier which 
prevents the compiler from moving memory accesses across the barrier. This 
should be architecture-independent. And the "sync" instruction is a 
hardware barrier which depends on PowerPC architecture. So I think the 
compiler barrier should be the same on x86 and PowerPC. Any comments? 
Please correct me if I was wrong.

Thanks a lot! 
 
Best Regards!
------------------------------
Chao Zhu 




From:   "Ananyev, Konstantin" <konstantin.ananyev at intel.com>
To:     Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" <dev at dpdk.org>
Date:   2014/10/16 08:38
Subject:        RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM 
Power   architecture




Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu
> Sent: Friday, September 26, 2014 10:36 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power 
architecture
> 
> The atomic operations implemented with assembly code in DPDK only
> support x86. This patch add architecture specific atomic operations for
> IBM Power architecture.
> 
> Signed-off-by: Chao Zhu <bjzhuc at cn.ibm.com>
> ---
>  .../common/include/powerpc/arch/rte_atomic.h       |  387 
++++++++++++++++++++
>  .../common/include/powerpc/arch/rte_atomic_arch.h  |  318 
++++++++++++++++
>  2 files changed, 705 insertions(+), 0 deletions(-)
>  create mode 100644 
lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
>  create mode 100644 
lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> 
...
> +
> diff --git 
a/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> b/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> new file mode 100644
> index 0000000..fe5666e
> --- /dev/null
> +
...
>+#define                rte_arch_rmb() asm volatile("sync" : : : 
"memory")
>+
> +#define               rte_arch_compiler_barrier() do {        \
> +              asm volatile ("" : : : "memory");               \
> +} while(0)

I don't know much about PPC architecture, but as I remember it uses a 
weakly-ordering memory model.
Is that correct?
If so, then you probably need rte_arch_compiler_barrier() to be "sync" 
instruction (like mb()s above) .
The reason is that IA has much stronger memory ordering model and there 
are a lot of places in the code where it implies that  ordering.
For example - ring enqueue/dequeue functions. 

Konstantin




More information about the dev mailing list