[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
Chao CH Zhu
bjzhuc at cn.ibm.com
Thu Oct 16 05:14:02 CEST 2014
Konstantin,
As I understand it, a compiler barrier is a software barrier that only
prevents the compiler from moving memory accesses across it. That part
should be architecture-independent. The "sync" instruction, by contrast, is
a hardware barrier that is specific to the PowerPC architecture. So I think
the compiler barrier should be the same on x86 and PowerPC. Any comments?
Please correct me if I am wrong.
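To make the distinction concrete, here is a minimal sketch, assuming a
GCC-compatible compiler. All names with the example_ prefix are hypothetical
and are not part of the patch. It only shows that the compiler barrier has
the same definition on every architecture, while the hardware barrier has to
be chosen per architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compiler-only barrier: emits no instruction, so the definition is the
 * same on x86 and PowerPC. It only stops the compiler from moving memory
 * accesses across this point. */
#define example_compiler_barrier() __asm__ volatile ("" : : : "memory")

/* Hardware full barrier (hypothetical name): the instruction differs per
 * architecture. The fallback uses the GCC builtin __sync_synchronize(). */
#if defined(__powerpc__) || defined(__powerpc64__)
#define example_hw_barrier() __asm__ volatile ("sync" : : : "memory")
#elif defined(__x86_64__)
#define example_hw_barrier() __asm__ volatile ("mfence" : : : "memory")
#else
#define example_hw_barrier() __sync_synchronize()
#endif

static volatile uint32_t example_data;
static volatile uint32_t example_flag;

/* Writer: store the payload, then raise the flag. With only
 * example_compiler_barrier() here the compiler would keep program order,
 * but a weakly ordered CPU could still make the flag store visible first. */
static void example_publish(uint32_t value)
{
    example_data = value;
    example_hw_barrier();
    example_flag = 1;
}

/* Reader: returns 1 and fills *out once the flag has been observed. */
static int example_try_read(uint32_t *out)
{
    assert(out != NULL);
    if (example_flag == 0)
        return 0;
    example_hw_barrier();
    *out = example_data;
    return 1;
}
```

A lighter barrier than "sync" may be enough for this store-store case on
Power; the sketch uses "sync" only because that is the instruction used in
the patch.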
Thanks a lot!
Best Regards!
------------------------------
Chao Zhu
From: "Ananyev, Konstantin" <konstantin.ananyev at intel.com>
To: Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" <dev at dpdk.org>
Date: 2014/10/16 08:38
Subject: RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM
Power architecture
Hi,
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu
> Sent: Friday, September 26, 2014 10:36 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
>
> The atomic operations implemented with assembly code in DPDK only
> support x86. This patch adds architecture specific atomic operations for
> IBM Power architecture.
>
> Signed-off-by: Chao Zhu <bjzhuc at cn.ibm.com>
> ---
> .../common/include/powerpc/arch/rte_atomic.h      | 387 ++++++++++++++++++++
> .../common/include/powerpc/arch/rte_atomic_arch.h | 318 ++++++++++++++++
> 2 files changed, 705 insertions(+), 0 deletions(-)
> create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
> create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
>
...
> +
> diff --git a/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h b/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> new file mode 100644
> index 0000000..fe5666e
> --- /dev/null
> +
...
> +#define rte_arch_rmb() asm volatile("sync" : : : "memory")
> +
> +#define rte_arch_compiler_barrier() do { \
> + asm volatile ("" : : : "memory"); \
> +} while(0)
I don't know much about the PPC architecture, but as I remember it uses a
weakly ordered memory model.
Is that correct?
If so, then rte_arch_compiler_barrier() probably needs to be a "sync"
instruction (like the mb()s above).
The reason is that IA has a much stronger memory ordering model, and there
are a lot of places in the code that implicitly rely on that ordering.
For example, the ring enqueue/dequeue functions.
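To illustrate the kind of ordering dependency meant here, below is a rough
single-producer/single-consumer sketch. It is not the DPDK ring code: the
example_ names are hypothetical, and it uses C11 atomics instead of the
macros from the patch. On IA the release store is typically a plain store,
so a compiler barrier is enough there; on a weakly ordered CPU the same
point needs a real barrier instruction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_RING_SIZE 8u /* must be a power of two */

struct example_ring {
    void *slots[EXAMPLE_RING_SIZE];
    _Atomic uint32_t head; /* next slot the producer will write */
    _Atomic uint32_t tail; /* next slot the consumer will read */
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
static int example_enqueue(struct example_ring *r, void *obj)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    assert(obj != NULL);
    if (head - tail == EXAMPLE_RING_SIZE)
        return -1;
    r->slots[head & (EXAMPLE_RING_SIZE - 1)] = obj;
    /* The slot store must become visible before the new head value.
     * This is the point where a compiler-only barrier is not sufficient
     * on a weakly ordered architecture. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}

/* Consumer side: returns 0 on success, -1 if the ring is empty. */
static int example_dequeue(struct example_ring *r, void **obj)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* Acquire pairs with the producer's release, so the slot read below
     * cannot observe stale contents. */
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return -1;
    *obj = r->slots[tail & (EXAMPLE_RING_SIZE - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}
```

If the release/acquire pair were replaced by a compiler-only barrier, this
would probably still behave correctly on IA, but the consumer could read an
unwritten slot on a weakly ordered CPU.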
Konstantin