[dpdk-dev] [PATCH v4 4/4] eal/atomic: add wrapper for c11 atomics

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Wed May 13 17:06:15 CEST 2020


> > <snip>
> >
> > Subject: Re: [PATCH v4 4/4] eal/atomic: add wrapper for c11 atomics
> >
> > On Tue, May 12, 2020 at 4:03 pm, Phil Yang <mailto:phil.yang at arm.com>
> wrote:
> >
> > parameter. Signed-off-by: Phil Yang <mailto:phil.yang at arm.com>
> >
> >
> > What is the purpose of having rte_atomic at all?
> > Is this level of indirection really helping?
> > [HONNAPPA] (not sure why this email has html format, converted to text
> > format) I believe you meant, why not use the __atomic_xxx built-ins
> > directly? The only reason for now is handling of
> > __atomic_thread_fence(__ATOMIC_SEQ_CST) for x86. This is equivalent to
> rte_smp_mb which has an optimized implementation for x86.
> > According to Konstantin, the compiler does not generate optimal code.
> Wrapping that built-in alone is going to be confusing.
> >
> > The wrappers also allow us to have our own implementation using inline
> > assembly for compilers versions that do not support C11 atomic built-ins.
> But, I do not know if there is a need to support those versions.
> 
> Few thoughts from my side about that patch:
> Yes, for __atomic_thread_fence(__ATOMIC_SEQ_CST) generates full 'mfence'
> which is quite expensive, and can be a avoided for SMP case.
> Though I don't see why we need to create our own wrappers  for *all*
> __atomic buitins.
> From my perspective it would be sufficient to just introduce few of them:
> rte_thread_fence_XXX (where XXX - supported memory-orders: RELEASE,
> ACUIQRE, SEQ_CST, etc.).
> For all other __atomic built-ins I don't see any problem to use them directly,
> without introducing any wrappers around.
I am all for not doing wrappers for the sake of doing. Here, we were concerned about the uniformity of the code, hence did the wrappers for all. Does, anyone have any concerns with doing the wrappers only for __atomic_thread_fence?

Is there any possibility that the compiler will change in the future to generate the optimized code for x86?

For the API, we already have 'rte_atomic128_cmp_exchange' implemented with C11 semantics, I suggest we keep this one also on the same lines. This would require the memory order to be a parameter.

> 
> As a side note, this patch implements rte_atomic_thread_fence() as a simple
> wrapper around __atomic_thread_fence(), so concern mentioned above is not
> addressed.
Agreed. So, we will just pick the implementation of rte_smp_mb for x86 for this.


More information about the dev mailing list