[dpdk-dev] [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare exchange

Phil Yang (Arm Technology China) Phil.Yang at arm.com
Wed Oct 16 11:04:05 CEST 2019


> -----Original Message-----
> From: David Marchand <david.marchand at redhat.com>
> Sent: Tuesday, October 15, 2019 8:16 PM
> To: Phil Yang (Arm Technology China) <Phil.Yang at arm.com>
> Cc: thomas at monjalon.net; jerinj at marvell.com; Gage Eads
> <gage.eads at intel.com>; dev <dev at dpdk.org>; hemant.agrawal at nxp.com;
> Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>; Gavin Hu (Arm
> Technology China) <Gavin.Hu at arm.com>; nd <nd at arm.com>
> Subject: Re: [dpdk-dev] [PATCH v9 1/3] eal/arm64: add 128-bit atomic
> compare exchange
> 
> On Tue, Oct 15, 2019 at 1:32 PM Phil Yang (Arm Technology China)
> <Phil.Yang at arm.com> wrote:
> > > -----Original Message-----
> > > From: David Marchand <david.marchand at redhat.com>
> > > If LSE is available, we expose __rte_cas_XX (explicitely) *non*
> > > inlined functions, while without LSE, we expose inlined __rte_ldr_XX
> > > and __rte_stx_XX functions.
> > > So we have a first disparity with non-inlined vs inlined functions
> > > depending on a #ifdef.
> 
> You did not comment on the inline / no inline part and I still see
> this in the v10.
> Is this __rte_noinline on the CAS function intentional?

Apologize for missing this item. Yes, it is to avoid ABI break.
Please check
5b40ec6b966260e0ff66a8a2c689664f75d6a0e6 ("mempool/octeontx2: fix possible arm64 ABI break")

> 
> 
> > > Then, we have a second disparity with two sets of "apis" depending on
> > > this #ifdef.
> > >
> > > And we expose those sets with a rte_ prefix, meaning people will try
> > > to use them, but those are not part of a public api.
> > >
> > > Can't we do without them ? (see below [2] for a proposal with ldr/stx,
> > > cas should be the same)
> >
> > No, it doesn't work.
> > Because we need to verify the return value at the end of the loop for these
> macros.
> 
> Do you mean the return value for the stores?

It is my bad. I missed the ret option in the macro. This approach works.

However, I suggest to keep them as static inline functions rather than a piece of macro in the rte_atomic128_cmp_exchange API.
One reason is APIs name can indicate the memory ordering of these operations.
Moreover, it uses the register type to pass the value in the inline function, so it should not have too much cost comparing with the macro.
I also think these 128bit load and store functions can be used in other places, once it has been proved valuable in rte_atomic128_cmp_exchange API. But let's keep them private for the current stage.
BTW, Linux kernel implemented in the same way. https://github.com/torvalds/linux/blob/master/arch/arm64/include/asm/atomic_lse.h#L19 
 
> > > #define __STORE_128(op_string, dst, val, ret) \
> > >     asm volatile(                        \
> > >         op_string " %w0, %1, %2, %3"     \
> > >         : "=&r" (ret)                    \
> > >         : "r" (val.val[0]),              \
> > >           "r" (val.val[1]),              \
> > >           "Q" (dst->val[0])              \
> > >         : "memory")
> 
> The ret variable is still passed in this macro and the while loop can
> check it later.
> 
> 
> > > > diff --git a/lib/librte_eal/common/include/generic/rte_atomic.h
> > > b/lib/librte_eal/common/include/generic/rte_atomic.h
> > > > index 24ff7dc..e6ab15a 100644
> > > > --- a/lib/librte_eal/common/include/generic/rte_atomic.h
> > > > +++ b/lib/librte_eal/common/include/generic/rte_atomic.h
> > > > @@ -1081,6 +1081,20 @@ static inline void
> > > rte_atomic64_clear(rte_atomic64_t *v)
> > > >
> > > >  /*------------------------ 128 bit atomic operations -------------------------*/
> > > >
> > > > +/**
> > > > + * 128-bit integer structure.
> > > > + */
> > > > +RTE_STD_C11
> > > > +typedef struct {
> > > > +       RTE_STD_C11
> > > > +       union {
> > > > +               uint64_t val[2];
> > > > +#ifdef RTE_ARCH_64
> > > > +               __extension__ __int128 int128;
> > > > +#endif
> > >
> > > You hid this field for x86.
> > > What is the reason?
> > No, we are not hid it for x86. The RTE_ARCH_64 flag covered x86 as well.
> 
> Ah indeed, I read it wrong, ARCH_64 ... AARCH64 ... :-)
> 
> 
> 
> --
> David Marchand



More information about the dev mailing list