[dpdk-dev] [EXT] [PATCH v8 3/3] spinlock: reimplement with atomic one-way barrier builtins

Gavin Hu (Arm Technology China) Gavin.Hu at arm.com
Thu Mar 14 03:36:40 CET 2019



> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> Sent: Thursday, March 14, 2019 8:31 AM
> To: jerinj at marvell.com; Gavin Hu (Arm Technology China)
> <Gavin.Hu at arm.com>; dev at dpdk.org
> Cc: i.maximets at samsung.com; chaozhu at linux.vnet.ibm.com; nd
> <nd at arm.com>; Nipun.gupta at nxp.com; thomas at monjalon.net;
> hemant.agrawal at nxp.com; stable at dpdk.org; nd <nd at arm.com>
> Subject: RE: [EXT] [PATCH v8 3/3] spinlock: reimplement with atomic one-
> way barrier builtins
> 
> > > ----------------------------------------------------------------------
> > > The __sync builtin based implementation generates full memory barriers
> > > ('dmb ish') on Arm platforms. Using C11 atomic builtins to generate
> > > one way barriers.
> > >
> > >
> > >  lib/librte_eal/common/include/generic/rte_spinlock.h | 18
> > > +++++++++++++-----
> > >  1 file changed, 13 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h
> > > b/lib/librte_eal/common/include/generic/rte_spinlock.h
> > > index c4c3fc3..87ae7a4 100644
> > > --- a/lib/librte_eal/common/include/generic/rte_spinlock.h
> > > +++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
> > > @@ -61,9 +61,14 @@ rte_spinlock_lock(rte_spinlock_t *sl);
> > >  static inline void
> > >  rte_spinlock_lock(rte_spinlock_t *sl)
> > >  {
> > > -	while (__sync_lock_test_and_set(&sl->locked, 1))
> > > -		while(sl->locked)
> > > +	int exp = 0;
> > > +
> > > +	while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
> > > +				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
> >
> > Would it be cleaner to use __atomic_test_and_set() to avoid the explicit exp = 0?
> We addressed it here:
> http://mails.dpdk.org/archives/dev/2019-January/122363.html
__atomic_test_and_set causes a 10x performance degradation in our
micro-benchmarking on ThunderX2. The reason is explained here:
http://mails.dpdk.org/archives/dev/2019-January/123340.html
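
For reference, a minimal standalone sketch of the compare-exchange form
used in the patch, assuming the GCC/Clang __atomic builtins. The names
demo_spinlock_t and demo_pause() are hypothetical stand-ins for
rte_spinlock_t and rte_pause(); they are not the DPDK definitions, and
the sketch makes no claim about performance on any platform.

```c
/* Hypothetical stand-in for rte_spinlock_t; not the DPDK definition. */
typedef struct {
	volatile int locked; /* 0 = free, 1 = held */
} demo_spinlock_t;

/* Hypothetical stand-in for rte_pause(): only a compiler barrier here. */
static inline void demo_pause(void)
{
	__asm__ volatile("" ::: "memory");
}

static inline void demo_spinlock_lock(demo_spinlock_t *sl)
{
	int exp = 0;

	/* Strong CAS (weak = 0): acquire on success, relaxed on failure. */
	while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
		/* Spin on a relaxed load until the lock looks free. */
		while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
			demo_pause();
		/* A failed CAS wrote the observed value into exp; reset it. */
		exp = 0;
	}
}

static inline int demo_spinlock_trylock(demo_spinlock_t *sl)
{
	int exp = 0;

	/* Single strong CAS attempt; returns 1 if the lock was taken. */
	return __atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

static inline void demo_spinlock_unlock(demo_spinlock_t *sl)
{
	/* Release store pairs with the acquire in lock/trylock. */
	__atomic_store_n(&sl->locked, 0, __ATOMIC_RELEASE);
}
```

The explicit "exp = 0" inside the loop is needed because a failed
compare-exchange overwrites exp with the value it found in sl->locked.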
> 
> >
> >
> > > +		while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
> > >  			rte_pause();
> > > +		exp = 0;
> > > +	}
> > >  }
> > >  #endif
> > >
> > > @@ -80,7 +85,7 @@ rte_spinlock_unlock (rte_spinlock_t *sl);
> > >  static inline void
> > >  rte_spinlock_unlock (rte_spinlock_t *sl)
> > >  {
> > > -	__sync_lock_release(&sl->locked);
> > > +	__atomic_store_n(&sl->locked, 0, __ATOMIC_RELEASE);
> >
> > __atomic_clear(.., __ATOMIC_RELEASE) looks cleaner to me.
> This needs the operand to be of type bool.
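
To illustrate the type constraint, here is a small sketch assuming the
GCC/Clang builtins. The GCC documentation says __atomic_clear should
only be used on bool or char operands, together with
__atomic_test_and_set. The types demo_flag_lock_t and demo_int_lock_t
are hypothetical and exist only for this example; the second one mimics
an int lock word such as the one the patch operates on.

```c
#include <stdbool.h>

/* Hypothetical bool-based lock: the case __atomic_clear is meant for. */
typedef struct {
	bool flag;
} demo_flag_lock_t;

static inline int demo_flag_trylock(demo_flag_lock_t *l)
{
	/* test_and_set returns the previous state; clear means we got it. */
	return !__atomic_test_and_set(&l->flag, __ATOMIC_ACQUIRE);
}

static inline void demo_flag_unlock(demo_flag_lock_t *l)
{
	__atomic_clear(&l->flag, __ATOMIC_RELEASE);
}

/* Hypothetical int-based lock word: use a release store instead. */
typedef struct {
	volatile int locked;
} demo_int_lock_t;

static inline void demo_int_unlock(demo_int_lock_t *l)
{
	__atomic_store_n(&l->locked, 0, __ATOMIC_RELEASE);
}
```

With an int lock word, __atomic_store_n(..., 0, __ATOMIC_RELEASE) gives
the same release semantics without changing the type of the field.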
> 
> >
> > >  }
> > >  #endif
> > >
> > > @@ -99,7 +104,10 @@ rte_spinlock_trylock (rte_spinlock_t *sl);
> > >  static inline int
> > >  rte_spinlock_trylock (rte_spinlock_t *sl)
> > >  {
> > > -	return __sync_lock_test_and_set(&sl->locked,1) == 0;
> > > +	int exp = 0;
> > > +	return __atomic_compare_exchange_n(&sl->locked, &exp, 1,
> > > +				0, /* disallow spurious failure */
> > > +				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
> >
> > return (__atomic_test_and_set(.., __ATOMIC_ACQUIRE) == 0) would be a
> > cleaner version.
> >
> > >  }
> > >  #endif
> > >
> > > @@ -113,7 +121,7 @@ rte_spinlock_trylock (rte_spinlock_t *sl)
> > >   */
> > >  static inline int rte_spinlock_is_locked (rte_spinlock_t *sl)
> > >  {
> > > -	return sl->locked;
> > > +	return __atomic_load_n(&sl->locked, __ATOMIC_ACQUIRE);
> >
> > Will __ATOMIC_RELAXED be sufficient?
> This is also addressed here:
> http://mails.dpdk.org/archives/dev/2019-January/122363.html
> 
> I think you approved the patch here:
> http://mails.dpdk.org/archives/dev/2019-January/123238.html
> I think this patch just needs your reviewed-by tag :)
> 
> >
> >
> > >  }
> > >
> > >  /**

