[dpdk-dev] [EXT] [PATCH v3 6/6] spinlock: ticket based to improve fairness

Gavin Hu (Arm Technology China) Gavin.Hu at arm.com
Thu Dec 27 11:05:36 CET 2018



> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj at marvell.com>
> Sent: Thursday, December 27, 2018 2:58 PM
> To: Gavin Hu (Arm Technology China) <Gavin.Hu at arm.com>; dev at dpdk.org
> Cc: david.marchand at redhat.com; chaozhu at linux.vnet.ibm.com; nd
> <nd at arm.com>; bruce.richardson at intel.com; thomas at monjalon.net; Joyce
> Kong (Arm Technology China) <Joyce.Kong at arm.com>;
> hemant.agrawal at nxp.com; stephen at networkplumber.org; Honnappa
> Nagarahalli <Honnappa.Nagarahalli at arm.com>
> Subject: Re: [EXT] [PATCH v3 6/6] spinlock: ticket based to improve fairness
> 
> On Thu, 2018-12-27 at 12:13 +0800, Gavin Hu wrote:
> > ----------------------------------------------------------------------
> > From: Joyce Kong <joyce.kong at arm.com>
> >
> > The old implementation is unfair; some threads may take locks
> > aggressively
> 
> I think one issue here is that x86 and ppc follow the traditional
> spinlock while arm64 will be following the ticket lock for its spinlock
> implementation. This would change application behaviour on arm64
> compared to x86 and ppc.
> 
> How about having a separate API for the ticket lock? That would give:
> # the application a choice of locking strategy
> # the same application behaviour across all arches

OK, will do in v4: add a separately named rte_ticket_spinlock API.

> The initial ticket lock implementation can be generic with C11 memory
> primitives; later each arch can optimize it, if required.
Yes, later we might optimize it with new instructions.
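
Something along these lines, just as a rough sketch for discussion (the
type and function names here are only placeholders until v4, and the
struct layout is illustrative, not the final one):

	/* needs <stdint.h> and rte_pause.h */
	typedef struct {
		uint16_t current; /* ticket currently being served */
		uint16_t next;    /* next ticket to be handed out */
	} rte_ticket_spinlock_t;

	static inline void
	rte_ticket_spinlock_lock(rte_ticket_spinlock_t *tl)
	{
		/* Take a ticket, then wait until it is being served. */
		uint16_t me = __atomic_fetch_add(&tl->next, 1, __ATOMIC_RELAXED);

		while (__atomic_load_n(&tl->current, __ATOMIC_ACQUIRE) != me)
			rte_pause();
	}

	static inline void
	rte_ticket_spinlock_unlock(rte_ticket_spinlock_t *tl)
	{
		uint16_t i = __atomic_load_n(&tl->current, __ATOMIC_RELAXED);

		/* RELEASE here pairs with the ACQUIRE load in lock(). */
		__atomic_store_n(&tl->current, i + 1, __ATOMIC_RELEASE);
	}

With that, lock()/unlock() keep acquire/release ordering and the uint16_t
counters simply wrap around.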

> > while leaving the other threads starving for a long time. As shown in
> > the following test, within the same period of time, some threads take
> > the lock many more times than the others.
> >
> >
> >  #ifdef RTE_FORCE_INTRINSICS
> >  static inline void
> > -rte_spinlock_unlock (rte_spinlock_t *sl)
> > +rte_spinlock_unlock(rte_spinlock_t *sl)
> >  {
> > -	__atomic_store_n(&sl->locked, 0, __ATOMIC_RELEASE);
> > +	uint16_t i = __atomic_load_n(&sl->s.current, __ATOMIC_RELAXED);
> > +	i++;
> > +	__atomic_store_n(&sl->s.current, i, __ATOMIC_RELAXED);
> 
> Shouldn't we use __ATOMIC_RELEASE here to pair with lock() ?
> 
> 
> >  }
> >  #endif
> >
> > @@ -98,16 +100,19 @@ rte_spinlock_unlock (rte_spinlock_t *sl)
> >   *   1 if the lock is successfully taken; 0 otherwise.
> >   */
> >  static inline int
> > -rte_spinlock_trylock (rte_spinlock_t *sl);
> > +rte_spinlock_trylock(rte_spinlock_t *sl);
> >
> >  #ifdef RTE_FORCE_INTRINSICS
> >  static inline int
> > -rte_spinlock_trylock (rte_spinlock_t *sl)
> > +rte_spinlock_trylock(rte_spinlock_t *sl)
> >  {
> > -	int exp = 0;
> > -	return __atomic_compare_exchange_n(&sl->locked, &exp, 1,
> > -				0, /* disallow spurious failure */
> > -				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
> > +	uint16_t me = __atomic_fetch_add(&sl->s.next, 1, __ATOMIC_RELAXED);
> > +	while (__atomic_load_n(&sl->s.current, __ATOMIC_RELAXED) != me) {
> > +		__atomic_sub_fetch(&sl->s.next, 1, __ATOMIC_RELAXED);
> > +		return 0;
> > +	}
> > +
> 
> Don't we need a CAS here?
> Similar implementation here:
> https://git.linaro.org/lng/odp.git/tree/platform/linux-
> generic/include/odp/api/plat/ticketlock_inlines.h

That is correct, a CAS is required.
Assume T2 requests its ticket first (T2.next = 2) but has not taken the
lock yet (T2.current = 1). T1 then calls trylock, so T1.next = 3. T2 now
takes the lock (T2.next = T2.current = 2), and T1 falls back, decrementing
next so that both see next = 2. That is not correct: next should be 3 at
this point, so T1 will not get the lock any more.
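A CAS-based trylock along the lines of the ODP code above might look
roughly like this (only a sketch against the v3 field names, not the
actual v4 patch):

	static inline int
	rte_spinlock_trylock(rte_spinlock_t *sl)
	{
		uint16_t next = __atomic_load_n(&sl->s.next, __ATOMIC_RELAXED);
		uint16_t cur = __atomic_load_n(&sl->s.current, __ATOMIC_RELAXED);

		/* Lock is held: a ticket is already outstanding, so do not
		 * touch the counters at all. */
		if (next != cur)
			return 0;

		/* Claim ticket 'next' only if nobody grabbed it in between;
		 * on success next == current, i.e. we own the lock. */
		return __atomic_compare_exchange_n(&sl->s.next, &next, next + 1,
					0, /* disallow spurious failure */
					__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
	}

A failing trylock then never increments or decrements s.next, so the race
described above cannot happen.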
> 
> > +	return 1;
> >  }
> >  #endif
> >
> >

