[dpdk-dev] [EXT] [PATCH v3 6/6] spinlock: ticket based to improve fairness

Jerin Jacob Kollanukkaran jerinj at marvell.com
Thu Dec 27 07:58:02 CET 2018


On Thu, 2018-12-27 at 12:13 +0800, Gavin Hu wrote:
> -------------------------------------------------------------------
> ---
> From: Joyce Kong <joyce.kong at arm.com>
> 
> The old implementation is unfair; some threads may take locks
> aggressively

I think one issue here is that x86 and ppc follow the traditional
spinlock while arm64 would follow the ticket lock for its spinlock
implementation. This would change application behaviour on arm64
compared to x86 and ppc.

How about having a separate API for the ticket lock? That would give:
# the application a choice of locking strategy
# the same application behaviour across all archs

The initial ticket lock implementation can be generic with C11 memory
primitives; later, each arch can optimize it if required.

> while leaving the other threads starving for a long time. As shown in
> the following test, within the same period of time, some threads take
> locks many more times than the others.
> 
>  
>  #ifdef RTE_FORCE_INTRINSICS
>  static inline void
> -rte_spinlock_unlock (rte_spinlock_t *sl)
> +rte_spinlock_unlock(rte_spinlock_t *sl)
>  {
> -	__atomic_store_n(&sl->locked, 0, __ATOMIC_RELEASE);
> +	uint16_t i = __atomic_load_n(&sl->s.current, __ATOMIC_RELAXED);
> +	i++;
> +	__atomic_store_n(&sl->s.current, i, __ATOMIC_RELAXED);

Shouldn't we use __ATOMIC_RELEASE here to pair with lock()?


>  }
>  #endif
>  
> @@ -98,16 +100,19 @@ rte_spinlock_unlock (rte_spinlock_t *sl)
>   *   1 if the lock is successfully taken; 0 otherwise.
>   */
>  static inline int
> -rte_spinlock_trylock (rte_spinlock_t *sl);
> +rte_spinlock_trylock(rte_spinlock_t *sl);
>  
>  #ifdef RTE_FORCE_INTRINSICS
>  static inline int
> -rte_spinlock_trylock (rte_spinlock_t *sl)
> +rte_spinlock_trylock(rte_spinlock_t *sl)
>  {
> -	int exp = 0;
> -	return __atomic_compare_exchange_n(&sl->locked, &exp, 1,
> -				0, /* disallow spurious failure */
> -				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
> +	uint16_t me = __atomic_fetch_add(&sl->s.next, 1,
> __ATOMIC_RELAXED);
> +	while (__atomic_load_n(&sl->s.current, __ATOMIC_RELAXED) != me)
> {
> +		__atomic_sub_fetch(&sl->s.next, 1, __ATOMIC_RELAXED);
> +		return 0;
> +	}
> +

Shouldn't we use CAS here?
Similar implementation here:
https://git.linaro.org/lng/odp.git/tree/platform/linux-generic/include/odp/api/plat/ticketlock_inlines.h


> +	return 1;
>  }
>  #endif
>  
> 

