[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

Wodkowski, PawelX pawelx.wodkowski at intel.com
Fri Sep 26 08:33:12 CEST 2014


> Given what you said above, I agree, at least in the current implementation.  It
> still seems like theres a simpler solution that doesn't require all the
> comparative gymnastics.

Yes, there is simpler solution, but this solution involve recursive locking.
DPDK recursive spinlocks are no an option in here, so only option is posix recursive
mutex, which I think is even worst option than this gymnastics.

> 
> What if, instead of testing if you're the callback thread, we turn the executing
> field of alarm_entry into a bitfield, where bit 0 represents the former
> "executing" state, and bit 1 is defined as a "cancelled" bit.  Then
> rte_eal_alarm_cancel becomes a search that, when an alarm is found simply or's
> in the cancelled bit to the executing bit field.  When the callback thread runs,
> it skips executing any alarm that is marked as cancelled, but frees all alarm
> entries that are executed or cancelled.  That gives us a single point at which
> frees of alarm entires happen?  Something like the patch below (completely
> untested)?
> 
> It also seems like the alarm api as a whole could use some improvement.  The
> way its written right now, theres no way to refer to a specific alarm (i.e.
> cancelation relies on the specification of a function and data pointer, which
> may refer to multiple timers).  Shouldn't rte_eal_alarm_set return an opaque
> handle to a unique timer instance that can be store by a caller and used to
> specfically cancel that timer?  Thats how both the bsd and linux timer
> subsystems model timers.
> 

Goal was to not break user applications that use this library.

> 
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_alarm.c
> b/lib/librte_eal/linuxapp/eal/eal_alarm.c
> index 480f0cb..73b6dc5 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_alarm.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_alarm.c
> @@ -64,6 +64,9 @@
>  #define MS_PER_S 1000
>  #define US_PER_S (US_PER_MS * MS_PER_S)
> 
> +#define ALARM_EXECUTING (1 << 0)
> +#define ALARM_CANCELLED (1 << 1)
> +
>  struct alarm_entry {
>  	LIST_ENTRY(alarm_entry) next;
>  	struct timeval time;
> @@ -107,12 +110,14 @@ eal_alarm_callback(struct rte_intr_handle *hdl
> __rte_unused,
>  			gettimeofday(&now, NULL) == 0 &&
>  			(ap->time.tv_sec < now.tv_sec || (ap->time.tv_sec ==
> now.tv_sec &&
>  						ap->time.tv_usec <=
> now.tv_usec))){
> -		ap->executing = 1;
> -		rte_spinlock_unlock(&alarm_list_lk);

Removing unlock here introduce deadlock.

> +		ap->executing |= ALARM_EXECUTING;
> +		if (likely(!(ap->executing & ALARM_CANCELLED)) {
> +			rte_spinlock_unlock(&alarm_list_lk);
> 
> -		ap->cb_fn(ap->cb_arg);
> +			ap->cb_fn(ap->cb_arg);
> 
> -		rte_spinlock_lock(&alarm_list_lk);
> +			rte_spinlock_lock(&alarm_list_lk);
> +		}
>  		LIST_REMOVE(ap, next);
>  		rte_free(ap);
>  	}
> @@ -209,10 +214,9 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn,
> void *cb_arg)
>  	rte_spinlock_lock(&alarm_list_lk);
>  	/* remove any matches at the start of the list */
>  	while ((ap = LIST_FIRST(&alarm_list)) != NULL &&
> -			cb_fn == ap->cb_fn && ap->executing == 0 &&
> +			cb_fn == ap->cb_fn &&
>  			(cb_arg == (void *)-1 || cb_arg == ap->cb_arg)) {
> -		LIST_REMOVE(ap, next);
> -		rte_free(ap);
> +		ap->executing |= ALARM_CANCELLED;
>  		count++;
>  	}
>  	ap_prev = ap;
> @@ -220,10 +224,9 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn,
> void *cb_arg)
>  	/* now go through list, removing entries not at start */
>  	LIST_FOREACH(ap, &alarm_list, next) {
>  		/* this won't be true first time through */
> -		if (cb_fn == ap->cb_fn &&  ap->executing == 0 &&
> +		if (cb_fn == ap->cb_fn &&
>  				(cb_arg == (void *)-1 || cb_arg == ap->cb_arg))
> {
> -			LIST_REMOVE(ap,next);
> -			rte_free(ap);
> +			ap->executing |= ALARM_CANCELLED;
>  			count++;
>  			ap = ap_prev;
>  		}

Pawel


More information about the dev mailing list