[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

Neil Horman nhorman at tuxdriver.com
Fri Sep 26 15:43:11 CEST 2014


On Fri, Sep 26, 2014 at 06:33:12AM +0000, Wodkowski, PawelX wrote:
> > Given what you said above, I agree, at least in the current implementation.  It
> > still seems like theres a simpler solution that doesn't require all the
> > comparative gymnastics.
> 
> Yes, there is simpler solution, but this solution involve recursive locking.
> DPDK recursive spinlocks are no an option in here, so only option is posix recursive
> mutex, which I think is even worst option than this gymnastics.
> 
I agree, lets avoid more locking if we can.

> > 
> > What if, instead of testing if you're the callback thread, we turn the executing
> > field of alarm_entry into a bitfield, where bit 0 represents the former
> > "executing" state, and bit 1 is defined as a "cancelled" bit.  Then
> > rte_eal_alarm_cancel becomes a search that, when an alarm is found simply or's
> > in the cancelled bit to the executing bit field.  When the callback thread runs,
> > it skips executing any alarm that is marked as cancelled, but frees all alarm
> > entries that are executed or cancelled.  That gives us a single point at which
> > frees of alarm entires happen?  Something like the patch below (completely
> > untested)?
> > 
> > It also seems like the alarm api as a whole could use some improvement.  The
> > way its written right now, theres no way to refer to a specific alarm (i.e.
> > cancelation relies on the specification of a function and data pointer, which
> > may refer to multiple timers).  Shouldn't rte_eal_alarm_set return an opaque
> > handle to a unique timer instance that can be store by a caller and used to
> > specfically cancel that timer?  Thats how both the bsd and linux timer
> > subsystems model timers.
> > 
> 
> Goal was to not break user applications that use this library.
> 
You break API all the time, why are you worried about it here?  I'm all for
maintaining API definately, but once my ABI versioning code gets into place we
can manage this alot better.

> > 
> > 
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_alarm.c
> > b/lib/librte_eal/linuxapp/eal/eal_alarm.c
> > index 480f0cb..73b6dc5 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_alarm.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_alarm.c
> > @@ -64,6 +64,9 @@
> >  #define MS_PER_S 1000
> >  #define US_PER_S (US_PER_MS * MS_PER_S)
> > 
> > +#define ALARM_EXECUTING (1 << 0)
> > +#define ALARM_CANCELLED (1 << 1)
> > +
> >  struct alarm_entry {
> >  	LIST_ENTRY(alarm_entry) next;
> >  	struct timeval time;
> > @@ -107,12 +110,14 @@ eal_alarm_callback(struct rte_intr_handle *hdl
> > __rte_unused,
> >  			gettimeofday(&now, NULL) == 0 &&
> >  			(ap->time.tv_sec < now.tv_sec || (ap->time.tv_sec ==
> > now.tv_sec &&
> >  						ap->time.tv_usec <=
> > now.tv_usec))){
> > -		ap->executing = 1;
> > -		rte_spinlock_unlock(&alarm_list_lk);
> 
> Removing unlock here introduce deadlock.
> 
Please look more closely, I've not removed anything, only moved where the lock
occurs.

> > +		ap->executing |= ALARM_EXECUTING;
> > +		if (likely(!(ap->executing & ALARM_CANCELLED)) {
> > +			rte_spinlock_unlock(&alarm_list_lk);
The unlock is now here, conditional on needing to call the callback.

> > 
> > -		ap->cb_fn(ap->cb_arg);
> > +			ap->cb_fn(ap->cb_arg);
> > 
> > -		rte_spinlock_lock(&alarm_list_lk);
> > +			rte_spinlock_lock(&alarm_list_lk);
> > +		}
> >  		LIST_REMOVE(ap, next);
> >  		rte_free(ap);
> >  	}
> > @@ -209,10 +214,9 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn,
> > void *cb_arg)
> >  	rte_spinlock_lock(&alarm_list_lk);
> >  	/* remove any matches at the start of the list */
> >  	while ((ap = LIST_FIRST(&alarm_list)) != NULL &&
> > -			cb_fn == ap->cb_fn && ap->executing == 0 &&
> > +			cb_fn == ap->cb_fn &&
> >  			(cb_arg == (void *)-1 || cb_arg == ap->cb_arg)) {
> > -		LIST_REMOVE(ap, next);
> > -		rte_free(ap);
> > +		ap->executing |= ALARM_CANCELLED;
> >  		count++;
> >  	}
> >  	ap_prev = ap;
> > @@ -220,10 +224,9 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn,
> > void *cb_arg)
> >  	/* now go through list, removing entries not at start */
> >  	LIST_FOREACH(ap, &alarm_list, next) {
> >  		/* this won't be true first time through */
> > -		if (cb_fn == ap->cb_fn &&  ap->executing == 0 &&
> > +		if (cb_fn == ap->cb_fn &&
> >  				(cb_arg == (void *)-1 || cb_arg == ap->cb_arg))
> > {
> > -			LIST_REMOVE(ap,next);
> > -			rte_free(ap);
> > +			ap->executing |= ALARM_CANCELLED;
> >  			count++;
> >  			ap = ap_prev;
> >  		}
> 
> Pawel
> 


More information about the dev mailing list