[dpdk-dev] [PATCH v2] lib/timer: relax barrier for status update

Phil Yang Phil.Yang at arm.com
Fri Apr 24 09:27:02 CEST 2020


> -----Original Message-----
> From: Carrillo, Erik G <erik.g.carrillo at intel.com>
> Sent: Friday, April 24, 2020 9:27 AM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>; Phil Yang
> <Phil.Yang at arm.com>; rsanford at akamai.com; dev at dpdk.org
> Cc: thomas at monjalon.net; david.marchand at redhat.com; Ananyev,
> Konstantin <konstantin.ananyev at intel.com>; jerinj at marvell.com;
> hemant.agrawal at nxp.com; Gavin Hu <Gavin.Hu at arm.com>; nd
> <nd at arm.com>; nd <nd at arm.com>
> Subject: RE: [PATCH v2] lib/timer: relax barrier for status update
> 
> Hi Honnappa,
> 
> > -----Original Message-----
> > From: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> > Sent: Thursday, April 23, 2020 3:06 PM
> > To: Phil Yang <Phil.Yang at arm.com>; Carrillo, Erik G
> > <erik.g.carrillo at intel.com>; rsanford at akamai.com; dev at dpdk.org
> > Cc: thomas at monjalon.net; david.marchand at redhat.com; Ananyev,
> > Konstantin <konstantin.ananyev at intel.com>; jerinj at marvell.com;
> > hemant.agrawal at nxp.com; Gavin Hu <Gavin.Hu at arm.com>; nd
> > <nd at arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>;
> > nd <nd at arm.com>
> > Subject: RE: [PATCH v2] lib/timer: relax barrier for status update
> >
> > Hi Erik,
> >
> > > Subject: [PATCH v2] lib/timer: relax barrier for status update
> > >
> > > Volatile has no ordering semantics. The rte_timer structure defines
> > > timer status as a volatile variable and uses the rte_r/wmb barrier to
> > > guarantee inter-thread visibility.
> > >
> > > This patch optimized the volatile operation with c11 atomic operations
> > > and one-way barrier to save the performance penalty. According to the
> > > timer_perf_autotest benchmarking results, this patch can uplift
> > > 10%~16% timer appending performance, 3%~20% timer resetting
> > > performance and 45% timer callbacks scheduling performance on aarch64
> > > and no loss in performance for x86.
> > >
> > > Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> > > Signed-off-by: Phil Yang <phil.yang at arm.com>
> > > Reviewed-by: Gavin Hu <gavin.hu at arm.com>
> > >
> > > ---
> > > This patch depends on patch:
> > > http://patchwork.dpdk.org/patch/65997/
> > >
> > > v2:
> > > 1. Changed the memory ordering comment in timer_set_config_state.
> > > 2. It is still using built-ins as the wrapper functions for C11
> > > built-ins are not defined yet.
> > It is too late to get the wrapper functions done for 20.05. It was decided in
> > yesterday's tech board meeting to go ahead with C11 atomic built-ins (since
> > there is lot of code in DPDK that uses C11 built-ins). If there are no further
> > comments, can you please provide your ack?
> >
> 
> Ok, thanks for letting me know.  Based on that decision,  I've taken another
> look
> and done some testing and it looks good to me.  I've made one comment in-
> line
> below and acked it.
> 
> <... snipped ...>
> 
> > > @@ -258,9 +257,15 @@ timer_set_config_state(struct rte_timer *tim,
> > >  		 * mark it atomically as being configured */
> > >  		status.state = RTE_TIMER_CONFIG;
> > >  		status.owner = (int16_t)lcore_id;
> > > -		success = rte_atomic32_cmpset(&tim->status.u32,
> > > -					      prev_status.u32,
> > > -					      status.u32);
> > > +		/* CONFIG states are acting as locked states. If the
> > > +		 * timer is in CONFIG state, the state cannot be changed
> > > +		 * by other threads. So, we should use ACQUIRE here.
> > > +		 */
> > > +		success = __atomic_compare_exchange_n(&tim-
> > >status.u32,
> > > +					      &prev_status.u32,
> > > +					      status.u32, 0,
> > > +					      __ATOMIC_ACQUIRE,
> > > +					      __ATOMIC_RELAXED);
> > >  	}
> > >
> > >  	ret_prev_status->u32 = prev_status.u32; @@ -279,20 +284,27 @@
> > > timer_set_running_state(struct rte_timer *tim)
> > >
> > >  	/* wait that the timer is in correct status before update,
> > >  	 * and mark it as running */
> > > -	while (success == 0) {
> > > -		prev_status.u32 = tim->status.u32;
> > > +	prev_status.u32 = __atomic_load_n(&tim->status.u32,
> > > __ATOMIC_RELAXED);
> > >
> > > +	while (success == 0) {
> > >  		/* timer is not pending anymore */
> > >  		if (prev_status.state != RTE_TIMER_PENDING)
> > >  			return -1;
> > >
> > >  		/* here, we know that timer is stopped or pending,
> 
> We know that the timer will be pending at this point... Since we're correcting
> the comment below, we can correct this part too.
> 
> With that change:
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo at intel.com>

Thanks Erik.
Updated in v3.

Thanks,
Phil 

> 
> > > -		 * mark it atomically as being configured */
> > > +		 * mark it atomically as being running
> > > +		 */
> > >  		status.state = RTE_TIMER_RUNNING;
> > >  		status.owner = (int16_t)lcore_id;
> > > -		success = rte_atomic32_cmpset(&tim->status.u32,
> > > -					      prev_status.u32,
> > > -					      status.u32);
> > > +		/* RUNNING states are acting as locked states. If the
> > > +		 * timer is in RUNNING state, the state cannot be changed
> > > +		 * by other threads. So, we should use ACQUIRE here.
> > > +		 */
> > > +		success = __atomic_compare_exchange_n(&tim-
> > >status.u32,
> > > +					      &prev_status.u32,
> > > +					      status.u32, 0,
> > > +					      __ATOMIC_ACQUIRE,
> > > +					      __ATOMIC_RELAXED);
> > >  	}
> > >
> > >  	return 0;
> 
> Thanks,
> Erik


More information about the dev mailing list