[dpdk-dev] [PATCH v2 1/2] timer: allow timer management in shared memory

Carrillo, Erik G erik.g.carrillo at intel.com
Fri Dec 7 20:21:55 CET 2018


Hi Stephen,

Thanks for the review. Some responses in-line:

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Friday, December 7, 2018 12:10 PM
> To: Carrillo, Erik G <erik.g.carrillo at intel.com>
> Cc: rsanford at akamai.com; jerin.jacob at caviumnetworks.com;
> pbhagavatula at caviumnetworks.com; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/2] timer: allow timer management in
> shared memory
> 
> On Fri,  7 Dec 2018 11:52:59 -0600
> Erik Gabriel Carrillo <erik.g.carrillo at intel.com> wrote:
> 
> > Currently, the timer library uses a per-process table of structures to
> > manage skiplists of timers presumably because timers contain arbitrary
> > function pointers whose value may not resolve properly in other
> > processes.
> >
> > However, if the same callback is used to handle all timers, and that
> > callback is only invoked in one process, then it would be safe to allow
> > the data structures to be allocated in shared memory, and to allow
> > secondary processes to modify the timer lists.  This would let timers
> > be used in more multi-process scenarios.
> >
> > The library's global variables are wrapped with a struct, and an array
> > of these structures is created in shared memory.  The original APIs
> > are updated to reference the zeroth entry in the array. This maintains
> > the original behavior for both primary and secondary processes since
> > the set intersection of their coremasks should be empty [1].  New APIs
> > are introduced to enable the allocation/deallocation of other entries
> > in the array.
> >
> > New variants of the APIs used to start and stop timers are introduced;
> > they allow a caller to specify which array entry should be used to
> > locate the timer list to insert into or delete from.
> >
> > Finally, a new variant of rte_timer_manage() is introduced, which
> > allows a caller to specify which array entry should be used to locate
> > the timer lists to process; it can also process multiple timer lists
> > per invocation.
> >
> > [1]
> > https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations
> >
> > Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo at intel.com>
> 
> Makes sense but it looks to me like an ABI breakage. Experimental isn't going
> to work for this.

For APIs that existed prior to this patch, I've duplicated them in a "19.02" node in
the map file; I only marked new APIs as experimental. I also versioned each API in
order to maintain the prior interface. I tested ABI compatibility with
devtools/validate-abi.sh, and it reported no errors. So I believe this won't break
the ABI, but if I need to change something I certainly will.
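
For reference, the versioned node in the map file looks roughly like this (the node
names and the exact symbol lists here are illustrative, not copied verbatim from the
patch):

```
DPDK_19.02 {
	global:

	rte_timer_manage;
	rte_timer_reset;
	rte_timer_stop;
} DPDK_18.11;

EXPERIMENTAL {
	global:

	rte_timer_alt_manage;
};
```

The old symbols stay bound to the prior node via VERSION_SYMBOL(), while
BIND_DEFAULT_SYMBOL() points new callers at the 19.02 implementations.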

> 
> > +static uint32_t default_data_id;  // id set to zero automatically
> 
> C++ style comments are not allowed per DPDK coding style.
> Best to just drop the comment, it is stating the obvious.
> 

Sure - will do.

> > -/* Init the timer library. */
> > +static inline int
> > +timer_data_valid(uint32_t id)
> > +{
> > +	return !!(rte_timer_data_arr[id].internal_flags & FL_ALLOCATED);
> > +}
> 
> Don't need inline on static functions.
> ...
> 
> > +MAP_STATIC_SYMBOL(int rte_timer_manage(void), rte_timer_manage_v1902);
> > +BIND_DEFAULT_SYMBOL(rte_timer_manage, _v1902, 19.02);
> > +
> > +int __rte_experimental
> > +rte_timer_alt_manage(uint32_t timer_data_id,
> > +		     unsigned int *poll_lcores,
> > +		     int nb_poll_lcores,
> > +		     rte_timer_alt_manage_cb_t f)
> > +{
> > +	union rte_timer_status status;
> > +	struct rte_timer *tim, *next_tim, **pprev;
> > +	struct rte_timer *run_first_tims[RTE_MAX_LCORE];
> > +	unsigned int runlist_lcore_ids[RTE_MAX_LCORE];
> > +	unsigned int this_lcore = rte_lcore_id();
> > +	struct rte_timer *prev[MAX_SKIPLIST_DEPTH + 1];
> > +	uint64_t cur_time;
> > +	int i, j, ret;
> > +	int nb_runlists = 0;
> > +	struct rte_timer_data *data;
> > +	struct priv_timer *privp;
> > +	uint32_t poll_lcore;
> > +
> > +	TIMER_DATA_VALID_GET_OR_ERR_RET(timer_data_id, data, -EINVAL);
> > +
> > +	/* timer manager only runs on EAL thread with valid lcore_id */
> > +	assert(this_lcore < RTE_MAX_LCORE);
> > +
> > +	__TIMER_STAT_ADD(data->priv_timer, manage, 1);
> > +
> > +	if (poll_lcores == NULL) {
> > +		poll_lcores = (unsigned int []){rte_lcore_id()};
> 
> 
> This isn't going to be safe. It assigns poll_lcores to an array allocated on the
> stack.
> 

poll_lcores is allowed to be NULL when rte_timer_alt_manage() is called, as a
convenience; if it is NULL, then we create an array on the stack containing one
item and point poll_lcores at it. poll_lcores only needs to be valid for the
invocation of the function, so pointing to an array on the stack seems fine.
Did I miss the point?

> > +
> > +	for (i = 0, poll_lcore = poll_lcores[i]; i < nb_poll_lcores;
> > +	     poll_lcore = poll_lcores[++i]) {
> > +		privp = &data->priv_timer[poll_lcore];
> > +
> > +		/* optimize for the case where per-cpu list is empty */
> > +		if (privp->pending_head.sl_next[0] == NULL)
> > +			continue;
> > +		cur_time = rte_get_timer_cycles();
> > +
> > +#ifdef RTE_ARCH_64
> > +		/* on 64-bit the value cached in the pending_head.expired
> > +		 * will be updated atomically, so we can consult that for a
> > +		 * quick check here outside the lock
> > +		 */
> > +		if (likely(privp->pending_head.expire > cur_time))
> > +			continue;
> > +#endif
> 
> 
> This code needs to be optimized so that application can call this at a very high
> rate without performance impact.

