[dpdk-dev] [RFC 2/3] tqs: add thread quiescent state library

Ananyev, Konstantin konstantin.ananyev at intel.com
Wed Dec 12 10:29:25 CET 2018


> > >
> > > > > +
> > > > > +/* Add a reader thread, running on an lcore, to the list of
> > > > > +threads
> > > > > + * reporting their quiescent state on a TQS variable.
> > > > > + */
> > > > > +int __rte_experimental
> > > > > +rte_tqs_register_lcore(struct rte_tqs *v, unsigned int lcore_id) {
> > > > > +	TQS_RETURN_IF_TRUE((v == NULL || lcore_id >=
> > > > RTE_TQS_MAX_LCORE),
> > > > > +				-EINVAL);
> > > >
> > > > It is not very good practice to make a function return different
> > > > values and behave differently in debug/non-debug mode.
> > > > I'd say that for slow-path (functions in .c) it is always good to
> > > > check input parameters.
> > > > For fast-path (functions in .h) we sometimes skip such checking, but
> > > > debug mode can probably use RTE_ASSERT() or so.
> > > Makes sense, I will change this in the next version.
> > >
> > > >
> > > >
> > > > lcore_id >= RTE_TQS_MAX_LCORE
> > > >
> > > > Is this limitation really necessary?
> > > I added this limitation because currently a DPDK application cannot take
> > > a mask more than 64 bits wide. Otherwise, this should be as big as
> > RTE_MAX_LCORE.
> > > I see that in the case of the '-lcores' option, the number of lcores can
> > > be more than the number of PEs. In this case, we still need a MAX limit (but it
> > can be bigger than 64).
> > >
> > > > First, it means that only lcores can use that API (at least the data-path
> > > > part); second, even today many machines have more than 64 cores.
> > > > I think you can easily avoid such a limitation if, instead of
> > > > requiring lcore_id as an input parameter, you just make it return the index of the
> > next available entry in w[].
> > > > Then tqs_update() can take that index as an input parameter.
> > > I had thought about a similar approach based on IDs. I was concerned
> > > that the ID would be one more thing to manage for the application. But, I see the
> > limitations of the current approach now. I will change it to be allocation based.
> > This will support non-EAL pthreads as well.
> >
> > Yes, with such an approach non-lcore threads will be able to use it too.
> >
> I realized that rte_tqs_register_lcore/rte_tqs_unregister_lcore need to be efficient, as they can be called from the worker's packet
> processing loop (rte_event_dequeue_burst allows blocking, so the worker thread needs to call rte_tqs_unregister_lcore before calling
> rte_event_dequeue_burst and rte_tqs_register_lcore before resuming packet processing). Allocating the thread ID in these functions would
> make them more complex.
> 
> I suggest that we change the argument 'lcore_id' to 'thread_id'. The application could use 'lcore_id' as 'thread_id' if threads are mapped to
> physical cores 1:1.
> 
> If the threads are not mapped 1:1 to physical cores, the threads need to use a thread_id in the range of 0 - RTE_TQS_MAX_THREADS. I do
> not see that DPDK has a thread_id concept. For TQS, the thread IDs are global (i.e. not per TQS variable). I could provide APIs to do the
> thread ID allocation, but I think the thread ID allocation should not be part of this library. Such thread ID might be useful for other libraries.

I don't think there is any point in introducing a new thread_id concept just for this library.
After all, we already have the concept of lcore_id, which serves pretty much the same purpose.
I still think that we need to either:
a) make register/unregister work with any valid lcore_id (< RTE_MAX_LCORE)
b) make register/unregister return an index into w[]

For a) we will need a mask bigger than 64 bits.
b) would allow non-lcore threads to use the data-path API too,
plus w[] would occupy less space, and check() might be faster.
Though yes, as a drawback, for b) register/unregister would probably need
an extra 'while (CAS(...));' loop.
I suppose the question here is: do you foresee a lot of concurrent register/unregister
calls on the data path?
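To make the trade-off concrete, here is a rough standalone sketch of option b) in plain C11. The names (tqs_register/tqs_unregister) and the single 64-slot mask are illustrative assumptions, not the actual RFC code:

```c
/* Sketch of option b): register() hands out a free index into w[]
 * instead of taking an lcore_id.  Illustrative names, not the RFC API. */
#include <stdint.h>
#include <stdatomic.h>

#define MAX_THREADS 64

static _Atomic uint64_t used_mask;          /* bit i set => w[i] is taken */

/* Returns the index of a free slot in w[], or -1 if all slots are taken.
 * The while(CAS) retry loop is the extra cost mentioned above. */
static int
tqs_register(void)
{
	uint64_t old = atomic_load_explicit(&used_mask, memory_order_relaxed);

	for (;;) {
		if (old == UINT64_MAX)
			return -1;                   /* no free slot */
		int idx = __builtin_ctzll(~old);     /* lowest clear bit */
		if (atomic_compare_exchange_weak_explicit(&used_mask, &old,
				old | (UINT64_C(1) << idx),
				memory_order_acq_rel, memory_order_relaxed))
			return idx;
		/* CAS failed: 'old' now holds the fresh value, retry */
	}
}

static void
tqs_unregister(int idx)
{
	atomic_fetch_and_explicit(&used_mask, ~(UINT64_C(1) << idx),
				  memory_order_release);
}
```

Note the CAS loop only matters if register/unregister are truly called per burst; if registration is rare, the retry cost is negligible.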

> 
> <snip>
> 
> >
> > >
> > > >
> > > > > +
> > > > > +		while (lcore_mask) {
> > > > > +			l = __builtin_ctz(lcore_mask);
> > > > > +			if (v->w[l].cnt != t)
> > > > > +				break;
> > > >
> > > > As I understand, that makes control-path function progress dependent
> > > > on simultaneous invocation of data-path functions.
> > > I agree that control-path progress (for example, how long to
> > > wait before freeing the memory) depends on invocation of the data-path
> > > functions. The separation of 'start' and 'check', and the option not to block in
> > 'check', provide the flexibility for the control path to do some other work if it
> > chooses to.
> > >
> > > > In some cases that might cause control-path to hang.
> > > > Let's say the data-path function is never called, or the user invokes the
> > > > control-path and data-path functions from the same thread.
> > > I agree with the case of the data-path function not getting called. I
> > > would consider that a programming error. I can document that warning in
> > the rte_tqs_check API.
> >
> > Sure, it can be documented.
> > Though that means that each data-path thread would have to do an explicit
> > update() call for every tqs it might use.
> > I just think that it would complicate things and might limit usage of the library
> > quite significantly.
> Each data path thread has to report its quiescent state. Hence, each data-path thread has to call update() (similar to how
> rte_timer_manage() has to be called periodically on the worker thread).

I understand that.
Though it means that each data-path thread has to know explicitly which rcu vars it accesses.
It would be hard to adopt such an API when the rcu vars are used inside some library.
But ok, as I understand, people do use the QSBR approach in their apps and
find it useful.
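For reference, the QSBR scheme being discussed can be sketched in a few lines of standalone C11. The names (qsbr_start/qsbr_update/qsbr_check) and the fixed reader count are illustrative assumptions, not the RFC API:

```c
/* Minimal QSBR-style sketch, not the RFC code: the writer bumps a global
 * token; each reader thread periodically copies it into its own counter
 * via update(); the writer waits until every reader has caught up. */
#include <stdint.h>
#include <stdatomic.h>

#define N_READERS 2

static _Atomic uint64_t token = 1;
static _Atomic uint64_t reader_cnt[N_READERS];

static uint64_t
qsbr_start(void)                 /* control path: begin a grace period */
{
	return atomic_fetch_add_explicit(&token, 1, memory_order_acq_rel) + 1;
}

static void
qsbr_update(int id)              /* data path: report quiescent state */
{
	atomic_store_explicit(&reader_cnt[id],
		atomic_load_explicit(&token, memory_order_acquire),
		memory_order_release);
}

static int
qsbr_check(uint64_t t)           /* control path: non-blocking check */
{
	for (int i = 0; i < N_READERS; i++)
		if (atomic_load_explicit(&reader_cnt[i],
				memory_order_acquire) < t)
			return 0;        /* reader i not quiescent yet */
	return 1;
}
```

The point above follows directly from this shape: check() can only return 1 if every registered reader keeps calling update(), so a stalled reader stalls reclamation.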

> Do you have any particular use case in mind where this fails?

For example, it means that the library can't be used to add/delete RX/TX ethdev callbacks
in a safe manner.

BTW, two side questions:
1) As I understand, what you propose is very similar to the QSBR main concept.
Wouldn't it be better to name it accordingly, to avoid confusion (or at least document it somewhere)?
I think someone else already raised that question.
2) Would QSBR be the only technique in that lib?
Any plans to add something similar to the GP one too (with MBs at the reader side)?

> 
> >
> > >
> > > In the case of the same thread calling both control-path and data-path
> > > functions, it would depend on the sequence of the calls. The following
> > sequence should not cause any hangs:
> > > Worker thread
> > > 1) 'deletes' an entry from a lock-free data structure
> > > 2) rte_tqs_start
> > > 3) rte_tqs_update
> > > 4) rte_tqs_check (wait == 1 or wait == 0)
> > > 5) 'free' the entry deleted in 1)
> >
> > That's an interesting idea, and it should help, I think.
> > Probably worth having the {2,3,4} sequence as a new high-level function.
> >
> Yes, this is a good idea. Such a function would be applicable only in the worker thread. I would prefer to leave it to the application to take
> care of this.

Yes, it would be applicable only to the worker thread, but why can't we have a function for it?
It could be two different functions: one doing {2,3,4} - for worker threads,
and a second doing just {2,4} - for control threads.
Or it could be just one function that takes an extra parameter: lcore_id/w[] index.
If it is some predefined invalid value (-1 or so), step #3 would be skipped.
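A rough standalone sketch of that single-function variant (plain C11; tqs_synchronize, tqs_update and TQS_INVALID_ID are made-up names, not the RFC API):

```c
/* One synchronize() call doing steps {2,3,4}: start, optional self-update,
 * check.  Control threads pass TQS_INVALID_ID (a hypothetical sentinel)
 * so step #3 is skipped.  Illustrative sketch, not the RFC code. */
#include <stdint.h>
#include <stdatomic.h>

#define N_READERS      1     /* single reader for this single-threaded demo */
#define TQS_INVALID_ID (-1)  /* "skip self-update" sentinel */

static _Atomic uint64_t token = 1;
static _Atomic uint64_t reader_cnt[N_READERS];

static void
tqs_update(int id)           /* data path: report quiescent state */
{
	atomic_store_explicit(&reader_cnt[id],
		atomic_load_explicit(&token, memory_order_acquire),
		memory_order_release);
}

/* Returns 1 once all readers have passed through a quiescent state. */
static int
tqs_synchronize(int self_id, int wait)
{
	uint64_t t = atomic_fetch_add_explicit(&token, 1,
			memory_order_acq_rel) + 1;   /* step #2: start */
	if (self_id != TQS_INVALID_ID)
		tqs_update(self_id);                 /* step #3, workers only */
	do {                                         /* step #4: check */
		int done = 1;
		for (int i = 0; i < N_READERS; i++)
			if (atomic_load_explicit(&reader_cnt[i],
					memory_order_acquire) < t)
				done = 0;
		if (done)
			return 1;
	} while (wait);
	return 0;
}
```

A worker that is itself a reader calls tqs_synchronize(my_idx, 1) and cannot deadlock on itself, since its own counter is updated before the check; a control thread calls tqs_synchronize(TQS_INVALID_ID, 0) and retries later if it returns 0.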

Konstantin

