[dpdk-dev] [PATCH v3 3/9] ring: introduce RTS ring mode

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Tue Apr 14 17:58:13 CEST 2020


<snip>
> Subject: RE: [PATCH v3 3/9] ring: introduce RTS ring mode
> 
> > > > >
> > > > > +#ifdef ALLOW_EXPERIMENTAL_API
> > > > > +#include <rte_ring_rts.h>
> > > > > +#endif
> > > > > +
> > > > >  /**
> > > > >   * Enqueue several objects on a ring.
> > > > >   *
> > > > > @@ -484,8 +520,21 @@ static __rte_always_inline unsigned int
> > > > > rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
> > > > >  		      unsigned int n, unsigned int *free_space)  {
> > > > > -	return __rte_ring_do_enqueue(r, obj_table, n,
> > > > > RTE_RING_QUEUE_FIXED,
> > > > > -			r->prod.sync_type, free_space);
> > > > > +	switch (r->prod.sync_type) {
> > > > > +	case RTE_RING_SYNC_MT:
> > > > > +		return rte_ring_mp_enqueue_bulk(r, obj_table, n,
> free_space);
> > > > > +	case RTE_RING_SYNC_ST:
> > > > > +		return rte_ring_sp_enqueue_bulk(r, obj_table, n,
> free_space);
> > > > Have you validated if these affect the performance for the existing APIs?
> > >
> > > I run ring_pmd_perf_autotest
> > > (AFAIK, that's the only one of our perf tests that calls
> > > rte_ring_enqueue/dequeue), and didn't see any real difference in
> > > perf numbers.
> > >
> > > > I am also wondering why should we support these new modes in the
> > > > legacy
> > > APIs?
> > >
> > > Majority of DPDK users still do use legacy API, and I am not sure
> > > all of them will be happy to switch to _elem_ one manually.
> > > Plus I can't see how we can justify that after let say:
> > > rte_ring_init(ring, ..., RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ);
> > > returns with success valid call to rte_ring_enqueue(ring,...) should fail.
> > Agree, I think the only way right now is through documentation.
> >
> > >
> > > > I think users should move to use rte_ring_xxx_elem APIs. If users
> > > > want to
> > > use RTS/HTS it will be a good time for them to move to new APIs.
> > >
> > > If they use rte_ring_enqueue/dequeue all they have to do - just
> > > change flags in ring_create/ring_init call.
> > > With what you suggest - they have to change every
> > > rte_ring_enqueue/dequeue to rte_ring_elem_enqueue/dequeue.
> > > That's much bigger code churn.
> > But these are just 1 to 1 mapped.  I would think, there are not a whole lot of
> them in the application code, may be ~10 lines?
> 
> I suppose it depends a lot on particular user app.
> My preference not to force users to do extra changes in their code.
> If we can add new functionality while keeping existing API, why not to do it?
> Less disturbance for users seems a good thing to me.
> 
> > I think the bigger factor for the user here is the algorithm changes
> > in rte_ring library. Bigger effort for the users would be testing rather than
> code changes in the applications.
> > >
> > > > They anyway have to test their code for RTS/HTS, might as well
> > > > make the
> > > change to new APIs and test both.
> > > > It will be less code to maintain for the community as well.
> > >
> > > That's true, right now there is a lot of duplication between _elem_
> > > and legacy code.
> > >  Actually the only real diff between them - actual copying of the objects.
> > >  But I thought we are going to deal with that, just by changing one
> > > day all legacy API to wrappers around _elem_ calls, i.e something like:
> > >
> > > static __rte_always_inline unsigned int rte_ring_enqueue_bulk(struct
> > > rte_ring *r, void * const *obj_table,
> > >                       unsigned int n, unsigned int *free_space) {
> > > 	return  rte_ring_enqueue_elem_bulk(r, obj_table, sizeof(uintptr_t),
> > > n, free_space); }
> > >
> > > That way users will switch to new API automatically, without any
> > > extra effort for them, and we will be able to remove legacy code.
> > > Do you have some other thoughts here how to deal with this
> > > legacy/elem conversion?
> > Yes, that is what I was thinking, but had not considered any addition of new
> APIs.
> > But, I am wondering if we should look at deprecation?
> 
> You mean to deprecate existing  legacy API?
> rte_ring_enqueue/dequeue_bulk, etc?
> I don't think we need to deprecate it at all.
> As long as we'll have _elem_  functions called underneath there would be one
> implementation anyway, and we can leave them forever, so users wouldn't
> need to change their existing code at all.
Ack (assuming that the legacy APIs will be wrappers)

> 
> > If we decide to deprecate, it would be good to avoid making the users
> > of RTS/HTS do the work twice (once to make the switch to RTS/HTS and
> then another to _elem_ APIs).
> >
> > One thing we can do is to implement the wrappers you mentioned above
> for RTS/HTS now.
> 
> That's a very good point.
>  It will require some re-org to allow rte_ring.h to include rte_ring_elem.h, but
> I think it is doable, will try to make these changes in v4.
> 
> > I also it is worth considering to switch to these wrappers 20.05 so
> > that come 20.11, we have a code base that has gone through couple of
> releases' testing.
> 
> You mean wrappers for existing legacy API (MP/MC, SP/SC modes)?
> It is probably too late to make such changes in 20.05, probably early 20.08 is
> a good time for that.
Yes, will target for 20.08

> 
> >
> > <snip>
> >
> > > > > +
> > > > > +#endif /* _RTE_RING_RTS_ELEM_H_ */
> > > > > diff --git a/lib/librte_ring/rte_ring_rts_generic.h
> > > > > b/lib/librte_ring/rte_ring_rts_generic.h
> > > > > new file mode 100644
> > > > > index 000000000..f88460d47
> > > > > --- /dev/null
> > > > > +++ b/lib/librte_ring/rte_ring_rts_generic.h
> > > > I do not know the benefit to providing the generic version. Do you
> > > > know
> > > why this was done in the legacy APIs?
> > >
> > > I think at first we had generic API only, then later C11 was added.
> > > As I remember, C11 one on IA was measured as a bit slower then
> > > generic, so it was decided to keep both.
> > >
> > > > If there is no performance difference between generic and C11
> > > > versions,
> > > should we just skip the generic version?
> > > > The oldest compiler in CI are GCC 4.8.5 and Clang 3.4.2 and C11
> > > > built-ins
> > > are supported earlier than these compiler versions.
> > > > I feel the code is growing exponentially in rte_ring library and
> > > > we should try
> > > to cut non-value-ass code/APIs aggressively.
> > >
> > > I'll check is there perf difference for RTS and HTS between generic
> > > and C11 versions on IA.
> > > Meanwhile please have a proper look at C11 implementation, I am not
> > > that familiar with C11 atomics yet.
> > ok
> >
> > > If there would be no problems with it and no noticeable diff in
> > > performance - I am ok to have for RTS/HTS modes C11 version only.
> 
> From what I see on my box, there is no much difference in terms of
> performance between *generic* and *c11_mem* for RTS/HTS.
> ring_stress_autotest for majority of cases shows ~1% diff, in some cases c11
> numbers are even a bit better.
> So will keep c11 version only in v4.
Thanks. That will remove good amount of code.


More information about the dev mailing list