[dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu Mar 5 19:26:30 CET 2020


> >
> > > > > +/**
> > > > > + * @internal Reserve ring elements to enqueue several objects on
> > > > > +the ring
> > > > > + *
> > > > > + * @param r
> > > > > + *   A pointer to the ring structure.
> > > > > + * @param esize
> > > > > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > > > > + *   This must be the same value used while creating the ring.
> > Otherwise
> > > > > + *   the results are undefined.
> > > > > + * @param n
> > > > > + *   The number of elements to reserve in the ring.
> > > > > + * @param behavior
> > > > > + *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of elements
> > from a
> > > > ring
> > > > > + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as
> > possible
> > > > from ring
> > > > > + * @param is_sp
> > > > > + *   Indicates whether to use single producer or multi-producer reserve
> > > > > + * @param old_head
> > > > > + *   Producer's head index before reservation.
> > > > > + * @param new_head
> > > > > + *   Producer's head index after reservation.
> > > > > + * @param free_space
> > > > > + *   returns the amount of space after the reserve operation has
> > finished.
> > > > > + *   It is not updated if the number of reserved elements is zero.
> > > > > + * @param dst1
> > > > > + *   Pointer to location in the ring to copy the data.
> > > > > + * @param n1
> > > > > + *   Number of elements to copy at dst1
> > > > > + * @param dst2
> > > > > + *   In case of ring wrap around, this pointer provides the location to
> > > > > + *   copy the remaining elements. The number of elements to copy at
> > this
> > > > > + *   location is equal to (number of elements reserved - n1)
> > > > > + * @return
> > > > > + *   Actual number of elements reserved.
> > > > > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > > > + */
> > > > > +static __rte_always_inline unsigned int
> > > > > +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r, unsigned
> > > > > +int esize,
> > > >
> > > >
> > > > I do understand the purpose of reserve, then either commit/abort for
> > > > serial sync mode, but what is the purpose of non-serial version of
> > reserve/commit?
> > > In RCU, I have the need for scatter-gather feature. i.e. the data in
> > > the ring element is coming from multiple sources ('token' is generated
> > > by the RCU library and the application provides additional data). If I do not
> > provide the reserve/commit, I need to introduce an intermediate memcpy to
> > get these two data contiguously to copy to the ring element. The sequence is
> > 'reserve(1), memcpy1, mempcy2, commit(1)'.
> > > Hence, you do not see the abort API for the enqueue.
> > >
> > > > In serial  MP/MC case, after _reserve_(n) you always have to do
> > > > _commit_(n) - you can't reduce number of elements, or do _abort_.
> > > Agree, the intention here is to provide the scatter/gather feature.
> > >
> > > > Again you cannot avoid memcpy(n) here anyhow.
> > > > So what is the point of these functions for non-serial case?
> > > It avoids an intermediate memcpy when the data is coming from multiple
> > sources.
> >
> > Ok, I think I understand what was my confusion:
> Yes, the following understanding is correct.
> 
> > Your intention:
> >  1) reserve/commit for both serial and non-serial mode -
> >      to allow user get/set contents of the ring manually and avoid
> >      intermediate load/stores.
> > 2) abort only for serial mode.
> >
> > My intention:
> > 1) commit/reserve/abort only for serial case
> >     (as that's the only mode where we can commit less
> >      then was reserved or do abort).
> I do not know if there is a requirement on committing less than reserved.

>From my perspective, that's a necessary part of peek functionality.
revert/abort function you introduced below is just one special case of it.
Having just abort is enough when you processing elements in the ring one by one,
but not sufficient if someone would try to operate in bulks.
Let say you read (reserved) N objects from the ring, inspected them
and found that first M (<N) are ok to be removed from the ring,
others should remain.

> I think, if the size of commit is not known during reservation,
> may be the reservation can be delayed till it is known.

In some cases, you do know how much you'd like to commit,
but you can't guarantee that you can commit that much,
till you inspect contents of reserved elems.  

> If there is no requirement to commit less than reserved, then I do not see a need for serial APIs for enqueue operation.

> 
> > 2) get/set of ring contents are done as part of either
> >     reserve(for dequeue) or commit(for enqueue) API calls
> >     (no scatter-gather ability).
> >
> > I still think that this new API you suggest creates too big exposure of ring
> > internals, and makes it less 'safe-to-use':
> > - it provides direct access to contents of the ring.
> > - user has to specify head/tail values directly.
> It is some what complex. But, with the support of user defined element size, I think it becomes necessary to support scatter gather
> feature (since it is not a single pointer that will be stored).

I suppose to see the real benefit from scatter-gather, we need a scenario
where there are relatively big elems in the ring (32B+ or so),
plus enqueue/dequeue done in bulks.
If you really  envision such use case - I am ok to consider scatter-gather API too,
but I think it shouldn't be the only available API for serial mode.
Might be we can have 'normal' enqueue/dequeue API for serial mode
(actual copy done internally in ring functions, head/tail values are not exposed directly),
plus SG API as addon for some special cases.  

> >
> > So in case of some programmatic error in related user code, there are less
> > chances it could be catch-up by API, and we can easily end-up with silent
> > memory corruption and other nasty things that would be hard to
> > catch/reproduce.
> >
> > That makes me wonder how critical is this scatter-gather ability in terms of
> > overall RCU performance?
> > Is the gain provided really that significant, especially if you'll update the ring
> > by one element at a time?
> For RCU, it is 64b token and the size of the user data. Not sure how much difference it will make.
> I can drop the scatter gather requirement for now.
> 
> >
> > >
> > > >
> > > > BTW, I think it would be good to have serial version of _enqueue_ too.
> > > If there is a good use case, they should be provided. I did not come across a
> > good use case.
> > >
> > > >
> > > > > +		unsigned int n, enum rte_ring_queue_behavior behavior,
> > > > > +		unsigned int is_sp, unsigned int *old_head,
> > > > > +		unsigned int *new_head, unsigned int *free_space,
> > > > > +		void **dst1, unsigned int *n1, void **dst2)
> > > >
> > > > I do understand the intention to avoid memcpy(), but proposed API
> > > > seems overcomplicated, error prone, and not very convenient for the user.
> > > The issue is the need to handle the wrap around in ring storage array.
> > > i.e. when the space is reserved for more than 1 ring element, the wrap
> > around might happen.
> > >
> > > > I don't think that avoiding memcpy() will save us that many cycles
> > > > here, so
> > > This depends on the amount of data being copied.
> > >
> > > > probably better to keep API model a bit more regular:
> > > >
> > > > n = rte_ring_mp_serial_enqueue_bulk_reserve(ring, num, &free_space); ...
> > > > /* performs actual memcpy(), m<=n */
> > > > rte_ring_mp_serial_enqueue_bulk_commit(ring, obj,  m);
> > > These do not take care of the wrap-around case or I am not able to
> > understand your comment.
> >
> > I meant that serial_enqueue_commit() will do both:
> > actual copy of elements to the ring and tail update (no Scatter-Gather), see
> > above.
> RCU does not require the serial enqueue APIs, do you have any use case?

I agree that serial dequeue seems to have more usages then enqueue.
Though I still can name at least two cases for enqueue, from top of my head:
1. serial mode (both enqueue/dequeue) helps to mitigate ring slowdown 
overcommitted scenarios, see RFC I submitted:
http://patches.dpdk.org/cover/66001/
2. any intermediate node when you have pop/push from/to some external queue,
and enqueue/dequeue to/from the ring, would like to avoid any elem
drops in between, and by some reason don't want your own intermediate bufferization.
Let say:
dequeue_from_ring -> tx_burst/cryptodev_enqueue
rx_burst/cryptodev_dequeue -> enqueue_to_ring

Plus as enqueue/dequeue are sort of mirror, I think it is good to have both identical.
   




More information about the dev mailing list