[dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Wed Mar 25 21:43:47 CET 2020


<snip>

> 
> > >
> > > > > > +/**
> > > > > > + * @internal Reserve ring elements to enqueue several objects
> > > > > > +on the ring
> > > > > > + *
> > > > > > + * @param r
> > > > > > + *   A pointer to the ring structure.
> > > > > > + * @param esize
> > > > > > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > > > > > + *   This must be the same value used while creating the ring.
> > > Otherwise
> > > > > > + *   the results are undefined.
> > > > > > + * @param n
> > > > > > + *   The number of elements to reserve in the ring.
> > > > > > + * @param behavior
> > > > > > + *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of
> elements
> > > from a
> > > > > ring
> > > > > > + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as
> > > possible
> > > > > from ring
> > > > > > + * @param is_sp
> > > > > > + *   Indicates whether to use single producer or multi-producer
> reserve
> > > > > > + * @param old_head
> > > > > > + *   Producer's head index before reservation.
> > > > > > + * @param new_head
> > > > > > + *   Producer's head index after reservation.
> > > > > > + * @param free_space
> > > > > > + *   returns the amount of space after the reserve operation has
> > > finished.
> > > > > > + *   It is not updated if the number of reserved elements is zero.
> > > > > > + * @param dst1
> > > > > > + *   Pointer to location in the ring to copy the data.
> > > > > > + * @param n1
> > > > > > + *   Number of elements to copy at dst1
> > > > > > + * @param dst2
> > > > > > + *   In case of ring wrap around, this pointer provides the location
> to
> > > > > > + *   copy the remaining elements. The number of elements to copy
> at
> > > this
> > > > > > + *   location is equal to (number of elements reserved - n1)
> > > > > > + * @return
> > > > > > + *   Actual number of elements reserved.
> > > > > > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > > > > + */
> > > > > > +static __rte_always_inline unsigned int
> > > > > > +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r,
> > > > > > +unsigned int esize,
> > > > >
> > > > >
> > > > > I do understand the purpose of reserve, then either commit/abort
> > > > > for serial sync mode, but what is the purpose of non-serial
> > > > > version of
> > > reserve/commit?
> > > > In RCU, I have the need for scatter-gather feature. i.e. the data
> > > > in the ring element is coming from multiple sources ('token' is
> > > > generated by the RCU library and the application provides
> > > > additional data). If I do not
> > > provide the reserve/commit, I need to introduce an intermediate
> > > memcpy to get these two data contiguously to copy to the ring
> > > element. The sequence is 'reserve(1), memcpy1, mempcy2, commit(1)'.
> > > > Hence, you do not see the abort API for the enqueue.
> > > >
> > > > > In serial  MP/MC case, after _reserve_(n) you always have to do
> > > > > _commit_(n) - you can't reduce number of elements, or do _abort_.
> > > > Agree, the intention here is to provide the scatter/gather feature.
> > > >
> > > > > Again you cannot avoid memcpy(n) here anyhow.
> > > > > So what is the point of these functions for non-serial case?
> > > > It avoids an intermediate memcpy when the data is coming from
> > > > multiple
> > > sources.
> > >
> > > Ok, I think I understand what was my confusion:
> > Yes, the following understanding is correct.
> >
> > > Your intention:
> > >  1) reserve/commit for both serial and non-serial mode -
> > >      to allow user get/set contents of the ring manually and avoid
> > >      intermediate load/stores.
> > > 2) abort only for serial mode.
> > >
> > > My intention:
> > > 1) commit/reserve/abort only for serial case
> > >     (as that's the only mode where we can commit less
> > >      then was reserved or do abort).
> > I do not know if there is a requirement on committing less than reserved.
> 
> From my perspective, that's a necessary part of peek functionality.
> revert/abort function you introduced below is just one special case of it.
> Having just abort is enough when you processing elements in the ring one by
> one, but not sufficient if someone would try to operate in bulks.
> Let say you read (reserved) N objects from the ring, inspected them and
> found that first M (<N) are ok to be removed from the ring, others should
> remain.
Agree, it makes sense from a dequeue perspective. Does it make sense from enqueue perspective?

> 
> > I think, if the size of commit is not known during reservation, may be
> > the reservation can be delayed till it is known.
> 
> In some cases, you do know how much you'd like to commit, but you can't
> guarantee that you can commit that much, till you inspect contents of
> reserved elems.
The above comment was from enqueue perspective.

> 
> > If there is no requirement to commit less than reserved, then I do not see a
> need for serial APIs for enqueue operation.
> 
> >
> > > 2) get/set of ring contents are done as part of either
> > >     reserve(for dequeue) or commit(for enqueue) API calls
> > >     (no scatter-gather ability).
> > >
> > > I still think that this new API you suggest creates too big exposure
> > > of ring internals, and makes it less 'safe-to-use':
> > > - it provides direct access to contents of the ring.
> > > - user has to specify head/tail values directly.
> > It is some what complex. But, with the support of user defined element
> > size, I think it becomes necessary to support scatter gather feature (since it
> is not a single pointer that will be stored).
> 
> I suppose to see the real benefit from scatter-gather, we need a scenario
> where there are relatively big elems in the ring (32B+ or so), plus
> enqueue/dequeue done in bulks.
> If you really  envision such use case - I am ok to consider scatter-gather API
> too, but I think it shouldn't be the only available API for serial mode.
> Might be we can have 'normal' enqueue/dequeue API for serial mode (actual
> copy done internally in ring functions, head/tail values are not exposed
> directly), plus SG API as addon for some special cases.
I will try to run some benchmarks and take a decision on if SG makes an impact on RCU defer APIs.

> 
> > >
> > > So in case of some programmatic error in related user code, there
> > > are less chances it could be catch-up by API, and we can easily
> > > end-up with silent memory corruption and other nasty things that
> > > would be hard to catch/reproduce.
> > >
> > > That makes me wonder how critical is this scatter-gather ability in
> > > terms of overall RCU performance?
> > > Is the gain provided really that significant, especially if you'll
> > > update the ring by one element at a time?
> > For RCU, it is 64b token and the size of the user data. Not sure how much
> difference it will make.
> > I can drop the scatter gather requirement for now.
> >
> > >
> > > >
> > > > >
> > > > > BTW, I think it would be good to have serial version of _enqueue_ too.
> > > > If there is a good use case, they should be provided. I did not
> > > > come across a
> > > good use case.
> > > >
> > > > >
> > > > > > +		unsigned int n, enum rte_ring_queue_behavior
> behavior,
> > > > > > +		unsigned int is_sp, unsigned int *old_head,
> > > > > > +		unsigned int *new_head, unsigned int *free_space,
> > > > > > +		void **dst1, unsigned int *n1, void **dst2)
> > > > >
> > > > > I do understand the intention to avoid memcpy(), but proposed
> > > > > API seems overcomplicated, error prone, and not very convenient for
> the user.
> > > > The issue is the need to handle the wrap around in ring storage array.
> > > > i.e. when the space is reserved for more than 1 ring element, the
> > > > wrap
> > > around might happen.
> > > >
> > > > > I don't think that avoiding memcpy() will save us that many
> > > > > cycles here, so
> > > > This depends on the amount of data being copied.
> > > >
> > > > > probably better to keep API model a bit more regular:
> > > > >
> > > > > n = rte_ring_mp_serial_enqueue_bulk_reserve(ring, num,
> &free_space); ...
> > > > > /* performs actual memcpy(), m<=n */
> > > > > rte_ring_mp_serial_enqueue_bulk_commit(ring, obj,  m);
> > > > These do not take care of the wrap-around case or I am not able to
> > > understand your comment.
> > >
> > > I meant that serial_enqueue_commit() will do both:
> > > actual copy of elements to the ring and tail update (no
> > > Scatter-Gather), see above.
> > RCU does not require the serial enqueue APIs, do you have any use case?
> 
> I agree that serial dequeue seems to have more usages then enqueue.
> Though I still can name at least two cases for enqueue, from top of my head:
> 1. serial mode (both enqueue/dequeue) helps to mitigate ring slowdown
> overcommitted scenarios, see RFC I submitted:
> http://patches.dpdk.org/cover/66001/
> 2. any intermediate node when you have pop/push from/to some external
> queue, and enqueue/dequeue to/from the ring, would like to avoid any
> elem drops in between, and by some reason don't want your own
> intermediate bufferization.
> Let say:
> dequeue_from_ring -> tx_burst/cryptodev_enqueue
> rx_burst/cryptodev_dequeue -> enqueue_to_ring
> 
> Plus as enqueue/dequeue are sort of mirror, I think it is good to have both
> identical.
Ok, agreed. I think we need to allow for combination of APIs to be used. i.e. MP enqueue and serialization on dequeue.

> 
> 



More information about the dev mailing list