[dpdk-dev] [PATCH v3 1/3] lib/ring: add peek API

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Fri Oct 11 07:03:59 CEST 2019


> 
> > <snip>
> >
> > >
> > > >
> > > > > > Subject: [PATCH v3 1/3] lib/ring: add peek API
> > > > > >
> > > > > > From: Ruifeng Wang <ruifeng.wang at arm.com>
> > > > > >
> > > > > > The peek API allows fetching the next available object in the
> > > > > > ring without dequeuing it. This helps in scenarios where
> > > > > > dequeuing of objects depend on their value.
> > > > > >
> > > > > > Signed-off-by: Dharmik Thakkar <dharmik.thakkar at arm.com>
> > > > > > Signed-off-by: Ruifeng Wang <ruifeng.wang at arm.com>
> > > > > > Reviewed-by: Honnappa Nagarahalli
> > > > > > <honnappa.nagarahalli at arm.com>
> > > > > > Reviewed-by: Gavin Hu <gavin.hu at arm.com>
> > > > > > ---
> > > > > >  lib/librte_ring/rte_ring.h | 30
> > > > > > ++++++++++++++++++++++++++++++
> > > > > >  1 file changed, 30 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_ring/rte_ring.h
> > > > > > b/lib/librte_ring/rte_ring.h index 2a9f768a1..d3d0d5e18 100644
> > > > > > --- a/lib/librte_ring/rte_ring.h
> > > > > > +++ b/lib/librte_ring/rte_ring.h
> > > > > > @@ -953,6 +953,36 @@ rte_ring_dequeue_burst(struct rte_ring
> > > > > > *r, void
> > > > > **obj_table,
> > > > > >  				r->cons.single, available);  }
> > > > > >
> > > > > > +/**
> > > > > > + * Peek one object from a ring.
> > > > > > + *
> > > > > > + * The peek API allows fetching the next available object in
> > > > > > +the ring
> > > > > > + * without dequeuing it. This API is not multi-thread safe
> > > > > > +with respect
> > > > > > + * to other consumer threads.
> > > > > > + *
> > > > > > + * @param r
> > > > > > + *   A pointer to the ring structure.
> > > > > > + * @param obj_p
> > > > > > + *   A pointer to a void * pointer (object) that will be filled.
> > > > > > + * @return
> > > > > > + *   - 0: Success, object available
> > > > > > + *   - -ENOENT: Not enough entries in the ring.
> > > > > > + */
> > > > > > +__rte_experimental
> > > > > > +static __rte_always_inline int rte_ring_peek(struct rte_ring
> > > > > > +*r, void **obj_p)
> > > > >
> > > > > As it is not MT safe, then I think we need _sc_ in the name, to
> > > > > follow other rte_ring functions naming conventions
> > > > > (rte_ring_sc_peek() or so).
> > > > Agree
> > > >
> > > > >
> > > > > As a better alternative what do you think about introducing a
> > > > > serialized versions of DPDK rte_ring dequeue functions?
> > > > > Something like that:
> > > > >
> > > > > /* same as original ring dequeue, but:
> > > > >   * 1) move cons.head only if cons.head == const.tail
> > > > >   * 2) don't update cons.tail
> > > > >   */
> > > > > unsigned int
> > > > > rte_ring_serial_dequeue_bulk(struct rte_ring *r, void
> > > > > **obj_table, unsigned int n,
> > > > >                 unsigned int *available);
> > > > >
> > > > > /* sets both cons.head and cons.tail to cons.head + num */ void
> > > > > rte_ring_serial_dequeue_finish(struct rte_ring *r, uint32_t
> > > > > num);
> > > > >
> > > > > /* resets cons.head to const.tail value */ void
> > > > > rte_ring_serial_dequeue_abort(struct rte_ring *r);
> > > > >
> > > > > Then your dq_reclaim cycle function will look like that:
> > > > >
> > > > > const uint32_t nb_elt =  dq->elt_size/8 + 1; uint32_t avl, n;
> > > > > uintptr_t elt[nb_elt]; ...
> > > > >
> > > > > do {
> > > > >
> > > > >   /* read next elem from the queue */
> > > > >   n = rte_ring_serial_dequeue_bulk(dq->r, elt, nb_elt, &avl);
> > > > >   if (n == 0)
> > > > >       break;
> > > > >
> > > > >  /* wrong period, keep elem in the queue */  if
> > > > > (rte_rcu_qsbr_check(dr->v,
> > > > > elt[0]) != 1) {
> > > > >      rte_ring_serial_dequeue_abort(dq->r);
> > > > >      break;
> > > > >   }
> > > > >
> > > > >   /* can reclaim, remove elem from the queue */
> > > > >   rte_ring_serial_dequeue_finish(dr->q, nb_elt);
> > > > >
> > > > >    /*call reclaim function */
> > > > >   dr->f(dr->p, elt);
> > > > >
> > > > > } while (avl >= nb_elt);
> > > > >
> > > > > That way, I think even rte_rcu_qsbr_dq_reclaim() can be MT safe.
> > > > > As long as actual reclamation callback itself is MT safe of course.
> > > >
> > > > I think it is a great idea. The other writers would still be
> > > > polling for the current writer to update the tail or update the
> > > > head. This makes it a
> > > blocking solution.
> > >
> > > Yep, it is a blocking one.
> > >
> > > > We can make the other threads not poll i.e. they will quit
> > > > reclaiming if they
> > > see that other writers are dequeuing from the queue.
> > >
> > > Actually didn't think about that possibility, but yes should be
> > > possible to have _try_ semantics too.
> > >
> > > >The other  way is to use per thread queues.
> > > >
> > > > The other requirement I see is to support unbounded-size data
> > > > structures where in the data structures do not have a
> > > > pre-determined number of entries. Also, currently the defer queue
> > > > size is equal to the total
> > > number of entries in a given data structure. There are plans to
> > > support dynamically resizable defer queue. This means, memory
> > > allocation which will affect the lock-free-ness of the solution.
> > > >
> > > > So, IMO:
> > > > 1) The API should provide the capability to support different
> > > > algorithms -
> > > may be through some flags?
> > > > 2) The requirements for the ring are pretty unique to the problem
> > > > we have here (for ex: move the cons-head only if cons-tail is also
> > > > the same, skip
> > > polling). So, we should probably implement a ring with-in the RCU library?
> > >
> > > Personally, I think such serialization ring API would be useful for
> > > other cases too.
> > > There are few cases when user need to read contents of the queue
> > > without removing elements from it.
> > > Let say we do use similar approach inside TLDK to implement TCP
> > > transmit queue.
> > > If such API would exist in DPDK we can just use it straightway,
> > > without maintaining a separate one.
> > ok
> >
> > >
> > > >
> > > > From the timeline perspective, adding all these capabilities would
> > > > be difficult to get done with in 19.11 timeline. What I have here
> > > > satisfies my current needs. I suggest that we make provisions in
> > > > APIs now to
> > > support all these features, but do the implementation in the coming
> releases.
> > > Does this sound ok for you?
> > >
> > > Not sure I understand your suggestion here...
> > > Could you explain it a bit more - how new API will look like and
> > > what would be left for the future.
> > For this patch, I suggest we do not add any more complexity. If
> > someone wants a lock-free/block-free mechanism, it is available by creating
> per thread defer queues.
> >
> > We push the following to the future:
> > 1) Dynamically size adjustable defer queue. IMO, with this, the
> > lock-free/block-free reclamation will not be available (memory allocation
> requires locking). The memory for the defer queue will be allocated/freed in
> chunks of 'size' elements as the queue grows/shrinks.
> 
> That one is fine by me.
> In fact I don't know would be there a real use-case for dynamic defer queue
> for rcu var...
> But I suppose that's subject for another discussion.
Currently, the defer queue size is equal to the number of resources in the data structure. This is unnecessary as the reclamation is done regularly.
If a smaller queue size is used, the queue might get full (even after reclamation), in which case, the queue size should be increased.

> 
> >
> > 2) Constant size defer queue with lock-free and block-free reclamation
> > (single option). The defer queue will be of fixed length 'size'. If
> > the queue gets full an error is returned. The user could provide a 'size' equal
> to the number of elements in a data structure to ensure queue never gets full.
> 
> Ok so for 19.11 what enqueue/dequeue model do you plan to support?
> - MP/MC
> - MP/SC
> - SP/SC
Just SP/SC

> - non MT at all (only same single thread can do enqueue and dequeue)
If MT safe is required, one should use 1 defer queue per thread for now.

> 
> And related question:
> What additional rte_ring API you plan to introduce in that case?
> - None
> - rte_ring_sc_peek()
rte_ring_peek will be changed to rte_ring_sc_peek

> - rte_ring_serial_dequeue()
> 
> >
> > I would add a 'flags' field in rte_rcu_qsbr_dq_parameters and provide
> > 2 #defines, one for dynamically variable size defer queue and the other for
> constant size defer queue.
> >
> > However, IMO, using per thread defer queue is a much simpler way to
> achieve 2. It does not add any significant burden to the user either.
> >
> > >
> > > >
> > > > >
> > > > > > +{
> > > > > > +	uint32_t prod_tail = r->prod.tail;
> > > > > > +	uint32_t cons_head = r->cons.head;
> > > > > > +	uint32_t count = (prod_tail - cons_head) & r->mask;
> > > > > > +	unsigned int n = 1;
> > > > > > +	if (count) {
> > > > > > +		DEQUEUE_PTRS(r, &r[1], cons_head, obj_p, n, void *);
> > > > > > +		return 0;
> > > > > > +	}
> > > > > > +	return -ENOENT;
> > > > > > +}
> > > > > > +
> > > > > >  #ifdef __cplusplus
> > > > > >  }
> > > > > >  #endif
> > > > > > --
> > > > > > 2.17.1



More information about the dev mailing list