[dpdk-dev] [RFC] ring: count and empty optimizations
    Ananyev, Konstantin 
    konstantin.ananyev at intel.com
       
    Thu Apr 30 17:36:07 CEST 2020
    
    
  
> 
> > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli at arm.com]
> > Sent: Thursday, April 30, 2020 3:12 AM
> >
> > <snip>
> >
> > >
> > > Hi Morten,
> > >
> > > On Tue, Apr 28, 2020 at 03:53:15PM +0200, Morten Brørup wrote:
> > > > Olivier (maintainer of the Ring),
> > >
> > > I'm not anymore, CC'ing Konstantin and Honnappa.
> > >
> > > > I would like to suggest a couple of minor optimizations to the ring
> > library.
> > > >
> > > >
> > > > 1. Testing if the ring is empty is as simple as comparing the
> > producer and
> > > consumer pointers:
> > > >
> > > > static inline int
> > > > rte_ring_empty(const struct rte_ring *r) {
> > > > -	return rte_ring_count(r) == 0;
> > > > +	uint32_t prod_tail = r->prod.tail;
> > > > +	uint32_t cons_tail = r->cons.tail;
> > > > +	return cons_tail == prod_tail;
> > > > }
> > > >
> > > > In theory, this optimization reduces the number of potential cache
> > misses
> > > from 3 to 2 by not having to read r->mask in rte_ring_count().
> > >
> > > This one looks correct to me.
> > >
> > >
> > > > 2. It is not possible to enqueue more elements than the capacity of
> > a ring,
> > > so the count function does not need to test if the capacity is
> > exceeded:
> > > >
> > > > static inline unsigned
> > > > rte_ring_count(const struct rte_ring *r) {
> > > > 	uint32_t prod_tail = r->prod.tail;
> > > > 	uint32_t cons_tail = r->cons.tail;
> > > > 	uint32_t count = (prod_tail - cons_tail) & r->mask;
> > > > -	return (count > r->capacity) ? r->capacity : count;
> > > > + 	return count;
> > > > }
> > > >
> > > > I cannot even come up with a race condition in this function where
> > the
> > > count would exceed the capacity. Maybe I missed something?
> > >
> > > Since there is no memory barrier, the order in which prod_tail and
> > cons_tail
> > > are fetched is not guaranteed. Or the thread could be interrupted by
> > the
> > > kernel in between.
> > The '__rte_ring_move_prod_head' function ensures that the distance
> > between prod.head and cons.tail is always within the capacity
> > irrespective of whether the consumers/producers are sleeping.
> 
> Yes, this is exactly what I was thinking.
> 
> The tails are the pointers after any updates, which is shown very nicely in the documentation.
> And certainly prod.tail will not move further ahead than prod.head.
> 
> So it makes sense using the tails outside the functions that move the pointers.
> 
> Olivier found the race I couldn't find:
> 1. The thread calls rte_ring_count(), and since there is no memory barrier it reads cons.tail, and has not yet read prod.tail.
> 2. Other threads use the ring simultaneously. A consumer thread moves cons.tail ahead and a producer thread then moves prod.tail ahead.
> Note: Without first moving cons.tail ahead, prod.tail cannot move too far ahead.
> 3. The thread proceeds to read prod.tail. Now the count is completely incorrect, and may even exceed the capacity.
> 
> Olivier pointed out that this could happen if the rte_ring_count thread is interrupted by the kernel, and I agree. However, intuitively I don't
> think that it can happen in a non-EAL thread, because the consumer thread needs to finish moving cons.tail before the producer thread can
> move prod.tail too far ahead. And by then the rte_ring_count thread has had plenty of time to read prod.tail. But it could happen in theory.
> 
> > > This function may probably return invalid results in MC/MP cases.
> > > We just ensure that the result is between 0 and r->capacity.
> 
> So should we update the documentation to say that it might return an incorrect count (if there is a race), or should we fix the function to
> always provide a correct value?
As long as you invoke rte_ring_count() while there are other
active producers/consumers for that ring -
it's return value can always be outdated, and not reflect current ring state.
So I think just updating the doc is enough.
> 
> Furthermore, the same race condition probably affects rte_ring_empty() similarly, even in my improved version.
> 
> And do these functions need to support non-EAL threads? I don't think so. What do you think?
Not sure why you differ EAL and non EAL threads here.
The only difference between them from scheduler point of view -
EAL threads have cpu affinity set by rte_eal_init().
But nothing prevents user from setting/updating cpu affinity
for any thread in his process in a way he likes.
    
    
More information about the dev
mailing list