[dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue

Bruce Richardson bruce.richardson at intel.com
Tue Mar 29 18:04:42 CEST 2016


On Tue, Mar 29, 2016 at 05:29:12PM +0200, Olivier MATZ wrote:
> Hi,
> 
> 
> On 03/29/2016 10:54 AM, Bruce Richardson wrote:
> >On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote:
> >>Hi Olivier,
> >>
> >>We could have two threads (running on different cores in the general
> >>case) that both succeed the cmpset operation. In the dequeue path,
> >>when n == 0, then cons_next == cons_head, and cmpset will always
> >>succeed. Now, if they both see an old r->cons.tail value from a
> >>previous dequeue, they can get stuck in the while
> >
> >Hi,
> >
> >I don't see how threads reading an "old r->cons.tail" value is even possible.
> >The head and tail pointers on the ring are marked in the code as volatile, so
> >all reads and writes to those values are always done from memory and not cached
> >in registers. No deadlock should be possible on that while loop, unless a
> >process crashes in the middle of a ring operation. Each thread which updates
> >the head pointer from x to y, is responsible for updating the tail pointer in
> >a similar manner. The loop ensures the tail updates are in the same order as the
> >head updates.
> >
> >If you believe deadlock is possible, can you outline the sequence of operations
> >which would lead to such a state, because I cannot see how it could occur without
> >a crash inside one of the threads.
> 
> I think the deadlock Lazaros describes could occur in the following
> condition:
> 
> current ring state
>  r->prod.head = 0
>  r->prod.tail = 0
> 
> core 0                               core 1
> ====================================================================
> enqueue 0 object
>  cmpset(&r->prod.head, 0, 0)
>  core 0 is interrupted here
>                                      enqueue 1 object
>                                       cmpset(&r->prod.head, 0, 1)
>                                       copy the objects in box 0
>                                       while (r->prod.tail != prod_head))
>                                       r->prod.tail = prod_next
>  copy 0 object (-> nothing to do)
>  while (r->prod.tail != prod_head))
>     <loop forever>
> 
> 
> I think this issue is indeed fixed by Lazaros' patch (I missed it
> in previous review). However, I don't think this deadlock could
> happen once we avoided the (n == 0) case.
>
Yes, I agree with your scenario. However, I thought the issue was occuring with
non-zero updates, but I may well be mistaken.
If it's fixed now, all good... :-)

/Bruce


More information about the dev mailing list