[PATCH 1/1] ring: safe partial ordering for head/tail update
    Ola Liljedahl 
    Ola.Liljedahl at arm.com
       
    Wed Sep 24 15:28:50 CEST 2025
    
    
  
> On 2025-09-24, 13:51, "Konstantin Ananyev" <konstantin.ananyev at huawei.com <mailto:konstantin.ananyev at huawei.com>> wrote:
>
> > > > > > Sure, I am talking about MT scenario.
> > > > > > I think I already provided an example: DPDK mempool library (see below).
> > > > > > In brief, It works like that:
> > > > > > At init it allocates ring of N memory buffers and ring big enough to hold all of
> > > > them.
> > > > >
> > > > > Sorry, I meant to say: "it allocates N memory buffers and ring big enough to
> > hold
> > > > all of them".
> > > > >
> > > > > > Then it enqueues all allocated memory buffers into the ring.
> > > > > > mempool_get - retrieves (dequeues) buffers from the ring.
> > > > > > mempool_put - puts them back (enqueues) to the ring
> > > > > > get() might fail (ENOMEM), while put is expected to always succeed.
> > > > But how does the thread which calls mempool_put() get hold of the memory
> > buffers
> > > > that
> > > > were obtained using mempool_get() by some other thread? Or this is not the
> > > > scenario you
> > > > are worrying about?
> > > > Is it rather that multiple threads independently call mempool_get() and then
> > > > mempool_put()
> > > > on their own buffers? And you are worried that a thread will fail to return
> > > > (mempool_put) a
> > > > buffer that it earlier allocated (mempool_get)? We could create a litmus test for
> > > > that.
> > >
> > >
> > > Both scenarios are possible.
> > > For Run-To-Completion model each thread usually does: allocate/use/free group
> > of mbufs.
> > > For pipleline model one thread can allocate bunch of mbufs, then pass them to
> > other
> > thread (via another ring for example) for further processing and then releasing.
> > In the pipeline model, if the last stage (thread) frees (enqueues) buffers onto some
> > ring buffer
> > and the first stage (thread) allocates (dequeues) buffers from the same ring buffer
> > but there
> > isn't any other type of synchronization between the threads, we can never guarantee
> > that
> > the first thread will be able to dequeue buffers because it doesn't know whether the
> > last
> > thread has enqueued any buffers.
>
>
> Yes, as I said above - for mempool use-case: dequeue can fail, enqueue should always succeed.
> The closest analogy: malloc() can fail, free() should never fail.
>
>
> >
> > However, enqueue ought to always succeed. We should be able to create a litmus
> > test for that.
> > Ring 1 is used as mempool, it initially contains capacity elements (full).
> > Ring 2 is used as pipe between stages 1 and 2, it initially contains 0 elements (empty).
> > Thread 1 allocates/dequeues a buffer from ring 1.
> > Thread 1 enqueues that buffer onto ring 2.
> > Thread 2 dequeues a buffer from ring 2.
> > Thread 2 frees/enqueues that buffer onto ring 1. <<< this must succeed!
> > Does this reflect the situation you worry about?
>
>
> This is one of the possible scenarios.
> As I said above - mempool_put() is expected to always be able to enqueue element to the ring.
> TBH, I am not sure what you are trying to prove with the litmus test.
With a litmus test, we can prove that a specific situation can or cannot occur when
executing the specified memory accesses in multiple threads. By experimenting with
different memory access sequences and orderings, we can find the most relaxed
sequence that still prohibits undesired results.
So with a litmus test, I could prove that enqueue in thread 2 above will succeed or
alternatively, may not succeed (which is undesired). And I could find out what orderings
are required to guarantee success.
> Looking at the changes you proposed:
>
>
> + /*
> + * Ensure the entries calculation was not based on a stale
> + * and unsafe stail observation that causes underflow.
> + */
> + if ((int)*entries < 0)
> + *entries = 0;
> +
>
>
> /* check that we have enough room in ring */
> if (unlikely(n > *entries))
> n = (behavior == RTE_RING_QUEUE_FIXED) ?
> 0 : *entries;
>
>
> *new_head = *old_head + n;
> if (n == 0)
> return 0;
>
>
> It is clear that with these changes enqueue/dequeue might fail even
> when there are available entries in the ring.
Without explicit synchronization between the threads, they will still race and
a consumer cannot be guaranteed to succeed with its dequeue operation, e.g.
the dequeue operation could occur before the (supposedly) matching enqueue
operation in another thread.
Using acquire/release for all ring buffer metadata accesses doesn't inform the
thread that its operation is guaranteed to succeed, it just ensures any data is
transferred (or passed ownership) in a safe way (establishes a happens-before
relation). But the ring buffer implementation itself cannot ensure that the
enqueue in thread P happens before the dequeue in thread C. The dequeue
can still fail and return 0 elements.
How do you define "there are available entries in the ring"? Just reading one
metadata variable doesn't tell you this. And I assume users usually don't access
internal ring buffer metadata. So how does a thread know there are available
entries in the ring that can be dequeued?
It seems to me that your perspective is still very much that of sequential
consistency and total order. But even so, you need synchronization (usually
based on memory accesses but there are other ways, e.g. using system calls)
to ensure that operation D in thread 2 is only initiated after operation E in
thread 1 has completed. Otherwise, the operations will race and the outcome
is not guaranteed.
- Ola
> One simple way that probably will introduce a loop instead of 'if':
> (keep reading head and tail values till we get a valid results)
> but again I am not sure it is a good way.
> Konstantin
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
    
    
More information about the dev
mailing list