[PATCH 1/1] ring: safe partial ordering for head/tail update
Konstantin Ananyev
konstantin.ananyev at huawei.com
Wed Sep 24 17:03:35 CEST 2025
> > > > > > > Sure, I am talking about MT scenario.
> > > > > > > I think I already provided an example: DPDK mempool library (see below).
> > > > > > > In brief, it works like that:
> > > > > > > At init it allocates ring of N memory buffers and ring big enough to hold
> > > > > > > all of them.
> > > > > >
> > > > > > Sorry, I meant to say: "it allocates N memory buffers and ring big enough
> > > > > > to hold all of them".
> > > > > >
> > > > > > > Then it enqueues all allocated memory buffers into the ring.
> > > > > > > mempool_get - retrieves (dequeues) buffers from the ring.
> > > > > > > mempool_put - puts them back (enqueues) to the ring.
> > > > > > > get() might fail (ENOMEM), while put is expected to always succeed.
> > > > > But how does the thread which calls mempool_put() get hold of the memory
> > > > > buffers that were obtained using mempool_get() by some other thread?
> > > > > Or is this not the scenario you are worrying about?
> > > > > Is it rather that multiple threads independently call mempool_get() and
> > > > > then mempool_put() on their own buffers? And you are worried that a thread
> > > > > will fail to return (mempool_put) a buffer that it earlier allocated
> > > > > (mempool_get)? We could create a litmus test for that.
> > > >
> > > >
> > > > Both scenarios are possible.
> > > > For Run-To-Completion model each thread usually does: allocate/use/free a
> > > > group of mbufs.
> > > > For pipeline model one thread can allocate a bunch of mbufs, then pass them
> > > > to another thread (via another ring for example) for further processing and
> > > > then releasing.
> > > In the pipeline model, if the last stage (thread) frees (enqueues) buffers
> > > onto some ring buffer and the first stage (thread) allocates (dequeues)
> > > buffers from the same ring buffer but there isn't any other type of
> > > synchronization between the threads, we can never guarantee that the first
> > > thread will be able to dequeue buffers because it doesn't know whether the
> > > last thread has enqueued any buffers.
> >
> >
> > Yes, as I said above - for mempool use-case: dequeue can fail, enqueue should
> > always succeed.
> > The closest analogy: malloc() can fail, free() should never fail.
> >
> >
> > >
> > > However, enqueue ought to always succeed. We should be able to create a
> > > litmus test for that.
> > > Ring 1 is used as mempool, it initially contains capacity elements (full).
> > > Ring 2 is used as pipe between stages 1 and 2, it initially contains 0
> > > elements (empty).
> > > Thread 1 allocates/dequeues a buffer from ring 1.
> > > Thread 1 enqueues that buffer onto ring 2.
> > > Thread 2 dequeues a buffer from ring 2.
> > > Thread 2 frees/enqueues that buffer onto ring 1. <<< this must succeed!
> > > Does this reflect the situation you worry about?
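To make the scenario concrete, it could be sketched roughly as below.
This is illustrative only - the stage functions and error handling are
made up; the rte_ring calls are the public DPDK ones:

    #include <stdlib.h>
    #include <rte_ring.h>

    /* ring1: used as mempool, starts full (capacity elements).
     * ring2: used as pipe between the stages, starts empty. */
    struct rte_ring *ring1, *ring2;

    /* Stage 1: allocate a buffer from the mempool ring and
     * pass it downstream via the pipe ring. */
    static void stage1(void)
    {
        void *buf;

        if (rte_ring_dequeue(ring1, &buf) != 0)
            return;                 /* allocation may fail */
        while (rte_ring_enqueue(ring2, buf) != 0)
            ;                       /* retry until the pipe has room */
    }

    /* Stage 2: take a buffer from the pipe ring and
     * free it back to the mempool ring. */
    static void stage2(void)
    {
        void *buf;

        if (rte_ring_dequeue(ring2, &buf) != 0)
            return;                 /* pipe may be empty, that's ok */
        if (rte_ring_enqueue(ring1, buf) != 0)
            abort();                /* <<< this must never happen */
    }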
> >
> >
> > This is one of the possible scenarios.
> > As I said above - mempool_put() is expected to always be able to enqueue an
> > element to the ring.
> > TBH, I am not sure what you are trying to prove with the litmus test.
> With a litmus test, we can prove that a specific situation can or cannot occur when
> executing the specified memory accesses in multiple threads. By experimenting with
> different memory access sequences and orderings, we can find the most relaxed
> sequence that still prohibits undesired results.
>
> So with a litmus test, I could prove that enqueue in thread 2 above will succeed or
> alternatively, may not succeed (which is undesired). And I could find out what
> orderings are required to guarantee success.
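For reference, a litmus test for the herd7 simulator looks roughly like
the classic message-passing example below (names are illustrative; the
"exists" clause describes the outcome being checked - with this
release/acquire pairing it is forbidden):

    C MP+rel+acq

    {}

    P0(atomic_int *flag, int *data)
    {
        *data = 1;                                            /* payload */
        atomic_store_explicit(flag, 1, memory_order_release); /* publish */
    }

    P1(atomic_int *flag, int *data)
    {
        int r0 = atomic_load_explicit(flag, memory_order_acquire);
        int r1 = *data;
    }

    exists (1:r0=1 /\ 1:r1=0)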
I understand that part; what I am trying to say is that this is just one of the
possible scenarios.
We could have multiple threads, multiple allocators, and multiple releasers,
and/or threads that perform both roles.
Not to mention that this is just one library (mempool); there are probably some
others within DPDK that have similar expectations, not to mention third-party code.
I am not trying to stop you from running whatever tests you like,
I am just a bit skeptical that they will cover all possible scenarios.
> > Looking at the changes you proposed:
> >
> >
> > +	/*
> > +	 * Ensure the entries calculation was not based on a stale
> > +	 * and unsafe stail observation that causes underflow.
> > +	 */
> > +	if ((int)*entries < 0)
> > +		*entries = 0;
> > +
> >
> > 	/* check that we have enough room in ring */
> > 	if (unlikely(n > *entries))
> > 		n = (behavior == RTE_RING_QUEUE_FIXED) ?
> > 			0 : *entries;
> >
> > 	*new_head = *old_head + n;
> > 	if (n == 0)
> > 		return 0;
> >
> >
> > It is clear that with these changes enqueue/dequeue might fail even
> > when there are available entries in the ring.
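To illustrate with made-up numbers: say capacity = 4, cons.tail has
advanced to 100 and prod.head is 103, so there really is
4 + 100 - 103 = 1 free slot. If the producer observes a stale cons.tail
of 98, it computes 4 + 98 - 103 = -1 free entries; the proposed check
clamps that to 0 and the enqueue fails, even though a slot is available.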
> Without explicit synchronization between the threads, they will still race and
> a consumer cannot be guaranteed to succeed with its dequeue operation, e.g.
> the dequeue operation could occur before the (supposedly) matching enqueue
> operation in another thread.
>
> Using acquire/release for all ring buffer metadata accesses doesn't inform the
> thread that its operation is guaranteed to succeed, it just ensures any data is
> transferred (or passed ownership) in a safe way (establishes a happens-before
> relation). But the ring buffer implementation itself cannot ensure that the
> enqueue in thread P happens before the dequeue in thread C. The dequeue
> can still fail and return 0 elements.
>
> How do you define "there are available entries in the ring"? Just reading one
> metadata variable doesn't tell you this. And I assume users usually don't access
> internal ring buffer metadata. So how does a thread know there are available
> entries in the ring that can be dequeued?
As I said - dequeue can fail, that's ok.
Enqueue never should.
As I explained, that is dictated by the mempool lib design.
We create a ring with capacity=N, then put N objects into it.
Then we just dequeue/enqueue these elements from/to the ring.
We never try to put any other elements into the ring, so there always
has to be enough space in the ring to return an object back to it.
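As a minimal sketch of that design (sizes and wrapper names are
illustrative; the rte_ring calls are the public DPDK ones):

    #include <stdlib.h>
    #include <rte_ring.h>

    #define POOL_SIZE 1024                      /* illustrative N */

    static struct rte_ring *pool;

    static void pool_init(void)
    {
        unsigned int i;

        /* ring with capacity for exactly POOL_SIZE objects */
        pool = rte_ring_create("mempool", POOL_SIZE,
                               SOCKET_ID_ANY, RING_F_EXACT_SZ);
        for (i = 0; i < POOL_SIZE; i++)
            rte_ring_enqueue(pool, malloc(64)); /* fill it up */
    }

    /* get() may fail: the pool can be temporarily empty */
    static void *mp_get(void)
    {
        void *obj;

        return rte_ring_dequeue(pool, &obj) == 0 ? obj : NULL;
    }

    /* put() must never fail: only the POOL_SIZE original objects
     * ever come back, so there is always room for them */
    static void mp_put(void *obj)
    {
        if (rte_ring_enqueue(pool, obj) != 0)
            abort();                            /* would be a ring bug */
    }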
> It seems to me that your perspective is still very much that of sequential
> consistency and total order. But even so, you need synchronization (usually
> based on memory accesses but there are other ways, e.g. using system calls)
> to ensure that operation D in thread 2 is only initiated after operation E in
> thread 1 has completed. Otherwise, the operations will race and the outcome
> is not guaranteed.
>
> - Ola
>
> > One simple way, which would probably introduce a loop instead of the 'if':
> > keep re-reading the head and tail values till we get valid results.
> > But again, I am not sure it is a good way.
> > Konstantin
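Purely as an untested sketch of that loop idea (reusing the variable
names from the snippet above and assuming the rte_stdatomic.h helpers):

    /* instead of clamping a negative result to 0, re-read the
     * opposite index until the computed value makes sense */
    do {
        stail = rte_atomic_load_explicit(&s->tail,
                                         rte_memory_order_acquire);
        *entries = capacity + stail - *old_head;
    } while (unlikely((int)*entries < 0));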
>