[PATCH 1/1] ring: safe partial ordering for head/tail update

Konstantin Ananyev konstantin.ananyev at huawei.com
Wed Sep 17 09:58:00 CEST 2025


To avoid information loss I combined my replies to two of Wathsala's emails into one.

> > > The function __rte_ring_headtail_move_head() assumes that the
> > > barrier (fence) between the load of the head and the load-acquire
> > > of the opposing tail guarantees the following: if a first thread
> > > reads tail and then writes head, and a second thread reads the new
> > > value of head and then reads tail, then it should observe the same
> > > (or a later) value of tail.
> > >
> > > This assumption is incorrect under the C11 memory model. If the
> > > barrier (fence) is intended to establish a total ordering of ring
> > > operations, it fails to do so. Instead, the current implementation
> > > only enforces a partial ordering, which can lead to unsafe
> > > interleavings. In particular, some partial orders can cause
> > > underflows in free slot or available element computations,
> > > potentially resulting in data corruption.
> >
> > Hmm... sounds exactly like the problem from the patch we discussed
> > earlier this year:
> > https://patchwork.dpdk.org/project/dpdk/patch/20250521111432.207936-4-konstantin.ananyev@huawei.com/
> > In two words:
> > "... a thread can see the 'latest' 'cons.head' value with the
> > 'previous' value for 'prod.tail', or vice versa.
> > In other words: the 'cons.head' value depends on 'prod.tail', so
> > before making the latest 'cons.head' value visible to other threads,
> > we need to ensure that the latest 'prod.tail' is also visible."
> > Is that the one?

> Yes, the behavior occurs under RCpc (LDAPR) but not under RCsc (LDAR),
> which is why we didn’t catch it earlier. A fuller explanation, with
> Herd7 simulations, is in the blog post linked in the cover letter.
>
> https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/when-a-barrier-does-not-block-the-pitfalls-of-partial-order

I see, so now it is reproducible with core rte_ring on real HW.
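
Just to make sure we are talking about the same scenario, here is a minimal
C11-style sketch of it as I understand it (illustrative names, not the actual
ring code; p0 first acts as producer, then as consumer, and p1 becomes the
next consumer):

#include <stdatomic.h>

atomic_uint prod_tail;	/* stands in for r->prod.tail */
atomic_uint cons_head;	/* stands in for r->cons.head */

void p0(void)
{
	/* producer role: publish one element */
	atomic_store_explicit(&prod_tail, 1, memory_order_release);

	/* consumer role: observe prod_tail, then advance cons_head */
	unsigned int pt = atomic_load_explicit(&prod_tail,
			memory_order_acquire);
	atomic_store_explicit(&cons_head, pt, memory_order_relaxed);
}

void p1(void)
{
	/* next consumer: reads cons_head first ... */
	unsigned int ch = atomic_load_explicit(&cons_head,
			memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	/* ... then prod_tail. The acquire fence after a relaxed load of
	 * cons_head does not synchronize with p0's relaxed store, so there
	 * is no happens-before edge back to p0's store-release of prod_tail:
	 * ch == 1 && pt == 0 is allowed by C11 (and observable with RCpc
	 * load-acquires), i.e. a cons.head "ahead" of the prod.tail that the
	 * free/available computation is based on.
	 */
	unsigned int pt = atomic_load_explicit(&prod_tail,
			memory_order_acquire);
	(void)ch; (void)pt;
}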

> >
> > > The issue manifests when a CPU first acts as a producer and later
> > > as a consumer. In this scenario, the barrier assumption may fail
> > > when another core takes the consumer role. A Herd7 litmus test in
> > > C11 can demonstrate this violation. The problem has not been widely
> > > observed so far because:
> > >   (a) on strong memory models (e.g., x86-64) the assumption holds,
> > >       and
> > >   (b) on relaxed models with RCsc semantics the ordering is still
> > >       strong enough to prevent hazards.
> > > The problem becomes visible only on weaker models, when load-acquire
> > > is implemented with RCpc semantics (e.g. some AArch64 CPUs which
> > > support the LDAPR and LDAPUR instructions).
> > >
> > > Three possible solutions exist:
> > >   1. Strengthen ordering by upgrading release/acquire semantics to
> > >      sequential consistency. This requires using seq-cst for stores,
> > >      loads, and CAS operations. However, this approach introduces a
> > >      significant performance penalty on relaxed-memory architectures.
> > >
> > >   2. Establish a safe partial order by enforcing a pair-wise
> > >      happens-before relationship between threads of the same role,
> > >      by converting the CAS and the preceding load of the head to
> > >      release and acquire respectively. This approach makes the
> > >      original barrier assumption unnecessary and allows its removal.
> >
> > For the sake of clarity, can you outline what the exact code changes
> > for approach #2 would be? Same as in that patch:
> > https://patchwork.dpdk.org/project/dpdk/patch/20250521111432.207936-4-konstantin.ananyev@huawei.com/
> > Or something different?
> 
> Sorry, I missed the latter half of your comment before.
> Yes, you proposed the same solution there.

Ok, thanks for the confirmation.
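
For the archive, here is a rough, self-contained sketch of what approach #2
boils down to, using plain C11 atomics and simplified names (struct ht,
move_head) rather than the DPDK code itself:

#include <stdatomic.h>
#include <stdint.h>

struct ht { _Atomic uint32_t head; _Atomic uint32_t tail; };

/* d = head/tail pair of this role, s = head/tail pair of the opposing role */
static uint32_t
move_head(struct ht *d, struct ht *s, uint32_t capacity, uint32_t max,
	  uint32_t *old_head, uint32_t *entries)
{
	uint32_t n;
	int success;

	/* acquire: pairs with the release CAS done by other threads of the
	 * same role, so the tail observation their head value was based on
	 * is inherited together with the head value
	 */
	*old_head = atomic_load_explicit(&d->head, memory_order_acquire);
	do {
		n = max;

		/* acquire: pairs with the store-release of s->tail */
		uint32_t stail = atomic_load_explicit(&s->tail,
				memory_order_acquire);

		*entries = capacity + stail - *old_head;
		if (n > *entries)
			n = *entries;
		if (n == 0)
			return 0;

		/* release on success: publish the new head together with the
		 * tail observation it was computed from; acquire on failure:
		 * *old_head is refreshed with the same guarantee
		 */
		success = atomic_compare_exchange_strong_explicit(&d->head,
				old_head, *old_head + n,
				memory_order_release, memory_order_acquire);
	} while (!success);

	return n;
}

With this pairing the standalone acquire fence is no longer needed, which, as
confirmed above, is the same direction as the earlier patch.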

> >
> >
> > >   3. Retain partial ordering but ensure only safe partial orders
> > >      are committed. This can be done by detecting underflow
> > >      conditions (producer < consumer) and quashing the update in
> > >      such cases. This approach makes the original barrier assumption
> > >      unnecessary and allows its removal.
> >
> > > This patch implements solution (3) for performance reasons.
> > >
> > > Signed-off-by: Wathsala Vithanage <wathsala.vithanage at arm.com>
> > > Signed-off-by: Ola Liljedahl <ola.liljedahl at arm.com>
> > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> > > Reviewed-by: Dhruv Tripathi <dhruv.tripathi at arm.com>
> > > ---
> > >  lib/ring/rte_ring_c11_pvt.h | 10 +++++++---
> > >  1 file changed, 7 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> > > index b9388af0da..e5ac1f6b9e 100644
> > > --- a/lib/ring/rte_ring_c11_pvt.h
> > > +++ b/lib/ring/rte_ring_c11_pvt.h
> > > @@ -83,9 +83,6 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> > >  		/* Reset n to the initial burst count */
> > >  		n = max;
> > > 
> > > -		/* Ensure the head is read before tail */
> > > -		rte_atomic_thread_fence(rte_memory_order_acquire);
> > > -
> > >  		/* load-acquire synchronize with store-release of ht->tail
> > >  		 * in update_tail.
> > >  		 */
> >
> > But then cons.head can be read before prod.tail (and vice versa),
> > right?
> 
> Right, we let it happen but eliminate any resulting states that are
> semantically incorrect at the end.

Two comments here:
1) I think it is probably safer to do the check like this:
   if (*entries > ring->capacity) ...
   (see the sketch after the commit reference below).
2) My concern is that without forcing a proper read ordering (cons.head
   first, then prod.tail) we re-introduce a window for all sorts of
   ABA-like problems. For reference, that load/load ordering was
   introduced by:

   commit 9bc2cbb007c0a3335c5582357ae9f6d37ea0b654 ("ring: guarantee
   load/load order in enqueue and dequeue")
   Author: Jia He <justin.he at arm.com>
   Date:   Fri Nov 10 03:30:42 2017 +0000
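
To illustrate 1), something along these lines on top of the proposed hunk
(just a sketch; 'ring_capacity' is a placeholder for the real ring capacity,
which, if I read the callers right, the consumer path does not pass in as the
'capacity' argument, so the exact variable to compare against needs care):

	*entries = (capacity + stail - *old_head);

	/*
	 * A correct computation can never exceed the ring capacity, so any
	 * larger value must come from a stale/unsafe stail observation and
	 * can be quashed, regardless of whether it looks negative as an int.
	 */
	if (unlikely(*entries > ring_capacity))
		*entries = 0;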

> >
> > > @@ -99,6 +96,13 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> > >  		 */
> > >  		*entries = (capacity + stail - *old_head);
> > > 
> > > +		/*
> > > +		 * Ensure the entries calculation was not based on a stale
> > > +		 * and unsafe stail observation that causes underflow.
> > > +		 */
> > > +		if ((int)*entries < 0)
> > > +			*entries = 0;
> > > +
> > >  		/* check that we have enough room in ring */
> > >  		if (unlikely(n > *entries))
> > >  			n = (behavior == RTE_RING_QUEUE_FIXED) ?
> > > --
> > > 2.43.0
> > >
> >
> 
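
FWIW, a small self-contained example of the underflow that the proposed check
quashes, with made-up numbers for the enqueue path (free entries =
capacity + cons.tail - prod.head):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t capacity = 1024;
	uint32_t old_head = 5000;	/* freshly observed prod.head */
	uint32_t stail = 3000;		/* stale cons.tail; the up-to-date value
					 * must be >= old_head - capacity = 3976 */

	uint32_t entries = capacity + stail - old_head;	/* wraps around */

	printf("entries = %" PRIu32 ", as int = %" PRId32 "\n",
			entries, (int32_t)entries);
	/* entries = 4294966320, as int = -976 */

	if ((int)entries < 0)	/* the check added by the patch */
		entries = 0;
	printf("after the check: %" PRIu32 "\n", entries);	/* 0 */

	return 0;
}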

