patch 'ring: establish safe partial order in default mode' has been queued to stable release 22.11.11
Wathsala Vithanage
wathsala.vithanage at arm.com
Wed Nov 12 20:12:30 CET 2025
Hi Luca,
On 11/12/25 10:53, luca.boccassi at gmail.com wrote:
> Hi,
>
> FYI, your patch has been queued to stable release 22.11.11
>
> Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
> It will be pushed if I get no objections before 11/14/25. So please
> shout if anyone has objections.
>
The backport looked like it needed some work, so I just posted a patch.
Let me know if you need any help with RTS and HTS patches as well.
Thanks
--wathsala
> Also note that after the patch there's a diff of the upstream commit vs the
> patch applied to the branch. This will indicate if there was any rebasing
> needed to apply to the stable branch. If there were code changes for rebasing
> (ie: not only metadata diffs), please double check that the rebase was
> correctly done.
>
> Queued patches are on a temporary branch at:
> https://github.com/bluca/dpdk-stable
>
> This queued commit can be viewed at:
> https://github.com/bluca/dpdk-stable/commit/8e64e64659fe628f6b7ce903b67a6c8d271da524
>
> Thanks.
>
> Luca Boccassi
>
> ---
> From 8e64e64659fe628f6b7ce903b67a6c8d271da524 Mon Sep 17 00:00:00 2001
> From: Wathsala Vithanage <wathsala.vithanage at arm.com>
> Date: Tue, 11 Nov 2025 18:37:17 +0000
> Subject: [PATCH] ring: establish safe partial order in default mode
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> [ upstream commit a4ad0eba9def1d1d071da8afe5e96eb2a2e0d71f ]
>
> The function __rte_ring_headtail_move_head() assumes that the barrier
> (fence) between the load of the head and the load-acquire of the
> opposing tail guarantees the following: if a first thread reads tail
> and then writes head, and a second thread reads the new value of head
> and then reads tail, then the second thread should observe the same
> (or a later) value of tail.
>
> This assumption is incorrect under the C11 memory model. If the barrier
> (fence) is intended to establish a total ordering of ring operations,
> it fails to do so. Instead, the current implementation only enforces a
> partial ordering, which can lead to unsafe interleavings. In particular,
> some partial orders can cause underflows in free slot or available
> element computations, potentially resulting in data corruption.
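>
> For illustration, here is a minimal sketch of the producer-side
> free-slot computation (simplified; the helper name free_slots is
> illustrative, not the actual DPDK source). Under a safe ordering,
> old_head - cons_tail never exceeds capacity; if a stale cons_tail is
> observed, the unsigned arithmetic wraps and reports an enormous
> number of free slots:
>
>     #include <stdint.h>
>
>     /* Sketch of the computation in __rte_ring_move_prod_head().
>      * If cons_tail is stale relative to old_head, the expression
>      * underflows and the producer believes nearly 2^32 slots are
>      * free, allowing it to overwrite unconsumed elements. */
>     static inline uint32_t
>     free_slots(uint32_t capacity, uint32_t cons_tail, uint32_t old_head)
>     {
>         return capacity + cons_tail - old_head;
>     }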
>
> The issue manifests when a CPU first acts as a producer and later as a
> consumer. In this scenario, the barrier assumption may fail when another
> core takes the consumer role. A Herd7 litmus test in C11 can demonstrate
> this violation. The problem has not been widely observed so far because:
> (a) on strong memory models (e.g., x86-64) the assumption holds, and
> (b) on relaxed models with RCsc semantics the ordering is still strong
> enough to prevent hazards.
> The problem becomes visible only on weaker models, when load-acquire is
> implemented with RCpc semantics (e.g. some AArch64 CPUs which support
> the LDAPR and LDAPUR instructions).
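>
> As a rough C11 rendering of that litmus pattern (hypothetical,
> simplified names; the actual Herd7 source is not reproduced here):
> t0 is a consumer releasing a new tail, t1 reads tail and publishes
> head through the old fence sequence, and t2 reads head and then
> tail. C11 permits the outcome h == 1 && t == 0, i.e. t2 observes
> t1's head update but a stale tail, because t1's relaxed head store
> carries no release semantics for t2's loads to synchronize with:
>
>     #include <stdatomic.h>
>     #include <stdio.h>
>     #include <threads.h>
>
>     static atomic_uint tail, head;
>
>     static int t0(void *arg)   /* consumer: publishes tail = 1 */
>     {
>         (void)arg;
>         atomic_store_explicit(&tail, 1, memory_order_release);
>         return 0;
>     }
>
>     static int t1(void *arg)   /* reads tail, then writes head */
>     {
>         (void)arg;
>         unsigned t = atomic_load_explicit(&tail, memory_order_acquire);
>         atomic_thread_fence(memory_order_acquire);   /* old barrier */
>         atomic_store_explicit(&head, t, memory_order_relaxed);
>         return 0;
>     }
>
>     static int t2(void *arg)   /* reads head, then tail */
>     {
>         (void)arg;
>         unsigned h = atomic_load_explicit(&head, memory_order_relaxed);
>         atomic_thread_fence(memory_order_acquire);
>         unsigned t = atomic_load_explicit(&tail, memory_order_acquire);
>         if (h == 1 && t == 0)  /* allowed by the C11 model */
>             printf("saw new head but stale tail\n");
>         return 0;
>     }
>
>     int main(void)
>     {
>         thrd_t a, b, c;
>         thrd_create(&a, t0, NULL);
>         thrd_create(&b, t1, NULL);
>         thrd_create(&c, t2, NULL);
>         thrd_join(a, NULL);
>         thrd_join(b, NULL);
>         thrd_join(c, NULL);
>         return 0;
>     }
>
> On x86-64, or where acquire has RCsc semantics, this outcome does
> not occur in practice; on RCpc AArch64 implementations it can.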
>
> Three possible solutions exist:
> 1. Strengthen ordering by upgrading release/acquire semantics to
> sequential consistency. This requires using seq-cst for stores,
> loads, and CAS operations. However, this approach introduces a
> significant performance penalty on relaxed-memory architectures.
>
> 2. Establish a safe partial order by enforcing a pair-wise
> happens-before relationship between threads of the same role,
> upgrading the CAS and the preceding load of the head to release
> and acquire semantics respectively (see the sketch below). This
> approach makes the original barrier assumption unnecessary and
> allows its removal.
>
> 3. Retain partial ordering but ensure only safe partial orders are
> committed. This can be done by detecting underflow conditions
> (producer < consumer) and quashing the update in such cases.
> This approach makes the original barrier assumption unnecessary
> and allows its removal.
>
> This patch implements solution (2) to preserve the “enqueue always
> succeeds” contract expected by dependent libraries (e.g., mempool).
> While solution (3) offers higher performance, adopting it now would
> break that assumption.
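>
> Distilled, solution (2) has the following shape (a sketch with
> simplified names, not the patch itself): the release on the winning
> CAS of head publishes that thread's view of the opposing tail, and
> the next thread of the same role inherits it through the acquire
> load of head (or the acquire on CAS failure), so the old thread
> fence can be dropped:
>
>     #include <stdatomic.h>
>
>     static atomic_uint head, opp_tail;
>
>     /* Sketch: advance head by n; the free/available-entry check
>      * against opp_tail is elided for brevity. */
>     static unsigned move_head(unsigned n)
>     {
>         unsigned old, tail;
>
>         old = atomic_load_explicit(&head,
>                                    memory_order_acquire);      /* A0 */
>         do {
>             tail = atomic_load_explicit(&opp_tail,
>                                         memory_order_acquire); /* A1 */
>             (void)tail;
>         } while (!atomic_compare_exchange_strong_explicit(&head,
>                 &old, old + n,
>                 memory_order_release,     /* R1: publish to peers */
>                 memory_order_acquire));   /* A2: retry, fresh view */
>         return old;
>     }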
>
> Fixes: 49594a63147a9 ("ring/c11: relax ordering for load and store of the head")
>
> Signed-off-by: Wathsala Vithanage <wathsala.vithanage at arm.com>
> Signed-off-by: Ola Liljedahl <ola.liljedahl at arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> Reviewed-by: Dhruv Tripathi <dhruv.tripathi at arm.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev at huawei.com>
> Tested-by: Konstantin Ananyev <konstantin.ananyev at huawei.com>
> ---
> lib/ring/rte_ring_c11_pvt.h | 37 +++++++++++++++++++++++++++++--------
> 1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index f895950df4..5c04a001e1 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -24,6 +24,11 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> if (!single)
> rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
>
> + /*
> + * R0: Establishes a synchronizing edge with the load-acquire of tail at A1.
> + * Ensures that this thread's memory effects on the ring elements
> + * array are observed by a different thread of the other type.
> + */
> __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
> }
>
> @@ -61,16 +66,23 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
> unsigned int max = n;
> int success;
>
> - *old_head = __atomic_load_n(&r->prod.head, __ATOMIC_RELAXED);
> + /*
> + * A0: Establishes a synchronizing edge with R1.
> + * Ensures that this thread observes the same value of
> + * the opposing tail (cons_tail) that was observed by
> + * the thread that last updated r->prod.head.
> + * If not, an unsafe partial order may ensue.
> + */
> + *old_head = __atomic_load_n(&r->prod.head, __ATOMIC_ACQUIRE);
> do {
> /* Reset n to the initial burst count */
> n = max;
>
> - /* Ensure the head is read before tail */
> - __atomic_thread_fence(__ATOMIC_ACQUIRE);
> -
> - /* load-acquire synchronize with store-release of ht->tail
> - * in update_tail.
> + /*
> + * A1: Establishes a synchronizing edge with R0.
> + * Ensures that the other thread's memory effects on
> + * the ring elements array are observed by the time
> + * this thread observes its tail update.
> */
> cons_tail = __atomic_load_n(&r->cons.tail,
> __ATOMIC_ACQUIRE);
> @@ -170,10 +182,19 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> r->cons.head = *new_head, success = 1;
> else
> /* on failure, *old_head will be updated */
> + /*
> + * R1/A2.
> + * R1: Establishes a synchronizing edge with A0 of a
> + * different thread.
> + * A2: Establishes a synchronizing edge with R1 of a
> + * different thread, to observe the same value of the
> + * opposing tail observed by that thread on CAS failure
> + * (so the retry uses an updated *old_head).
> + */
> success = __atomic_compare_exchange_n(&r->cons.head,
> old_head, *new_head,
> - 0, __ATOMIC_RELAXED,
> - __ATOMIC_RELAXED);
> + 0, __ATOMIC_RELEASE,
> + __ATOMIC_ACQUIRE);
> } while (unlikely(success == 0));
> return n;
> }