|SUCCESS| [5/5] ring: use C11 for single thread move head
dpdklab at iol.unh.edu
dpdklab at iol.unh.edu
Tue Jun 2 22:56:35 CEST 2026
Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/164824
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-06-02
I'll review this DPDK patch series focusing on correctness, coding style, API requirements, and guideline violations.
---
**Patch 1/5: ring: split single thread vs multi-thread cases**
git sha: 20260602171552.686349-2-stephen at networkplumber.org
Author: Stephen Hemminger <stephen at networkplumber.org>
Subject: [PATCH 1/5] ring: split single thread vs multi-thread cases
This patch splits the move head function into separate single-threaded
and multi-threaded variants for cleaner code organization.
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index 07b6efc416..5afc14dec9 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
[ ... ]
> @@ -72,14 +71,77 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> */
> static __rte_always_inline unsigned int
> -__rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> +__rte_ring_headtail_move_head_st(struct rte_ring_headtail *d,
> const struct rte_ring_headtail *s, uint32_t capacity,
> - unsigned int is_st, unsigned int n,
> + unsigned int n,
> enum rte_ring_queue_behavior behavior,
> uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
> {
> uint32_t stail;
> - int success;
Does removing this variable declaration cause an issue? The new code no
longer uses the success variable at all in the ST path.
> + /* Single producer: only this thread writes d->head,
> + * so a relaxed load is sufficient.
> + */
> + *old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_relaxed);
> +
> + /* Acquire pairs with the consumer's release-store of tail in __rte_ring_update_tail,
> + * ensuring the consumer's ring-element reads are complete before
> + * we observe the updated tail.
> + */
> + stail = rte_atomic_load_explicit(&s->tail, rte_memory_order_acquire);
These comment lines exceed 79 characters. Can these be wrapped to comply
with coding standards?
> + if (n > 0) {
> + *new_head = *old_head + n;
> + rte_atomic_store_explicit(&d->head, *new_head, rte_memory_order_relaxed);
> + }
Does this code need a memory order constraint stronger than relaxed for
the ST case? The original code had different behavior.
> @@ -120,25 +182,21 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> return 0;
>
> *new_head = *old_head + n;
> - if (is_st) {
> - d->head = *new_head;
> - success = 1;
> - } else
> - /* on failure, *old_head is updated */
The removed comment about *old_head being updated on failure was
informative. Does removing this make the code less clear?
> diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
> index affd2d5ba7..c044b0824f 100644
> --- a/lib/ring/rte_ring_generic_pvt.h
> +++ b/lib/ring/rte_ring_generic_pvt.h
[ ... ]
> @@ -105,15 +103,70 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> return 0;
>
> *new_head = *old_head + n;
> - if (is_st) {
> - d->head = *new_head;
> - success = 1;
> - } else
> - success = rte_atomic32_cmpset(
> - (uint32_t *)(uintptr_t)&d->head,
> - *old_head, *new_head);
> + success = rte_atomic32_cmpset(
> + (uint32_t *)(uintptr_t)&d->head,
> + *old_head, *new_head);
> } while (unlikely(success == 0));
Does this condition need updating to match the new code pattern? In the
C11 version, you changed "success == 0" to "!success".
---
**Patch 2/5: ring: use GCC builtin as alternative to rte_atomic32**
git sha: 20260602171552.686349-3-stephen at networkplumber.org
Author: Stephen Hemminger <stephen at networkplumber.org>
Subject: [PATCH 2/5] ring: use GCC builtin as alternative to rte_atomic32
This patch replaces deprecated rte_atomic32 with GCC builtin atomics for
performance reasons.
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index 5afc14dec9..8358b0f21f 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -43,7 +43,6 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> */
> rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
> }
> -
Is this blank line removal intentional? It seems unrelated to the atomic
changes.
> @@ -82,7 +81,7 @@ __rte_ring_headtail_move_head_st(struct rte_ring_headtail *d,
> /* Single producer: only this thread writes d->head,
> * so a relaxed load is sufficient.
> */
> - *old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_relaxed);
> + *old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_acquire);
Why does the memory order change from relaxed to acquire? The comment
states "relaxed load is sufficient". Does this comment need updating, or
is the code change incorrect?
Also, there's a tab character before rte_memory_order_acquire. Does this
violate whitespace standards?
> diff --git a/lib/ring/rte_ring_gcc_pvt.h b/lib/ring/rte_ring_gcc_pvt.h
> new file mode 100644
> index 0000000000..9033a15647
> --- /dev/null
> +++ b/lib/ring/rte_ring_gcc_pvt.h
[ ... ]
> @@ -37,7 +35,12 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&ht->tail, old_val,
> rte_memory_order_relaxed);
>
> - ht->tail = new_val;
> + /*
> + * R0: Establishes a synchronizing edge with load-acquire of tail at A1.
> + * Ensures that memory effects by this thread on ring elements array
> + * is observed by a different thread of the other type.
> + */
> + __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
Comment line exceeds 79 characters. Can this be wrapped?
> @@ -81,10 +84,10 @@ __rte_ring_headtail_move_head_mt(struct rte_ring_headtail *d,
>
> *old_head = d->head;
>
> - /* add rmb barrier to avoid load/load reorder in weak
> + /* add fence to avoid load/load reorder in weak
> * memory model. It is noop on x86
> */
> - rte_smp_rmb();
> + __atomic_thread_fence(__ATOMIC_ACQUIRE);
Does __atomic_thread_fence provide the same guarantees as rte_smp_rmb
on all architectures? The commit message doesn't explain this change.
> @@ -100,13 +103,15 @@ __rte_ring_headtail_move_head_mt(struct rte_ring_headtail *d,
> 0 : *entries;
>
> if (n == 0)
> - return 0;
> + break;
Why change from return to break? Does this affect cleanup or future
code paths?
> + success = __sync_bool_compare_and_swap(
> (uint32_t *)(uintptr_t)&d->head,
> *old_head, *new_head);
> - } while (unlikely(success == 0));
> + } while (unlikely(!success));
Does this maintain the same semantics? The commit message mentions
__sync_bool_compare_and_swap returns a plain bool.
---
**Patch 3/5: ring: use C11 for update_tail**
git sha: 20260602171552.686349-4-stephen at networkplumber.org
Author: Stephen Hemminger <stephen at networkplumber.org>
Subject: [PATCH 3/5] ring: use C11 for update_tail
This consolidates update_tail to use only C11 memory model.
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index 8358b0f21f..3258829696 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -19,30 +19,6 @@
> * For more information please refer to <rte_ring.h>.
> */
>
> -/**
> - * @internal This function updates tail values.
> - */
> -static __rte_always_inline void
> -__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> - uint32_t new_val, uint32_t single, uint32_t enqueue)
Does removing this duplicate definition of __rte_ring_update_tail cause
any issues with the build or inlining behavior?
---
**Patch 4/5: ring: drop unused arg to update_tail**
git sha: 20260602171552.686349-5-stephen at networkplumber.org
Author: Stephen Hemminger <stephen at networkplumber.org>
Subject: [PATCH 4/5] ring: drop unused arg to update_tail
This removes the unused enqueue parameter.
> diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h
> index a7ff76931b..74b5fef771 100644
> --- a/lib/ring/rte_ring_elem_pvt.h
> +++ b/lib/ring/rte_ring_elem_pvt.h
> @@ -301,10 +301,8 @@ __rte_ring_dequeue_elems(struct rte_ring *r, uint32_t cons_head,
>
> static __rte_always_inline void
> __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> - uint32_t new_val, uint32_t single, uint32_t enqueue)
> + uint32_t new_val, uint32_t single)
The alignment of parameters changed. Does this match the project's
preferred indentation style?
---
**Patch 5/5: ring: use C11 for single thread move head**
git sha: 20260602171552.686349-6-stephen at networkplumber.org
Author: Stephen Hemminger <stephen at networkplumber.org>
Subject: [PATCH 5/5] ring: use C11 for single thread move head
This consolidates to use C11 for ST path.
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index 3258829696..0ba64379fa 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -19,68 +19,6 @@
> * For more information please refer to <rte_ring.h>.
> */
>
> -/**
> - * @internal This is a helper function that moves the producer/consumer head
Does removing the entire __rte_ring_headtail_move_head_st function from
this file cause duplication issues when it's added to
rte_ring_elem_pvt.h?
> diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h
> index 74b5fef771..cd77343f38 100644
> --- a/lib/ring/rte_ring_elem_pvt.h
> +++ b/lib/ring/rte_ring_elem_pvt.h
> @@ -319,12 +319,74 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
> }
>
> -/* Between load and load. there might be cpu reorder in weak model
> - * (powerpc/arm).
> - * There are 2 choices for the users
> - * 1.use rmb() memory barrier
> - * 2.use one-direction load_acquire/store_release barrier
> - * It depends on performance test results.
Does removing this explanatory comment reduce code maintainability? The
comment provided context about architectural considerations.
> + /* Acquire pairs with the consumer's release-store of tail in __rte_ring_update_tail,
> + * ensuring the consumer's ring-element reads are complete before
> + * we observe the updated tail.
> + */
This comment line exceeds 79 characters. Can it be wrapped?
> +/*
> + * The function __rte_ring_headtail_move_head_mt has two versions
> + * based on what is most efficient on a given architecture.
> + *
> + * The C11 is preferred but on x86 GCC has 10% performance drop.
> + */
Does this comment accurately describe the current state? Is the
performance impact documented elsewhere?
More information about the test-report
mailing list