|SUCCESS| [RFC, v2, 11/11] crypto/ccp: replace use of rte_atomic64 with stdatomic
dpdklab at iol.unh.edu
dpdklab at iol.unh.edu
Thu May 21 22:49:01 CEST 2026
Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/164240
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-05-21
I'll review this patch series focusing on correctness, coding style, API requirements, and potential bugs.
commit: series.patch
Author: Stephen Hemminger <stephen at networkplumber.org>
Subject: [RFC v2 01/11] eal: use intrinsics for rte_atomic on all platforms
This commit removes inline assembly implementations of rte_atomic operations,
keeping only the intrinsics-based fallback for all platforms.
> diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
> index 0a4f3f8528..292e52fade 100644
> --- a/lib/eal/include/generic/rte_atomic.h
> +++ b/lib/eal/include/generic/rte_atomic.h
[ ... ]
> @@ -209,15 +209,11 @@ rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
> * The original value at that location
> */
> static inline uint16_t
> -rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val);
> -
> -#ifdef RTE_FORCE_INTRINSICS
> -static inline uint16_t
> rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
> {
> - return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
> + return rte_atomic_exchange_explicit((volatile __rte_atomic uint16_t *)dst,
> + val, rte_memory_order_seq_cst);
Does this cast correctly align with the C11 atomics standard? The cast from
"volatile uint16_t *" to "volatile __rte_atomic uint16_t *" may lead to
undefined behavior per C11 7.17.6 if dst was not originally declared atomic.
> @@ -478,15 +478,11 @@ rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
> * The original value at that location
> */
> static inline uint32_t
> -rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val);
> -
> -#ifdef RTE_FORCE_INTRINSICS
> -static inline uint32_t
> rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
> {
> - return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
> + return rte_atomic_exchange_explicit((volatile __rte_atomic uint32_t *)dst,
> + val, rte_memory_order_seq_cst);
Same casting concern as above for atomic correctness.
> @@ -746,15 +746,11 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
> * The original value at that location
> */
> static inline uint64_t
> -rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val);
> -
> -#ifdef RTE_FORCE_INTRINSICS
> -static inline uint64_t
> rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
> {
> - return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
> + return rte_atomic_exchange_explicit((volatile __rte_atomic uint64_t *)dst,
> + val, rte_memory_order_seq_cst);
Same casting concern applies here.
---
commit: [RFC v2 02/11] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
> diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
> index 292e52fade..1b04b43cbb 100644
> --- a/lib/eal/include/generic/rte_atomic.h
> +++ b/lib/eal/include/generic/rte_atomic.h
[ ... ]
> @@ -164,6 +134,24 @@ static inline void rte_io_rmb(void);
> */
> static inline void rte_atomic_thread_fence(rte_memory_order memorder);
>
> +static __rte_always_inline void
> +rte_smp_mb(void)
> +{
> + rte_atomic_thread_fence(rte_memory_order_seq_cst);
> +}
> +
> +static __rte_always_inline void
> +rte_smp_wmb(void)
> +{
> + rte_atomic_thread_fence(rte_memory_order_release);
> +}
Does release ordering provide a full write barrier? The previous rte_smp_wmb
implementations emitted dmb ishst on ARM64. The commit message indicates dmb
ish is now used. Can this ordering change affect store-to-store visibility
between threads?
> +static __rte_always_inline void
> +rte_smp_rmb(void)
> +{
> + rte_atomic_thread_fence(rte_memory_order_acquire);
> +}
Does acquire ordering prevent speculative loads? The previous rte_smp_rmb on
ARM64 emitted dmb ishld. Does the memory_order_acquire change from ishld to
ish affect load-to-load dependencies?
---
commit: [RFC v2 03/11] ring: use C11 atomic operations for MP/SP head/tail
> diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
> index affd2d5ba7..84570fd5fc 100644
> --- a/lib/ring/rte_ring_generic_pvt.h
> +++ b/lib/ring/rte_ring_generic_pvt.h
[ ... ]
> @@ -106,12 +119,24 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
>
> *new_head = *old_head + n;
> if (is_st) {
> - d->head = *new_head;
> + rte_atomic_store_explicit(&d->head, *new_head, rte_memory_order_relaxed);
Does relaxed store ordering work for single-threaded producer/consumer? If the
ring is reconfigured or accessed by diagnostics, can this relaxed store race?
> success = 1;
> - } else
> - success = rte_atomic32_cmpset(
> - (uint32_t *)(uintptr_t)&d->head,
> - *old_head, *new_head);
> + } else {
> + /*
> + * Weak CAS: the outer do-while handles spurious
> + * failures, so we avoid the strong variant's
> + * internal retry (which on arm64 wraps the LL/SC
> + * pair in a hidden inner loop).
> + *
> + * Relaxed on both success and failure: this CAS
> + * does not publish data. Slot data visibility is
> + * provided by the acquire loads above and the
> + * release store of tail in __rte_ring_update_tail().
> + */
> + success = rte_atomic_compare_exchange_weak_explicit(
> + &d->head, old_head, *new_head,
> + rte_memory_order_relaxed, rte_memory_order_relaxed);
> + }
Does the weak CAS update *old_head on failure? The do-while loop uses *old_head
but rte_atomic_compare_exchange_weak_explicit() might not update it properly.
---
commit: [RFC v2 06/11] net/ena: replace use of rte_atomicNN
> diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
> index ea4afbc75d..e9c484456c 100644
> --- a/drivers/net/ena/ena_ethdev.c
> +++ b/drivers/net/ena/ena_ethdev.c
[ ... ]
> @@ -1249,10 +1243,7 @@ static void ena_stats_restart(struct rte_eth_dev *dev)
> {
> struct ena_adapter *adapter = dev->data->dev_private;
>
> - rte_atomic64_init(&adapter->drv_stats->ierrors);
> - rte_atomic64_init(&adapter->drv_stats->oerrors);
> - rte_atomic64_init(&adapter->drv_stats->rx_nombuf);
> - adapter->drv_stats->rx_drops = 0;
> + memset(adapter->drv_stats, 0, sizeof(struct ena_driver_stats));
Does this code race with concurrent statistics updates from datapath threads?
The previous atomic operations ensured some ordering, but memset has no
atomicity guarantees.
---
The review raises several questions about memory ordering semantics, atomic
casting correctness, and potential data races that should be addressed before
merging this refactoring.
More information about the test-report
mailing list