|WARNING| [2/2] app/test: add stress tests for rwlock and pflock
dpdklab at iol.unh.edu
dpdklab at iol.unh.edu
Wed Jun 10 20:55:03 CEST 2026
Test-Label: ai-code-review
Test-Status: WARNING
https://dpdk.org/patch/165258
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-06-10
# DPDK Patch Review
## Patch 1/2: eal/pflock: add API to downgrade from wr to rd lock
### Errors
None identified.
### Warnings
1. **Missing experimental API marking**
The new function `rte_pflock_write_downgrade()` is a public API addition but lacks the `__rte_experimental` tag.
New API functions must be marked experimental.
```c
__rte_experimental
static inline void
rte_pflock_write_downgrade(rte_pflock_t *pf)
```
2. **Missing release notes update**
This patch adds a new public API function but does not include an update to the release notes (`doc/guides/rel_notes/`).
API additions require documentation in the release notes.
3. **Missing Doxygen for caller preconditions**
The function comment does not document the precondition that the caller must already hold a write lock.
While the text mentions "Note that caller is expected to obtain wrlock," this should be in the Doxygen `@param` or `@note` section for clarity.
```c
/**
* Release a pflock held for writing, while keeping lock for reading.
*
* @note Caller must hold the write lock before calling this function.
*
* @param pf
* A pointer to a pflock structure.
*/
```
4. **Memory ordering justification**
The first atomic operation uses `rte_memory_order_acq_rel` which is stronger than necessary for a simple increment.
Consider whether `rte_memory_order_release` suffices, or document why acquire semantics are needed when adding a reader.
5. **Missing symbol export macro**
If this function is intended to be non-inline (though it is static inline here), it would need an `RTE_EXPORT_EXPERIMENTAL_SYMBOL` macro in the `.c` file.
Since it's inline, this is acceptable, but verify the ABI policy for inline functions.
6. **Trailing blank lines**
The patch adds two blank lines at the end of the file.
Files should end with exactly one newline.
### Info
- The atomic operations appear correct for downgrading a write lock to a read lock: migrating to read phase, clearing the write bit, then releasing the write lock.
---
## Patch 2/2: app/test: add stress tests for rwlock and pflock
### Errors
1. **Use-after-free in error handling (Critical)**
In `handle_error()`, when `write_lock == false`, the function does not unlock the reader but the comment says "Don't unlock here as it's already unlocked by the calling function."
However, in `handle_reader_work()` at line 224, after detecting `s->counter != local_counter`, the code calls `handle_error()` WITHOUT unlocking first, then returns.
This leaves the read lock held.
The same issue occurs at line 215 when array verification fails.
```c
/* Bad - lock still held */
if (s->counter_array[i] != local_counter) {
handle_error(s, lcore_id, false, __func__, __LINE__);
return -1;
}
/* Good - unlock before error */
if (s->counter_array[i] != local_counter) {
s->lock.ops->read_unlock(&s->lock);
handle_error(s, lcore_id, false, __func__, __LINE__);
return -1;
}
```
2. **Incorrect write-downgrade path unlock**
In `handle_writer_work()`, the `DOWNGRADE_TEST` path calls `write_downgrade()` then `read_unlock()`.
However, if array verification fails inside the downgraded read section (line 188-191), `handle_error()` is called with `write_lock = false`, which does NOT unlock.
So the read lock is leaked when downgrade verification fails.
3. **Undefined behavior: data race on shared counter**
The `counter` and `counter_array` fields are marked `volatile`, but the code accesses them from multiple threads without atomic operations.
`volatile` does not provide atomicity or memory ordering.
These should use `rte_atomic_load_explicit()` and `rte_atomic_store_explicit()`.
```c
/* Current (wrong) */
volatile uint64_t counter;
s->counter += delta;
/* Should be atomic or document that the lock protects them */
```
Note: If the intent is that the lock always protects these fields, then `volatile` is unnecessary and should be removed.
The test appears to rely on lock protection, so `volatile` is misleading.
4. **Missing error propagation in worker function**
In `lcore_function()`, when `handle_writer_work()` or `handle_reader_work()` returns `-1`, the code does `continue` instead of breaking the loop or returning.
This means after an error is detected and `s->stop` is set, the thread keeps running until `s->stop` is observed in the loop condition, potentially performing more incorrect operations.
5. **Implicit integer narrowing in percentage calculation**
Line 136: `uint32_t readers_pct = (num_readers * 100) / (total_lcores - 1);`
If `num_readers` is close to `UINT32_MAX`, `num_readers * 100` will overflow before the division.
Use a 64-bit intermediate: `(uint64_t)num_readers * 100`.
6. **Missing bounds check on lcore_id**
Throughout the code, `s->stats[lcore_id]` is accessed without verifying `lcore_id < RTE_MAX_LCORE`.
While DPDK's `rte_lcore_id()` should never return an out-of-range value, defensive programming or an assertion would prevent potential out-of-bounds access.
### Warnings
1. **Missing release notes update**
New tests are added but no release notes entry.
While test-only changes typically don't require release notes, stress tests that validate new API (the `write_downgrade` function) should be mentioned in the context of the API's release notes entry.
2. **Inconsistent error handling between reader and writer**
The `handle_error()` function unlocks for writers but not readers, with a comment explaining readers are "already unlocked by the calling function."
This is fragile -- the calling function at line 215 and 224 does NOT unlock before calling `handle_error()`.
Refactor so `handle_error()` consistently handles unlocking based on the lock state.
3. **Missing test registration documentation**
The `REGISTER_STRESS_TEST` macro is used but not defined in the patch.
Verify this macro exists and is the correct registration mechanism for stress tests.
4. **Shared state not declared in shared memory**
`struct rwlock_test_shared` is allocated on the stack in `test_rwlock_stress_impl()` and passed to remote lcores.
Stack-allocated memory is process-private.
Multi-process stress tests would fail.
If only single-process testing is intended, this is acceptable but should be documented.
If multi-process is intended, use `rte_zmalloc()`.
5. **Test duration hardcoded**
`TEST_DURATION_SEC` is fixed at 5 seconds.
Consider making this configurable or documenting why 5 seconds is the right duration.
6. **Missing `const` on ops tables**
`pflock_ops` and `rwlock_ops` are function pointer arrays that are never modified at runtime.
They should be declared `const`:
```c
static const struct rwlock_ops pflock_ops = {
```
(Both are already `const` -- this is correct, no issue.)
7. **Function pointer `write_downgrade` can be NULL**
In `handle_writer_work()`, the code checks `if (s->lock.ops->write_downgrade)` before calling it.
However, the `rwlock_ops` table does not provide this function pointer, leaving it NULL.
This is correct, but the code should document that NULL means "downgrade not supported" and that `DOWNGRADE_TEST` should only run on locks that provide this operation.
8. **Missing header documentation**
The header `test_rwlock_stress_impl.h` lacks a high-level comment describing its purpose (generic stress test infrastructure) and usage pattern (include and define ops table).
9. **Potential integer overflow in acquire_time**
`acquire_time` accumulates the maximum wait time across all operations.
If wait times are consistently large, `uint64_t` should be sufficient, but document the units (TSC cycles).
10. **Statistics structure padding**
`struct lcore_stats` is marked `__rte_cache_aligned` to avoid false sharing.
Verify each field is accessed only by its owning lcore to prevent cross-lcore cache coherency traffic.
### Info
- The test design (shared counter, array consistency checks, reader/writer roles) is sound for stress testing lock correctness.
- The use of a generic `rwlock_ops` interface to test both `pflock` and `rwlock` is good code reuse.
- Test registration uses `REGISTER_STRESS_TEST` which appears to be a custom macro not in the standard DPDK `REGISTER_FAST_TEST` / `REGISTER_PERF_TEST` set -- ensure this is defined in the test framework.
---
## Summary
**Patch 1/2**: Missing experimental tag, release notes, and Doxygen clarifications.
**Patch 2/2**: Critical correctness bugs in error-path lock handling (locks not released before returning) and inappropriate use of `volatile` instead of atomics (or remove `volatile` if locks always protect the fields). Also missing release notes.
Both patches require updates to release notes and API tagging. Patch 2/2 requires fixing error-path unlocking and clarifying the memory model for shared counters.
More information about the test-report
mailing list