[PATCH 0/6] test: fix sporadic failures on high core count systems
Stephen Hemminger
stephen at networkplumber.org
Sun Jan 18 21:09:07 CET 2026
This series addresses several test failures that occur sporadically on
systems with many cores (32+), particularly on AMD Zen architectures.
I think Ferruh may have addressed similar problems in earlier
releases.
The root causes fall into three categories:
1. Missing rte_pause() in synchronization spinloops (patch 1)
   Tight spinloops without pause cause SMT thread starvation and
   unpredictable timing behavior.

2. Fixed iteration counts that don't scale (patch 2)
   The atomic test performs 1M iterations per worker regardless of
   core count. With 32+ cores, contention causes timeout failures.
   Bugzilla ID: 952

3. File-prefix collisions during parallel test execution (patches 5-6)
   Multiple tests using the default "rte" prefix compete for the same
   fbarray files, causing EAL initialization failures.
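To illustrate items 1 and 2, here is a minimal standalone sketch rather than the code from the patches. It uses C11 atomics and pthreads so it builds without DPDK. cpu_relax() is a hypothetical stand-in for rte_pause(), and iters_per_worker(), TOTAL_BUDGET and MIN_ITERS are assumed names and values showing one possible way to scale the per-worker count; the actual scaling in patch 2 may differ.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_WORKERS  64
#define TOTAL_BUDGET 1000000u /* assumed total work shared by all workers */
#define MIN_ITERS    1000u    /* assumed floor so each worker still does work */

/* Hypothetical stand-in for rte_pause(): a CPU hint for busy-wait loops. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();
#elif defined(__aarch64__)
	__asm__ volatile("yield" ::: "memory");
#endif
}

/* Divide a fixed budget across workers instead of a fixed count each. */
static unsigned int iters_per_worker(unsigned int n_workers)
{
	unsigned int n = TOTAL_BUDGET / (n_workers ? n_workers : 1);

	return n < MIN_ITERS ? MIN_ITERS : n;
}

static atomic_int start_flag;
static atomic_uint counter;
static unsigned int iters; /* written before the worker threads are created */

static void *worker(void *arg)
{
	unsigned int i;

	(void)arg;
	/* Poll the start flag, pausing on every pass of the spinloop. */
	while (atomic_load_explicit(&start_flag, memory_order_acquire) == 0)
		cpu_relax();

	for (i = 0; i < iters; i++)
		atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
	return NULL;
}

/* Start n workers, release them together, return the final counter. */
static unsigned int run_workers(unsigned int n)
{
	pthread_t tid[MAX_WORKERS];
	unsigned int i;

	assert(n > 0 && n <= MAX_WORKERS);
	atomic_store(&start_flag, 0);
	atomic_store(&counter, 0);
	iters = iters_per_worker(n);

	for (i = 0; i < n; i++) {
		int rc = pthread_create(&tid[i], NULL, worker, NULL);

		assert(rc == 0);
		(void)rc;
	}
	atomic_store_explicit(&start_flag, 1, memory_order_release);
	for (i = 0; i < n; i++)
		pthread_join(tid[i], NULL);
	return atomic_load(&counter);
}
```

With these assumed constants, run_workers(n) returns n * iters_per_worker(n), and the total number of contended atomic operations stays roughly constant as the worker count grows.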
Additionally, two BPF-related fixes are included:
4. Race condition in BPF ELF loading (patch 3)
   Missing fsync() before close() causes sporadic EINVAL failures.

5. Unsupported BPF instructions with newer clang (patch 4)
   Clang 20+ generates JMP32 instructions that DPDK BPF doesn't support.
   Bugzilla ID: 1844
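For item 4, the write-then-sync pattern can be sketched as follows. This is an illustration using plain POSIX calls, not the code from patch 3. write_temp_synced() is a hypothetical helper name, and the temporary file path used by the caller is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Write len bytes to a new temporary file and flush them before close.
 * path_tmpl is a mkstemp() template ending in "XXXXXX"; it is modified
 * in place to hold the real file name. Returns 0 on success and -1 on
 * failure, in which case the file is removed.
 */
static int write_temp_synced(char *path_tmpl, const void *buf, size_t len)
{
	const char *p = buf;
	int fd;

	assert(path_tmpl != NULL && buf != NULL);

	fd = mkstemp(path_tmpl);
	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry the write */
			goto fail;
		}
		p += n;
		len -= (size_t)n;
	}

	/* Flush the data to the file before it is closed and reopened. */
	if (fsync(fd) < 0)
		goto fail;

	if (close(fd) < 0) {
		unlink(path_tmpl);
		return -1;
	}
	return 0;

fail:
	close(fd);
	unlink(path_tmpl);
	return -1;
}
```

A caller would pass a writable template such as char path[] = "/tmp/elf_sketch_XXXXXX", load the file from path afterwards, and unlink() it when done.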
Stephen Hemminger (6):
  test: add pause to synchronization spinloops
  test: fix timeout for atomic test on high core count systems
  test: fix race condition in ELF load tests
  test: fix unsupported BPF instructions in elf load test
  test: add file-prefix for all fast-tests on Linux
  test: fix trace_autotest_with_traces parallel execution

 app/test/bpf/meson.build    |  3 +-
 app/test/suites/meson.build | 20 ++++++++---
 app/test/test_atomic.c      | 67 ++++++++++++++++++++++---------------
 app/test/test_bpf.c         |  5 ++-
 4 files changed, 62 insertions(+), 33 deletions(-)
--
2.51.0