[PATCH v17 0/2] net: optimize __rte_raw_cksum
scott.k.mitch1 at gmail.com
Wed Jan 28 19:05:14 CET 2026
From: Scott <scott.k.mitch1 at gmail.com>
This series optimizes __rte_raw_cksum by replacing memcpy with direct
pointer access, enabling compiler vectorization on both GCC and Clang.
Patch 1 adds __rte_may_alias and __rte_aligned(1) to the unaligned
typedefs. This prevents a GCC strict-aliasing bug in which struct
initialization is incorrectly elided, and avoids UB by making explicit
that accesses through these types may be at any address.
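As a rough illustration of what Patch 1 describes, the sketch below declares a
16-bit type that may alias other types and may live at any address, then loads
through it. The names sketch_unaligned_uint16_t, sketch_load_u16 and
sketch_load_u16_memcpy are hypothetical, and the raw GCC/Clang attribute
spelling is an assumption standing in for __rte_may_alias and
__rte_aligned(1); the actual definitions are in rte_common.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for an EAL unaligned typedef (assumption: GCC/Clang
 * attribute spelling; the real definitions are in rte_common.h).
 * may_alias: accesses through this type are exempt from strict aliasing.
 * aligned(1): the compiler must not assume natural 2-byte alignment.
 */
typedef uint16_t sketch_unaligned_uint16_t
	__attribute__((__may_alias__, __aligned__(1)));

/* Direct load through the attributed type; p may be at any address. */
static inline uint16_t
sketch_load_u16(const void *p)
{
	assert(p != NULL);
	return *(const sketch_unaligned_uint16_t *)p;
}

/* Reference load via memcpy, well defined for any address. */
static inline uint16_t
sketch_load_u16_memcpy(const void *p)
{
	uint16_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```

Both helpers should return the same value for any readable address, including
odd ones; the attributed typedef lets the direct form express that without a
temporary.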
Patch 2 uses the improved unaligned_uint16_t type in __rte_raw_cksum
to enable compiler optimizations while maintaining correctness across
all architectures (including strict-alignment platforms).
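The change in Patch 2 can be sketched roughly as follows: a memcpy-based
16-bit summation loop next to one that reads through an unaligned, may-alias
pointer type. This is not the code from lib/net/rte_cksum.h; every sketch_*
name is hypothetical, and the attribute spelling is an assumption standing in
for the EAL macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for unaligned_uint16_t (GCC/Clang attributes). */
typedef uint16_t sketch_u16_any_addr
	__attribute__((__may_alias__, __aligned__(1)));

/* Add a trailing odd byte, zero padded, in a byte-order independent way. */
static inline uint32_t
sketch_add_tail(const void *tail, size_t len, uint32_t sum)
{
	if (len & 1) {
		uint16_t left = 0;

		memcpy(&left, tail, 1);
		sum += left;
	}
	return sum;
}

/* memcpy style: copy each 16-bit word into a temporary before adding. */
static inline uint32_t
sketch_raw_sum_memcpy(const void *buf, size_t len, uint32_t sum)
{
	const uint8_t *p = buf;
	const uint8_t *end = p + (len & ~(size_t)1);

	assert(buf != NULL);
	for (; p != end; p += sizeof(uint16_t)) {
		uint16_t v;

		memcpy(&v, p, sizeof(v));
		sum += v;
	}
	return sketch_add_tail(end, len, sum);
}

/* Pointer style: read each word directly through the attributed type. */
static inline uint32_t
sketch_raw_sum_ptr(const void *buf, size_t len, uint32_t sum)
{
	const sketch_u16_any_addr *w = buf;
	const sketch_u16_any_addr *end = w + len / 2;

	assert(buf != NULL);
	for (; w != end; w++)
		sum += *w;
	return sketch_add_tail(end, len, sum);
}

/* Fold a 32-bit running sum down to 16 bits (one's-complement carry). */
static inline uint16_t
sketch_fold(uint32_t sum)
{
	sum = (sum >> 16) + (sum & 0xffff);
	sum = (sum >> 16) + (sum & 0xffff);
	return (uint16_t)sum;
}
```

The two loops perform the same arithmetic, so they should agree for any
buffer, offset, and length; whether the pointer form actually vectorizes
depends on the compiler and flags.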
Measured improvements are about 40% for small buffers and up to 8x for
larger buffers on Intel Xeon with Clang 18.1.
Changes in v17:
- Use __rte_aligned(1) unconditionally on unaligned type aliases
- test_cksum_fuzz uses unit_test_suite_runner
- test_cksum_fuzz reference method renamed to
  test_cksum_fuzz_cksum_reference
Changes in v16:
- Add Fixes tag and Cc stable/author for backporting (patch 1)
Changes in v15:
- Use NOHUGE_OK and ASAN_OK constants in REGISTER_FAST_TEST
Changes in v14:
- Split into two patches: EAL typedef fix and checksum optimization
- Use unaligned_uint16_t directly instead of wrapper struct
- Added __rte_may_alias to unaligned typedefs to prevent GCC bug
Scott Mitchell (2):
  eal: add __rte_may_alias and __rte_aligned to unaligned typedefs
  net: __rte_raw_cksum pointers enable compiler optimizations
 app/test/meson.build         |   1 +
 app/test/test_cksum_fuzz.c   | 234 +++++++++++++++++++++++++++++++++++
 app/test/test_cksum_perf.c   |   2 +-
 lib/eal/include/rte_common.h |  39 +++---
 lib/net/rte_cksum.h          |  14 +--
 5 files changed, 264 insertions(+), 26 deletions(-)
 create mode 100644 app/test/test_cksum_fuzz.c
--
2.39.5 (Apple Git-154)