[PATCH v12 6/7] eal: add unit tests for atomic bit access functions
David Marchand
david.marchand at redhat.com
Fri Oct 11 10:35:46 CEST 2024
On Thu, Oct 10, 2024 at 1:56 PM Mattias Rönnblom <hofors at lysator.liu.se> wrote:
>
> On 2024-10-10 12:45, David Marchand wrote:
> > On Fri, Sep 20, 2024 at 12:57 PM Mattias Rönnblom
> > <mattias.ronnblom at ericsson.com> wrote:
> >> + static int \
> >> + run_parallel_test_and_modify ## size(void *arg) \
> >> + { \
> >> + struct parallel_test_and_set_lcore ## size *lcore = arg; \
> >> + uint64_t deadline = rte_get_timer_cycles() + \
> >> + PARALLEL_TEST_RUNTIME * rte_get_timer_hz(); \
> >> + do { \
> >> + bool old_value; \
> >> + bool new_value = rte_rand() & 1; \
> >> + bool use_assign = rte_rand() & 1; \
> >> + \
> >> + if (use_assign) \
> >> + old_value = rte_bit_atomic_test_and_assign( \
> >> + lcore->word, lcore->bit, new_value, \
> >> + rte_memory_order_relaxed); \
> >> + else \
> >> + old_value = new_value ? \
> >> + rte_bit_atomic_test_and_set( \
> >> + lcore->word, lcore->bit, \
> >> + rte_memory_order_relaxed) : \
> >> + rte_bit_atomic_test_and_clear( \
> >> + lcore->word, lcore->bit, \
> >> + rte_memory_order_relaxed); \
> >> + if (old_value != new_value) \
> >> + lcore->flips++; \
> >> + } while (rte_get_timer_cycles() < deadline); \
> >> + \
> >> + return 0; \
> >> + } \
> >> + \
> >> + static int \
> >> + test_bit_atomic_parallel_test_and_modify ## size(void) \
> >> + { \
> >> + unsigned int worker_lcore_id; \
> >> + uint ## size ## _t word = 0; \
> >> + unsigned int bit = rte_rand_max(size); \
> >> + struct parallel_test_and_set_lcore ## size lmain = { \
> >> + .word = &word, \
> >> + .bit = bit \
> >> + }; \
> >> + struct parallel_test_and_set_lcore ## size lworker = { \
> >> + .word = &word, \
> >> + .bit = bit \
> >> + }; \
> >> + \
> >> + if (rte_lcore_count() < 2) { \
> >> + printf("Need multiple cores to run parallel test.\n"); \
> >> + return TEST_SKIPPED; \
> >> + } \
> >> + \
> >> + worker_lcore_id = rte_get_next_lcore(-1, 1, 0); \
> >> + \
> >> + int rc = rte_eal_remote_launch(run_parallel_test_and_modify ## size, \
> >> + &lworker, worker_lcore_id); \
> >> + TEST_ASSERT(rc == 0, "Worker thread launch failed"); \
> >> + \
> >> + run_parallel_test_and_modify ## size(&lmain); \
> >> + \
> >> + rte_eal_mp_wait_lcore(); \
> >> + \
> >> + uint64_t total_flips = lmain.flips + lworker.flips; \
> >> + bool expected_value = total_flips % 2; \
> >> + \
> >> + TEST_ASSERT(expected_value == rte_bit_test(&word, bit), \
> >> + "After %"PRId64" flips, the bit value " \
> >> + "should be %d", total_flips, expected_value); \
> >> + \
> >> + uint64_t expected_word = 0; \
> >> + rte_bit_assign(&expected_word, bit, expected_value); \
> >> + \
> >> + TEST_ASSERT(expected_word == word, "Untouched bits have " \
> >> + "changed value"); \
> >> + \
> >> + return TEST_SUCCESS; \
> >> + }
> >> +
> >> +GEN_TEST_BIT_PARALLEL_TEST_AND_MODIFY(32)
> >> +GEN_TEST_BIT_PARALLEL_TEST_AND_MODIFY(64)
> >
> > It appears this test failed once in the CI for an unrelated series
> > (uAPI kernel header import):
> > https://lab.dpdk.org/results/dashboard/testruns/logs/1385993/
> >
> > + TestCase [ 0] : test_bit_access32 succeeded
> > + TestCase [ 1] : test_bit_access64 succeeded
> > + TestCase [ 2] : test_bit_access32 succeeded
> > + TestCase [ 3] : test_bit_access64 succeeded
> > + TestCase [ 4] : test_bit_v_access32 succeeded
> > + TestCase [ 5] : test_bit_v_access64 succeeded
> > + TestCase [ 6] : test_bit_atomic_access32 succeeded
> > + TestCase [ 7] : test_bit_atomic_access64 succeeded
> > + TestCase [ 8] : test_bit_atomic_v_access32 succeeded
> > + TestCase [ 9] : test_bit_atomic_v_access64 succeeded
> > + TestCase [10] : test_bit_atomic_parallel_assign32 succeeded
> > + TestCase [11] : test_bit_atomic_parallel_assign64 succeeded
> > + TestCase [12] : test_bit_atomic_parallel_test_and_modify32 failed
> > + TestCase [13] : test_bit_atomic_parallel_test_and_modify64 succeeded
> > + TestCase [14] : test_bit_atomic_parallel_flip32 succeeded
> > + TestCase [15] : test_bit_atomic_parallel_flip64 succeeded
> > + TestCase [16] : test_bit_relaxed_set succeeded
> > + TestCase [17] : test_bit_relaxed_clear succeeded
> > + TestCase [18] : test_bit_relaxed_test_set_clear succeeded
> >
> > EAL: Test assert test_bit_atomic_parallel_test_and_modify32 line 236
> > failed: After 1070523 flips, the bit value should be 1
> >
>
> I've been unable to reproduce this on my Raptor Lake x86_64 with GCC
> 12.3 (CI machine used GCC 12.2).
>
> I'll see if I have better luck on some other systems.
Could it be related to how UNH runs the tests in containers
(with physical cores that may not be isolated/dedicated to a process)?
This morning, I see at least two similar errors, in the same part of the test:
https://patchwork.dpdk.org/project/dpdk/patch/20241010203209.63911-1-nandinipersad361@gmail.com/
EAL: Test assert test_bit_atomic_parallel_test_and_modify32 line 236
failed: After 1719375 flips, the bit value should be 1
https://patchwork.dpdk.org/project/dpdk/patch/20241010194148.1877659-18-rjarry@redhat.com/
EAL: Test assert test_bit_atomic_parallel_test_and_modify32 line 236
failed: After 1199935 flips, the bit value should be 1
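
For reference, the failing assertion checks that the final bit value equals the parity of the total number of flips. In all three reports above the flip count is odd, so the expected value is 1 and the observed bit was 0. Below is a minimal single-threaded sketch of that accounting using C11 atomics rather than the DPDK API. The names sketch_test_and_assign() and sketch_parity_holds() are hypothetical and not part of the patch, and the simple LCG only stands in for rte_rand(). It does not reproduce the CI failure; it only illustrates the invariant the test relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the DPDK API): atomically set or clear
 * `bit` in `*word` and return the bit's previous value.
 */
static bool
sketch_test_and_assign(_Atomic uint32_t *word, unsigned int bit,
		       bool new_value)
{
	assert(bit < 32);

	uint32_t mask = UINT32_C(1) << bit;
	uint32_t old = new_value ?
		atomic_fetch_or_explicit(word, mask, memory_order_relaxed) :
		atomic_fetch_and_explicit(word, ~mask, memory_order_relaxed);

	return (old & mask) != 0;
}

/*
 * Apply `iterations` pseudo-random assignments to one bit, counting
 * a flip whenever the old and new values differ. Returns true if the
 * final word holds exactly the bit value implied by the flip parity
 * and every other bit is still zero.
 */
static bool
sketch_parity_holds(unsigned int bit, unsigned int iterations)
{
	_Atomic uint32_t word = 0;
	uint64_t flips = 0;
	uint32_t state = 12345; /* LCG state, stand-in for rte_rand() */

	for (unsigned int i = 0; i < iterations; i++) {
		state = state * 1664525u + 1013904223u;
		bool new_value = (state >> 16) & 1;
		bool old_value = sketch_test_and_assign(&word, bit, new_value);

		if (old_value != new_value)
			flips++;
	}

	/* The bit starts at 0, so an odd flip count must leave it at 1. */
	bool expected_value = flips % 2;
	uint32_t expected_word = expected_value ? (UINT32_C(1) << bit) : 0;

	return atomic_load_explicit(&word, memory_order_relaxed) ==
		expected_word;
}
```

Since each update is a single atomic read-modify-write that returns the previous value, the same counting should hold when two lcores update the bit concurrently, which is why a mismatch is unexpected even with relaxed ordering.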
--
David Marchand