[RFC] random: use per lcore state

Konstantin Ananyev konstantin.v.ananyev at yandex.ru
Sat Sep 9 02:13:37 CEST 2023


06/09/2023 21:02, Mattias Rönnblom writes:
> On 2023-09-06 19:20, Stephen Hemminger wrote:
>> Move the random number state into thread local storage.
> 
> Morten and I discussed TLS versus other alternatives in some other 
> thread. The downside of TLS that Morten pointed out, from what I recall, 
> is that lazy initialization is *required* (since the number of threads 
> is open-ended), and the data ends up in non-huge page memory.

Hmm... correct me if I am wrong, but with the current implementation
the rand state is also in non-huge memory:
static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
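
If huge-page backing were ever the goal here, the array would have to
become a runtime allocation anyway. A hypothetical sketch (note that
rte_malloc() only works after rte_eal_init(), so an RTE_INIT()
constructor would be too early):

#include <errno.h>
#include <rte_malloc.h>

static struct rte_rand_state *rand_states;

/* would have to be called from rte_eal_init(), once the huge-page
 * allocator is up */
static int
rand_states_alloc(void)
{
	rand_states = rte_zmalloc("rte_rand_state",
			sizeof(*rand_states) * (RTE_MAX_LCORE + 1),
			RTE_CACHE_LINE_SIZE);
	return rand_states != NULL ? 0 : -ENOMEM;
}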


> It was 
> also unclear to me what the memory footprint implications would be, 
> would large per-lcore data structures be put in TLS. More specifically, 
> if they would be duplicated across all threads, even non-lcore threads.
> 
> None of these issues affect rte_random.c's potential usage of TLS 
> (except lazy [re-]initialization makes things more complicated).
> 
> Preferably, there should be one pattern that is usable across all or at 
> least most DPDK modules requiring per-lcore state.
> 
>> This has several benefits.
>>   - no false cache sharing from cpu prefetching
>>   - fixes initialization of random state for non-DPDK threads
> 
> This seems like a non-reason to me. That bug is easily fixed, if it 
> isn't already.
> 
>>   - fixes unsafe usage of random state by non-DPDK threads
>>
> 
> "Makes random number generation MT safe from all threads (including 
> unregistered non-EAL threads)."
> 
> With current API semantics you may still register a non-EAL thread, to 
> get MT safe access to this API, so I guess it's more about being more 
> convenient and less error prone, than anything else.

I understand that we never guaranteed MT safety for non-EAL threads here,
but as a user of rte_rand() it would be much more convenient if I could
use it from any thread without worrying whether it is an EAL thread or not.
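
For illustration, this is roughly what an unregistered thread has to do
today to get its own state slot (a sketch; rte_thread_register() can
also fail when no free lcore id is left):

#include <stdint.h>
#include <rte_lcore.h>
#include <rte_random.h>

static void *
worker(void *arg)
{
	/* without this the thread runs as LCORE_ID_ANY and shares one
	 * unsynchronized rand state with all other unregistered threads */
	if (rte_thread_register() != 0)
		return NULL;	/* no free lcore id */

	uint64_t r = rte_rand();

	rte_thread_unregister();
	return (void *)(uintptr_t)r;
}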

About TLS usage and re-seeding - can we use some sort of middle ground:
extend rte_rand_state with a generation counter,
keep a 'master' copy of the state that will be updated by rte_srand(),
and TLS copies of rte_rand_state, so rte_rand() can first compare
its generation counter against the master copy to decide
whether it needs to re-derive its state from the master or not.
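
Something along these lines (a rough sketch against rte_random.c; the
names are invented, and reading the seed and the gen-counter as one
consistent pair would still need care - see the seqlock remark below):

/* in rte_random.c */
struct rte_rand_state {
	uint64_t z1, z2, z3, z4, z5;
	uint64_t gen;	/* seed generation this state was derived from */
};

static uint64_t master_seed;	/* 'master' copy, written by rte_srand() */
static uint64_t master_gen;	/* bumped on every rte_srand() call */

static RTE_DEFINE_PER_LCORE(struct rte_rand_state, tls_rand_state);

void
rte_srand(uint64_t seed)
{
	__atomic_store_n(&master_seed, seed, __ATOMIC_RELAXED);
	/* release pairs with the acquire load in __rte_rand_get_state() */
	__atomic_add_fetch(&master_gen, 1, __ATOMIC_RELEASE);
}

static __rte_always_inline struct rte_rand_state *
__rte_rand_get_state(void)
{
	struct rte_rand_state *state = &RTE_PER_LCORE(tls_rand_state);
	uint64_t gen = __atomic_load_n(&master_gen, __ATOMIC_ACQUIRE);

	if (unlikely(state->gen != gen)) {
		uint64_t seed = __atomic_load_n(&master_seed, __ATOMIC_RELAXED);

		/* per-thread perturbation, as in the patch below */
		__rte_srand_lfsr258(seed + rte_thread_self().opaque_id, state);
		state->gen = gen;
	}
	return state;
}

That way repeated rte_srand() calls with the same seed still restart
every thread's sequence, since the gen-counter changes each time.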


> The new MT safety guarantees should be in the API docs as well.

Yes, it is an extension to the current API, not a fix.

> 
>> The initialization of random number state is done by the
>> lcore (lazy initialization).
>>
>> Signed-off-by: Stephen Hemminger <stephen at networkplumber.org>
>> ---
>>   lib/eal/common/rte_random.c | 38 +++++++++++++++++++------------------
>>   1 file changed, 20 insertions(+), 18 deletions(-)
>>
>> diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
>> index 53636331a27b..9657adf6ad3b 100644
>> --- a/lib/eal/common/rte_random.c
>> +++ b/lib/eal/common/rte_random.c
>> @@ -19,13 +19,14 @@ struct rte_rand_state {
>>       uint64_t z3;
>>       uint64_t z4;
>>       uint64_t z5;
>> -} __rte_cache_aligned;
>> +    uint64_t seed;
>> +};
>> -/* One instance each for every lcore id-equipped thread, and one
>> - * additional instance to be shared by all others threads (i.e., all
>> - * unregistered non-EAL threads).
>> - */
>> -static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
>> +/* Global random seed */
>> +static uint64_t rte_rand_seed;
>> +
>> +/* Per lcore random state. */
>> +static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
>>   static uint32_t
>>   __rte_rand_lcg32(uint32_t *seed)
>> @@ -81,11 +82,7 @@ __rte_srand_lfsr258(uint64_t seed, struct rte_rand_state *state)
>>   void
>>   rte_srand(uint64_t seed)
>>   {
>> -    unsigned int lcore_id;
>> -
>> -    /* add lcore_id to seed to avoid having the same sequence */
>> -    for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
>> -        __rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
>> +    __atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
>>   }
>>   static __rte_always_inline uint64_t
>> @@ -119,15 +116,18 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
>>   static __rte_always_inline
>>   struct rte_rand_state *__rte_rand_get_state(void)
>>   {
>> -    unsigned int idx;
>> +    struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);
> 
> There should really be a RTE_PER_THREAD, an alias to RTE_PER_LCORE, to 
> cover this usage. Or just use __thread (or _Thread_local?).
> 
>> +    uint64_t seed;
>> -    idx = rte_lcore_id();
>> +    seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
>> +    if (unlikely(seed != rand_state->seed)) {
>> +        rand_state->seed = seed;
> 
> Re-seeding should restart the series, on all lcores. There's nothing 
> preventing the user from re-seeding the machinery repeatedly, with the 
> same seed. Seems like an unusual, but still valid, use case, if you run 
> repeated tests of some sort.
> 
> Use a seqlock? :) I guess you need a seed generation number as well 
> (e.g., is this the first time you seed with X, or the second one, etc.)
> 
>> -    /* last instance reserved for unregistered non-EAL threads */
>> -    if (unlikely(idx == LCORE_ID_ANY))
>> -        idx = RTE_MAX_LCORE;
>> +        seed += rte_thread_self().opaque_id;
>> +        __rte_srand_lfsr258(seed, rand_state);
>> +    }
>> -    return &rand_states[idx];
>> +    return rand_state;
>>   }
>>   uint64_t
>> @@ -227,7 +227,9 @@ RTE_INIT(rte_rand_init)
>>   {
>>       uint64_t seed;
>> -    seed = __rte_random_initial_seed();
>> +    do
>> +        seed = __rte_random_initial_seed();
>> +    while (seed == 0);
> 
> Might be worth a comment why seed 0 is not allowed. Alternatively, use 
> some other way of signaling __rte_srand_lfsr258() must be called.
> 
>>       rte_srand(seed);
>>   }


