On 2022-12-05 11:58, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
>> Sent: Monday, 5 December 2022 11.04
>>
>> Prior to this change, unregistered non-EAL threads shared a PRNG
>> instance with the main lcore. The main lcore may well be used for fast
>> path processing, potentially making rte_rand() calls in the
>> process. It should not need to synchronize with control threads.
>>
>> With this change, all unregistered non-EAL threads share one dedicated
>> PRNG instance.
>>
>> The API documentation is updated to use the proper terminology when
>> referring to threads equipped with an lcore id.
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
>> ---
>>   lib/eal/common/rte_random.c  | 17 +++++++++++------
>>   lib/eal/include/rte_random.h | 10 +++++++---
>>   2 files changed, 18 insertions(+), 9 deletions(-)
>>
>> diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
>> index 166b0d8921..565f2401ce 100644
>> --- a/lib/eal/common/rte_random.c
>> +++ b/lib/eal/common/rte_random.c
>> @@ -20,7 +20,11 @@ struct rte_rand_state {
>>      uint64_t z5;
>>   } __rte_cache_aligned;
>>
>> -static struct rte_rand_state rand_states[RTE_MAX_LCORE];
>> +/* One instance each for every lcore id-equipped thread, and one
>> + * additional instance to be shared by all others threads (i.e., all
>> + * unregistered non-EAL threads).
>> + */
>> +static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
>>
>>   static uint32_t
>>   __rte_rand_lcg32(uint32_t *seed)
>> @@ -114,14 +118,15 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
>>   static __rte_always_inline
>>   struct rte_rand_state *__rte_rand_get_state(void)
>>   {
>> -    unsigned int lcore_id;
>> +    unsigned int idx;
>>
>> -    lcore_id = rte_lcore_id();
>> +    idx = rte_lcore_id();
>>
>> -    if (unlikely(lcore_id == LCORE_ID_ANY))
>> -            lcore_id = rte_get_main_lcore();
>> +    /* last instance reserved for unregistered non-EAL threads */
>> +    if (unlikely(idx == LCORE_ID_ANY))
>> +            idx = RTE_MAX_LCORE;
>>
>> -    return &rand_states[lcore_id];
>> +    return &rand_states[idx];
>>   }
>>
>>   uint64_t
>> diff --git a/lib/eal/include/rte_random.h
>> b/lib/eal/include/rte_random.h
>> index d90e4d2192..2edf5d210b 100644
>> --- a/lib/eal/include/rte_random.h
>> +++ b/lib/eal/include/rte_random.h
>> @@ -41,7 +41,8 @@ rte_srand(uint64_t seedval);
>>    *
>>    * The generator is not cryptographically secure.
>>    *
>> - * If called from lcore threads, this function is thread-safe.
>> + * If called from EAL threads or registered non-EAL threads, this
>> function
>> + * is thread-safe.
>>    *
>>    * @return
>>    *   A pseudo-random value between 0 and (1<<64)-1.
>> @@ -55,7 +56,8 @@ rte_rand(void);
>>    * This function returns an uniformly distributed (unbiased) random
>>    * number less than a user-specified maximum value.
>>    *
>> - * If called from lcore threads, this function is thread-safe.
>> + * If called from EAL threads or registered non-EAL threads, this
>> function
>> + * is thread-safe.
>>    *
>>    * @param upper_bound
>>    *   The upper bound of the generated number.
>> @@ -75,7 +77,9 @@ rte_rand_max(uint64_t upper_bound);
>>    * number uniformly distributed over the interval [0.0, 1.0).
>>    *
>>    * The generator is not cryptographically secure.
>> - * If called from lcore threads, this function is thread-safe.
>> + *
>> + * If called from EAL threads or registered non-EAL threads, this
>> function
>> + * is thread-safe.
>>    *
>>    * @return
>>    *   A pseudo-random value between 0 and 1.0.
>> --
>> 2.34.1
>>
> 
> A nice improvement.
> 
> Acked-by: Morten Brørup <m...@smartsharesystems.com>
> 
> 

Thanks Morten.

> Here's some serious feature creep...
> 
> Instead of using "static struct rte_rand_state rand_states[RTE_MAX_LCORE + 
> 1];", we could use thread local storage ("__tread rte_rand_state 
> rand_state;") to keep the state per O/S thread (independent of lcore_id 
> etc.), making it completely thread safe.
> 
> But then, how do we seed the state?
> 
> Currently, we use the RTE_INIT() constructor attribute to seed the array of 
> rand_states; but there is no thread constructor attribute. So here comes the 
> feature creep:
> 
> It would be very useful with RTE_THREAD_INIT()/_FINI constructor/destructor 
> macros, so libraries and applications could define functions to be called by 
> thread_func_wrapper() before/after calling tread_func.
> 
> Using arrays like some_variable[RTE_MAX_LCORE (+ 1)] is common practice in 
> DPDK, but only really required for variables that are not private to the 
> thread, i.e. variables that other threads need access to.
> 
> Per-thread constructors/destructors is a generic feature suggestion, so 
> please don't hold back this rte_random patch!
> 

The performance (CPU & memory) implications of using TLS for the whole 
per-thread data structure (a PRNG in this case), as opposed to the DPDK 
pattern of keeping just an per-thread index in TLS and the rest in an 
instance of a static array, is very unclear to me.

A middle ground would be to keep only a pointer in TLS, and have a lazy 
allocation of an instance, when needed. I think you could solve the 
seeding issue by having a lock-protected LCG for the purpose of seeding 
(only).

For rte_random.c this is hair splitting, but considering this is a 
general pattern, I think the discussion is relevant.

> -Morten
> 

Reply via email to