On 2023-09-06 19:20, Stephen Hemminger wrote:
Move the random number state into thread local storage.

Me and Morten discussed TLS versus other alternatives in some other thread. The downside of TLS that Morten pointed out, from what I recall, is that lazy initialization is *required* (since the number of threads is open-ended), and the data ends up in non-huge page memory. It was also unclear to me what the memory footprint implications would be, would large per-lcore data structures be put in TLS. More specifically, if they would be duplicated across all threads, even non-lcore threads.

None of these issues affect rte_random.c's potential usage of TLS (except lazy [re-]initialization makes things more complicated).

Preferably, there should be one pattern that is usable across all or at least most DPDK modules requiring per-lcore state.

This has a several benefits.
  - no false cache sharing from cpu prefetching
  - fixes initialization of random state for non-DPDK threads

This seems like a non-reason to me. That bug is easily fixed, if it isn't already.

  - fixes unsafe usage of random state by non-DPDK threads


"Makes random number generation MT safe from all threads (including unregistered non-EAL threads)."

With current API semantics you may still register an non-EAL thread, to get MT safe access to this API, so I guess it's more about being more convenient and less error prone, than anything else.

The new MT safety guarantees should be in the API docs as well.

The initialization of random number state is done by the
lcore (lazy initialization).

Signed-off-by: Stephen Hemminger <step...@networkplumber.org>
---
  lib/eal/common/rte_random.c | 38 +++++++++++++++++++------------------
  1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
index 53636331a27b..9657adf6ad3b 100644
--- a/lib/eal/common/rte_random.c
+++ b/lib/eal/common/rte_random.c
@@ -19,13 +19,14 @@ struct rte_rand_state {
        uint64_t z3;
        uint64_t z4;
        uint64_t z5;
-} __rte_cache_aligned;
+       uint64_t seed;
+};
-/* One instance each for every lcore id-equipped thread, and one
- * additional instance to be shared by all others threads (i.e., all
- * unregistered non-EAL threads).
- */
-static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
+/* Global random seed */
+static uint64_t rte_rand_seed;
+
+/* Per lcore random state. */
+static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
static uint32_t
  __rte_rand_lcg32(uint32_t *seed)
@@ -81,11 +82,7 @@ __rte_srand_lfsr258(uint64_t seed, struct rte_rand_state 
*state)
  void
  rte_srand(uint64_t seed)
  {
-       unsigned int lcore_id;
-
-       /* add lcore_id to seed to avoid having the same sequence */
-       for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
-               __rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
+       __atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
  }
static __rte_always_inline uint64_t
@@ -119,15 +116,18 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
  static __rte_always_inline
  struct rte_rand_state *__rte_rand_get_state(void)
  {
-       unsigned int idx;
+       struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);

There should really be a RTE_PER_THREAD, an alias to RTE_PER_LCORE, to cover this usage. Or just use __thread (or _Thread_local?).

+       uint64_t seed;
- idx = rte_lcore_id();
+       seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
+       if (unlikely(seed != rand_state->seed)) {
+               rand_state->seed = seed;

Re-seeding should restart the series, on all lcores. There's nothing preventing the user from re-seeding the machinery repeatedly, with the same seed. Seems like an unusual, but still valid, use case, if you run repeated tests of some sort.

Use a seqlock? :) I guess you need a seed generation number as well (e.g., is this the first time you seed with X, or the second one, etc.)

-       /* last instance reserved for unregistered non-EAL threads */
-       if (unlikely(idx == LCORE_ID_ANY))
-               idx = RTE_MAX_LCORE;
+               seed += rte_thread_self().opaque_id;
+               __rte_srand_lfsr258(seed, rand_state);
+       }
- return &rand_states[idx];
+       return rand_state;
  }
uint64_t
@@ -227,7 +227,9 @@ RTE_INIT(rte_rand_init)
  {
        uint64_t seed;
- seed = __rte_random_initial_seed();
+       do
+               seed = __rte_random_initial_seed();
+       while (seed == 0);

Might be worth a comment why seed 0 is not allowed. Alternatively, use some other way of signaling __rte_srand_lfsr258() must be called.

        rte_srand(seed);
  }

Reply via email to