On Mon, 4 Sep 2023 13:57:19 +0200 Mattias Rönnblom <hof...@lysator.liu.se> wrote:
> On 2023-09-04 11:26, Morten Brørup wrote:
> > The per-lcore random state is frequently updated by their individual
> > lcores, so add a cache guard to prevent CPU cache thrashing.
> >
>
> "to prevent false sharing in case the CPU employs a next-N-lines (or
> similar) hardware prefetcher"
>
> In my world, cache thrashing and cache line contention are two different
> things.
>
> Other than that,
> Acked-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>

Could the per-lcore state be thread local? Something like this:

From 3df5e28a7e5589d05e1eade62a0979e84697853d Mon Sep 17 00:00:00 2001
From: Stephen Hemminger <step...@networkplumber.org>
Date: Wed, 6 Sep 2023 09:22:42 -0700
Subject: [PATCH] random: use per lcore state

Move the random number state into thread local storage.
This has several benefits.
 - no false cache sharing from cpu prefetching
 - fixes initialization of random state for non-DPDK threads
 - fixes unsafe usage of random state by non-DPDK threads

The initialization of random number state is done by the
lcore (lazy initialization).

Signed-off-by: Stephen Hemminger <step...@networkplumber.org>
---
 lib/eal/common/rte_random.c | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
index 53636331a27b..62f36038ac52 100644
--- a/lib/eal/common/rte_random.c
+++ b/lib/eal/common/rte_random.c
@@ -19,13 +19,14 @@ struct rte_rand_state {
 	uint64_t z3;
 	uint64_t z4;
 	uint64_t z5;
-} __rte_cache_aligned;
+	uint64_t seed;
+};
 
-/* One instance each for every lcore id-equipped thread, and one
- * additional instance to be shared by all others threads (i.e., all
- * unregistered non-EAL threads).
- */
-static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
+/* Global random seed */
+static uint64_t rte_rand_seed;
+
+/* Per lcore random state. */
+static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
 
 static uint32_t
 __rte_rand_lcg32(uint32_t *seed)
@@ -76,16 +77,14 @@ __rte_srand_lfsr258(uint64_t seed, struct rte_rand_state *state)
 	state->z3 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 4096UL);
 	state->z4 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 131072UL);
 	state->z5 = __rte_rand_lfsr258_gen_seed(&lcg_seed, 8388608UL);
+
+	state->seed = seed;
 }
 
 void
 rte_srand(uint64_t seed)
 {
-	unsigned int lcore_id;
-
-	/* add lcore_id to seed to avoid having the same sequence */
-	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
-		__rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
+	__atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
 }
 
 static __rte_always_inline uint64_t
@@ -119,15 +118,15 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
 static __rte_always_inline
 struct rte_rand_state *__rte_rand_get_state(void)
 {
-	unsigned int idx;
-
-	idx = rte_lcore_id();
+	struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);
+	uint64_t seed;
 
-	/* last instance reserved for unregistered non-EAL threads */
-	if (unlikely(idx == LCORE_ID_ANY))
-		idx = RTE_MAX_LCORE;
+	/* did seed change */
+	seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
+	if (unlikely(seed != rand_state->seed))
+		__rte_srand_lfsr258(seed, rand_state);
 
-	return &rand_states[idx];
+	return rand_state;
 }
 
 uint64_t
-- 
2.39.2
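
[Editor's note: the following is not part of the patch. It is a minimal, standalone sketch of the lazy-reseed pattern the patch uses — a global seed plus thread-local state that reseeds itself when it notices the seed changed. The prng_* names are placeholders, and a trivial xorshift64 generator stands in for DPDK's lfsr258; illustrative only.]

#include <inttypes.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

struct prng_state {
	uint64_t x;	/* generator state (xorshift64 stand-in) */
	uint64_t seed;	/* copy of the global seed this state was built from */
};

static _Atomic uint64_t global_seed = 1;		/* analogous to rte_rand_seed */
static _Thread_local struct prng_state tls_state;	/* analogous to RTE_PER_LCORE state */

/* Analogous to rte_srand(): only publish the new seed. */
static void prng_srand(uint64_t seed)
{
	atomic_store_explicit(&global_seed, seed, memory_order_relaxed);
}

/* Analogous to __rte_rand_get_state(): reseed lazily if the global
 * seed changed since this thread's state was last initialized. */
static struct prng_state *prng_get_state(void)
{
	uint64_t seed = atomic_load_explicit(&global_seed, memory_order_relaxed);

	if (seed != tls_state.seed) {
		tls_state.x = seed ? seed : 1;	/* xorshift state must be non-zero */
		tls_state.seed = seed;
	}
	return &tls_state;
}

/* Analogous to rte_rand(): draw from the per-thread state. */
static uint64_t prng_rand(void)
{
	struct prng_state *s = prng_get_state();

	s->x ^= s->x << 13;
	s->x ^= s->x >> 7;
	s->x ^= s->x << 17;
	return s->x;
}

int main(void)
{
	prng_srand(42);
	printf("%" PRIu64 "\n", prng_rand());
	printf("%" PRIu64 "\n", prng_rand());
	return 0;
}

As in the patch, a thread only picks up a new seed on its next call into the generator, so reseeding is not instantaneous across threads; with relaxed ordering that appears to be the intended trade-off of the lazy approach.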