> From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com] > Sent: Monday, 16 September 2024 16.02 > > > Introduce DPDK per-lcore id variables, or lcore variables for short. > > > > An lcore variable has one value for every current and future lcore > > id-equipped thread. > > > > The primary <rte_lcore_var.h> use case is for statically allocating > > small, frequently-accessed data structures, for which one instance > > should exist for each lcore. > > > > Lcore variables are similar to thread-local storage (TLS, e.g., C11 > > _Thread_local), but decoupling the values' life time with that of the > > threads. > > > > Lcore variables are also similar in terms of functionality provided by > > FreeBSD kernel's DPCPU_*() family of macros and the associated > > build-time machinery. DPCPU uses linker scripts, which effectively > > prevents the reuse of its, otherwise seemingly viable, approach. > > > > The currently-prevailing way to solve the same problem as lcore > > variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized > > array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of > > lcore variables over this approach is that data related to the same > > lcore now is close (spatially, in memory), rather than data used by > > the same module, which in turn avoid excessive use of padding, > > polluting caches with unused data. > > > > Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com> > > Acked-by: Morten Brørup <m...@smartsharesystems.com> > > LGTM in general, few small questions (mostly nits), see below. > > > --- /dev/null > > +++ b/lib/eal/common/eal_common_lcore_var.c > > @@ -0,0 +1,78 @@ > > +/* SPDX-License-Identifier: BSD-3-Clause > > + * Copyright(c) 2024 Ericsson AB > > + */ > > + > > +#include <inttypes.h> > > +#include <stdlib.h> > > + > > +#ifdef RTE_EXEC_ENV_WINDOWS > > +#include <malloc.h> > > +#endif > > + > > +#include <rte_common.h> > > +#include <rte_debug.h> > > +#include <rte_log.h> > > + > > +#include <rte_lcore_var.h> > > + > > +#include "eal_private.h" > > + > > +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE) > > + > > +static void *lcore_buffer; > > +static size_t offset = RTE_MAX_LCORE_VAR; > > + > > +static void * > > +lcore_var_alloc(size_t size, size_t align) > > +{ > > + void *handle; > > + void *value; > > + > > + offset = RTE_ALIGN_CEIL(offset, align); > > + > > + if (offset + size > RTE_MAX_LCORE_VAR) { > > +#ifdef RTE_EXEC_ENV_WINDOWS > > + lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE, > > + RTE_CACHE_LINE_SIZE); > > +#else > > + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE, > > + LCORE_BUFFER_SIZE); > > +#endif > > Don't remember did that question already arise or not: > For debugging and health-checking purposes - would it make sense to link > all > lcore_buffer values into a linked list? > So user/developer/some tool can walk over it to check that provided > handle value > is really a valid lcore_var, etc.
Nice idea. Such a list, along with an accompanying dump function can be added later. > > > + RTE_VERIFY(lcore_buffer != NULL); > > + > > + offset = 0; > > + } > > + > > + handle = RTE_PTR_ADD(lcore_buffer, offset); > > + > > + offset += size; > > + > > + RTE_LCORE_VAR_FOREACH_VALUE(value, handle) > > + memset(value, 0, size); > > + > > + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with > a " > > + "%"PRIuPTR"-byte alignment", size, align); > > + > > + return handle; > > +} > > + > > +void * > > +rte_lcore_var_alloc(size_t size, size_t align) > > +{ > > + /* Having the per-lcore buffer size aligned on cache lines > > + * assures as well as having the base pointer aligned on cache > > + * size assures that aligned offsets also translate to alipgned > > + * pointers across all values. > > + */ > > + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0); > > + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE); > > + RTE_ASSERT(size <= RTE_MAX_LCORE_VAR); > > + > > + /* '0' means asking for worst-case alignment requirements */ > > + if (align == 0) > > + align = alignof(max_align_t); > > + > > + RTE_ASSERT(rte_is_power_of_2(align)); > > + > > + return lcore_var_alloc(size, align); > > +} > > .... > > > diff --git a/lib/eal/include/rte_lcore_var.h > b/lib/eal/include/rte_lcore_var.h > > new file mode 100644 > > index 0000000000..ec3ab714a8 > > --- /dev/null > > +++ b/lib/eal/include/rte_lcore_var.h > > ... > > > +/** > > + * Given the lcore variable type, produces the type of the lcore > > + * variable handle. > > + */ > > +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \ > > + type * > > + > > +/** > > + * Define an lcore variable handle. > > + * > > + * This macro defines a variable which is used as a handle to access > > + * the various instances of a per-lcore id variable. > > + * > > + * The aim with this macro is to make clear at the point of > > + * declaration that this is an lcore handle, rather than a regular > > + * pointer. > > + * > > + * Add @b static as a prefix in case the lcore variable is only to be > > + * accessed from a particular translation unit. > > + */ > > +#define RTE_LCORE_VAR_HANDLE(type, name) \ > > + RTE_LCORE_VAR_HANDLE_TYPE(type) name > > + > > +/** > > + * Allocate space for an lcore variable, and initialize its handle. > > + * > > + * The values of the lcore variable are initialized to zero. > > + */ > > +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \ > > + handle = rte_lcore_var_alloc(size, align) > > + > > +/** > > + * Allocate space for an lcore variable, and initialize its handle, > > + * with values aligned for any type of object. > > + * > > + * The values of the lcore variable are initialized to zero. > > + */ > > +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \ > > + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0) > > + > > +/** > > + * Allocate space for an lcore variable of the size and alignment > requirements > > + * suggested by the handle pointer type, and initialize its handle. > > + * > > + * The values of the lcore variable are initialized to zero. > > + */ > > +#define RTE_LCORE_VAR_ALLOC(handle) > > \ > > + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \ > > + alignof(typeof(*(handle)))) > > + > > +/** > > + * Allocate an explicitly-sized, explicitly-aligned lcore variable by > > + * means of a @ref RTE_INIT constructor. > > + * > > + * The values of the lcore variable are initialized to zero. > > + */ > > +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \ > > + RTE_INIT(rte_lcore_var_init_ ## name) \ > > + { \ > > + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \ > > + } > > + > > +/** > > + * Allocate an explicitly-sized lcore variable by means of a @ref > > + * RTE_INIT constructor. > > + * > > + * The values of the lcore variable are initialized to zero. > > + */ > > +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \ > > + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0) > > + > > +/** > > + * Allocate an lcore variable by means of a @ref RTE_INIT > constructor. > > + * > > + * The values of the lcore variable are initialized to zero. > > + */ > > +#define RTE_LCORE_VAR_INIT(name) \ > > + RTE_INIT(rte_lcore_var_init_ ## name) \ > > + { \ > > + RTE_LCORE_VAR_ALLOC(name); \ > > + } > > + > > +/** > > + * Get void pointer to lcore variable instance with the specified > > + * lcore id. > > + * > > + * @param lcore_id > > + * The lcore id specifying which of the @c RTE_MAX_LCORE value > > + * instances should be accessed. The lcore id need not be valid > > + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the > pointer > > + * is also not valid (and thus should not be dereferenced). > > + * @param handle > > + * The lcore variable handle. > > + */ > > +static inline void * > > +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle) > > +{ > > + return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR); > > +} > > + > > +/** > > + * Get pointer to lcore variable instance with the specified lcore > id. > > + * > > + * @param lcore_id > > + * The lcore id specifying which of the @c RTE_MAX_LCORE value > > + * instances should be accessed. The lcore id need not be valid > > + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the > pointer > > + * is also not valid (and thus should not be dereferenced). > > + * @param handle > > + * The lcore variable handle. > > + */ > > +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle) > > \ > > + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle)) > > + > > +/** > > + * Get pointer to lcore variable instance of the current thread. > > + * > > + * May only be used by EAL threads and registered non-EAL threads. > > + */ > > +#define RTE_LCORE_VAR_VALUE(handle) \ > > + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle) > > Would it make sense to check that rte_lcore_id() != LCORE_ID_ANY? > After all if people do not want this extra check, they can probably use > RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle) > explicitly. Not generally. I prefer keeping it brief. We could add a _SAFE variant with this extra check, like LIST_FOREACH has LIST_FOREACH_SAFE (although for a different purpose). Come to think of it: In the name of brevity, consider renaming RTE_LCORE_VAR_VALUE to RTE_LCORE_VAR. (And RTE_LCORE_VAR_FOREACH_VALUE to RTE_LCORE_VAR_FOREACH.) We want to see these everywhere in the code. > > > + > > +/** > > + * Iterate over each lcore id's value for an lcore variable. > > + * > > + * @param value > > + * A pointer successively set to point to lcore variable value > > + * corresponding to every lcore id (up to @c RTE_MAX_LCORE). > > + * @param handle > > + * The lcore variable handle. > > + */ > > +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \ > > + for (unsigned int lcore_id = \ > > + (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); > \ > > + lcore_id < RTE_MAX_LCORE; \ > > + lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, > handle)) > > Might be a bit better (and safer) to make lcore_id a macro parameter? > I.E.: > define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \ > for ((lcore_id) = ... The same thought have struck me, so I checked the scope of lcore_id. The scope of lcore_id remains limited to the for loop, i.e. it is available inside the for loop, but not after it. IMO this suffices, and lcore_id doesn't need to be a macro parameter. Maybe renaming lcore_id to _lcore_id would be an improvement, if lcore_id is already defined and used for other purposes within the for loop.