Mattias,
Please note that most of Thomas' questions are in the interest of the general 
public, considered requests for further documentation.


> Do you have benchmarks results of the modules using such variables
> (power, random, service)?
> It would be interesting to compare time efficiency and memory usage
> before/after, with different number of threads.

IMO, the main benefit is the reduction of waste of CPU data cache (no need for 
RTE_CACHE_GUARD fillers in per-lcore data structures).

Mattias,
The PMU counters library is too new, so I suppose you cannot yet measure the 
primary benefit, but only derived benefits.
If you have any kind of perf data, please provide them.


> > Lcore variables are similar to thread-local storage (TLS, e.g., C11
> > _Thread_local), but decoupling the values' life time with that of the
> > threads.
> 
> In which situation we need values of a dead thread?

Values of dead threads is not the issue here.
This is:
1. TLS variables are allocated and initialized when ANY thread is created, so 
using TLS for variables increases the cost of creating new threads. This is 
relevant for applications that frequently create (and destroy) short-lived 
threads.
2. TLS variables uses memory for ALL threads, regardless if those threads use 
the TLS variables or not. This increases the memory footprint for applications 
that create many (long lived) threads.

Mattias:
Thomas' question might still be relevant...
Is there any situation where the values of the lcore variables are relevant 
after the thread is dead?

> > +Variables with thread-local storage are allocated at the time of
> > +thread creation, and exists until the thread terminates, for every
> > +thread in the process. Only very small object should be allocated in
> > +TLS, since large TLS objects significantly slows down thread
> creation
> > +and may needlessly increase memory footprint for application that
> make
> > +extensive use of unregistered threads.
> 
> I don't understand the relation with non-DPDK threads.

The relationship here is the same as I described above: Using TLS for DPDK 
threads also have a cost for non-DPDK threads.


> [...]
> > +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> 
> With #define RTE_MAX_LCORE_VAR 1048576,
> LCORE_BUFFER_SIZE can be 100MB, right?

Mattias:
You should document what you have explained...
This huge amount of memory is not really consumed, but relies on paging; it is 
only a large virtual address space.
This also makes it somewhat advantageous that the lcore variables don't use 
hugepages.

> > +static size_t offset = RTE_MAX_LCORE_VAR;
> 
> A comment may be useful for this value: it triggers the first alloc?

Mattias:
If you recall, I also had a hard time understanding this design (instead of 
simply comparing lcore_buffer to NULL).
Please add a comment that this not only triggers the first allocation, but also 
additional allocations, if using a lot of memory for lcore variables.

> 
> > +
> > +static void *
> > +lcore_var_alloc(size_t size, size_t align)
> > +{
> > +   void *handle;
> > +   unsigned int lcore_id;
> > +   void *value;
> > +
> > +   offset = RTE_ALIGN_CEIL(offset, align);
> > +
> > +   if (offset + size > RTE_MAX_LCORE_VAR) {
> > +#ifdef RTE_EXEC_ENV_WINDOWS
> > +           lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
> > +                                          RTE_CACHE_LINE_SIZE);
> > +#else
> > +           lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> > +                                        LCORE_BUFFER_SIZE);
> > +#endif
> > +           RTE_VERIFY(lcore_buffer != NULL);
> 
> Please no panic in a lib.
> You can return NULL.

I agree with Mattias design here.
Lcore variables are like RTE_PER_LCORE variables and simple "static" variables.
If the system does not have enough memory for those, the application will not 
be able to deal with it.
Panic early (in this lib) is the correct way to deal with it.


> > +/**
> > + * @file
> > + *
> > + * RTE Lcore variables
> 
> Please don't say "RTE", it is just a prefix.
> You can replace it with "DPDK" if you really want to be specific.

The commit message says:
"Introduce DPDK per-lcore id variables, or lcore variables for short."

Use one of the two here.
I personally prefer the short variant, "Lcore variables", because that is the 
term we are going to use in conversations on the mailing list, documentation 
etc.
The long variant is mainly intended for explaining the library itself.


> > + * Lcore variables cannot and need not be freed.
> 
> I'm curious about that.
> If EAL is closed, and the application continues its life,
> then we want all this memory to be cleaned as well.
> Do you know rte_eal_cleanup()?

Good catch, Thomas! I missed that in my review.
Mattias, it seems you need a chained list of lcore_buffer allocations for this.


> > + *
> > + * The size of an lcore variable's value must be less than the DPDK
> 
> size of variable, not size of value

Initially, I thought the same as Thomas...
It is confusing considering the handle the variable, and its instances having 
values.

However, during the review, Mattias convinced me of its correctness.

And by the way, RTE_PER_LCORE also does it:
https://elixir.bootlin.com/dpdk/v24.07/source/lib/eal/include/rte_per_lcore.h#L48


> 
> > + * build-time constant @c RTE_MAX_LCORE_VAR.
> > + *
> > + * The lcore variable are stored in a series of lcore buffers, which
> 
> variable*s*
> 
> > + * are allocated from the libc heap. Heap allocation failures are
> > + * treated as fatal.
> 
> Why not handling as an error, so the app has a chance to cleanup before
> crash?

Because allocation failures of similar variable types (RTE_PER_LCORE and 
"static") also don't offer a chance to cleanup. Just following the same design 
pattern.


> > +static inline void *
> > +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
> 
> What a long name!
> What about rte_lcore_var() ?

+1


> > + *
> > + * @param lcore_id
> > + *   The lcore id specifying which of the @c RTE_MAX_LCORE value
> > + *   instances should be accessed. The lcore id need not be valid
> > + *   (e.g., may be @ref LCORE_ID_ANY), but in such a case, the
> pointer
> > + *   is also not valid (and thus should not be dereferenced).
> > + * @param handle
> > + *   The lcore variable handle.
> > + */
> > +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle)

I hope to see a lot of these in the code, so keeping it short would be good.

Suggest:
RTE_LCORE_VAR_LCORE(lcore_id, handle)

>       \
> > +   ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
> > +
> > +/**
> > + * Get pointer to lcore variable instance of the current thread.
> > + *
> > + * May only be used by EAL threads and registered non-EAL threads.
> > + */
> > +#define RTE_LCORE_VAR_VALUE(handle) \
> 
> RTE_LCORE_VAR_LOCAL?

Same comment as above, let's keep these popular ones short:
RTE_LCORE_VAR(handle)

> 
> > +   RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
> > +
> > +/**
> > + * Iterate over each lcore id's value for an lcore variable.
> > + *
> > + * @param lcore_id
> > + *   An <code>unsigned int</code> variable successively set to the
> > + *   lcore id of every valid lcore id (up to @c RTE_MAX_LCORE).
> > + * @param value
> > + *   A pointer variable successively set to point to lcore variable
> > + *   value instance of the current lcore id being processed.
> > + * @param handle
> > + *   The lcore variable handle.
> > + */
> > +#define RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle)
>       \
> 
> RTE_LCORE_VAR_FOREACH?

Generally, get rid of the _VALUE postfix.
RTE_PER_LCORE() doesn't have a _VALUE postfix, even though its description 
refers to the variable's value.

Reply via email to