On Thu, Sep 12, 2024 at 6:50 PM Mattias Rönnblom <hof...@lysator.liu.se> wrote:
>
> On 2024-09-12 15:09, Jerin Jacob wrote:
> > On Thu, Sep 12, 2024 at 2:34 PM Mattias Rönnblom
> > <mattias.ronnb...@ericsson.com> wrote:
> >>
> >> Add basic micro benchmark for lcore variables, in an attempt to assure
> >> that the overhead isn't significantly greater than alternative
> >> approaches, in scenarios where the benefits aren't expected to show up
> >> (i.e., when plenty of cache is available compared to the working set
> >> size of the per-lcore data).
> >>
> >> Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> >> ---
> >>   app/test/meson.build           |   1 +
> >>   app/test/test_lcore_var_perf.c | 160 +++++++++++++++++++++++++++++++++
> >>   2 files changed, 161 insertions(+)
> >>   create mode 100644 app/test/test_lcore_var_perf.c
> >
> >
> >> +static double
> >> +benchmark_access_method(void (*init_fun)(void), void (*update_fun)(void))
> >> +{
> >> +       uint64_t i;
> >> +       uint64_t start;
> >> +       uint64_t end;
> >> +       double latency;
> >> +
> >> +       init_fun();
> >> +
> >> +       start = rte_get_timer_cycles();
> >> +
> >> +       for (i = 0; i < ITERATIONS; i++)
> >> +               update_fun();
> >> +
> >> +       end = rte_get_timer_cycles();
> >
> > Use precise variant. rte_rdtsc_precise() or so to be accurate
>
> With 1e7 iterations, do you need rte_rdtsc_precise()? I suspect not.

I was thinking of it the other way around: with 1e7 iterations, the cost of the additional barrier in the precise variant is amortized, and we get more _deterministic_ behavior, especially if we print cycles and need to catch regressions.

Furthermore, you may consider replacing rte_random() in the fast path with a running number or similar, if its cost in cycles is not deterministic.
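
To illustrate both points, here is a minimal, hedged sketch rather than a proposed patch. It keeps the shape of the quoted benchmark_access_method() loop, reads the timer only once before and once after the loop (so any barrier cost is paid twice per 1e7 iterations), and feeds the update function from a running number instead of a random source. Everything named here is hypothetical: read_timer() is a portable stand-in built on clock_gettime() so the sketch compiles without DPDK, and in the real test that call would be rte_rdtsc_precise(); next_value(), update_with_running_number(), benchmark_update() and sink are illustration-only names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#define ITERATIONS 10000000ULL /* 1e7, as in the discussion above */

/*
 * Hypothetical stand-in for rte_rdtsc_precise(): returns a monotonic
 * timestamp in nanoseconds instead of TSC cycles.
 */
static uint64_t
read_timer(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Running number used in place of a random value in the fast path. */
static uint64_t running_number;

/* Volatile sink so the compiler cannot optimize the update away. */
static volatile uint64_t sink;

static uint64_t
next_value(void)
{
	return running_number++;
}

/* Example update function: constant, predictable work per call. */
static void
update_with_running_number(void)
{
	sink += next_value();
}

/* Returns the average cost per update, in timer units (ns here). */
static double
benchmark_update(void (*update_fun)(void))
{
	uint64_t i;
	uint64_t start;
	uint64_t end;

	start = read_timer(); /* read once; cost amortized over the loop */

	for (i = 0; i < ITERATIONS; i++)
		update_fun();

	end = read_timer();

	return (double)(end - start) / ITERATIONS;
}
```

Calling benchmark_update(update_with_running_number) returns the average per-update cost. Since the input sequence is identical on every run, run-to-run differences should come from the access method under test and not from the value generator.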