> From: Morten Brørup [mailto:m...@smartsharesystems.com] > Sent: Wednesday, 14 December 2022 11.41 > > +CC: Mattias, see my comment below about per-thread constructor for > this > > > From: Tomasz Duszynski [mailto:tduszyn...@marvell.com] > > Sent: Wednesday, 14 December 2022 10.39 > > > > Hello Morten, > > > > Thanks for review. Answers inline. > > > > [...] > > > > > > +__rte_experimental > > > > +static __rte_always_inline uint64_t > > > > +rte_pmu_read(int index) > > The index type can be changed from int to uint32_t. This also > eliminates the "(index < 0" part of the comparison further below in > this function. > > > > > +{ > > > > + int lcore_id = rte_lcore_id(); > > > > + struct rte_pmu_event_group *group; > > > > + int ret; > > > > + > > > > + if (!rte_pmu) > > > > + return 0; > > > > + > > > > + group = &rte_pmu->group[lcore_id]; > > > > + if (!group->enabled) { > > Optimized: if (unlikely(!group->enabled)) { > > > > > + ret = rte_pmu_enable_group(lcore_id); > > > > + if (ret) > > > > + return 0; > > > > + > > > > + group->enabled = true; > > > > + } > > > > > > Why is the group not enabled in the setup function, > > rte_pmu_add_event(), instead of here, in the > > > hot path? > > > > > > > When this is executed for the very first time then cpu will have > > obviously more work to do > > but afterwards setup path is not taken hence much less cpu cycles are > > required. > > > > Setup is executed by main lcore solely, before lcores are executed > > hence some info passed to > > SYS_perf_event_open ioctl() is missing, pid (via rte_gettid()) being > an > > example here. > > OK. Thank you for the explanation. Since impossible at setup, it has to > be done at runtime. > > @Mattias: Another good example of something that would belong in per- > thread constructors, as my suggested feature creep in [1]. > > [1]: > http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35D87553@smarts > erver.smartshare.dk/
I just realized that this initialization is per-lcore (not per thread), so you can use rte_lcore_callback_register() to register a per-lcore initialization function, and move rte_pmu_enable_group(lcore_id) there. -Morten