> From: Morten Brørup [mailto:m...@smartsharesystems.com]
> Sent: Wednesday, 14 December 2022 11.41
> 
> +CC: Mattias, see my comment below about per-thread constructor for
> this
> 
> > From: Tomasz Duszynski [mailto:tduszyn...@marvell.com]
> > Sent: Wednesday, 14 December 2022 10.39
> >
> > Hello Morten,
> >
> > Thanks for review. Answers inline.
> >
> > [...]
> >
> > > > +__rte_experimental
> > > > +static __rte_always_inline uint64_t
> > > > +rte_pmu_read(int index)
> 
> The index type can be changed from int to uint32_t. This also
> eliminates the "(index < 0" part of the comparison further below in
> this function.
> 
> > > > +{
> > > > +       int lcore_id = rte_lcore_id();
> > > > +       struct rte_pmu_event_group *group;
> > > > +       int ret;
> > > > +
> > > > +       if (!rte_pmu)
> > > > +               return 0;
> > > > +
> > > > +       group = &rte_pmu->group[lcore_id];
> > > > +       if (!group->enabled) {
> 
> Optimized: if (unlikely(!group->enabled)) {
> 
> > > > +               ret = rte_pmu_enable_group(lcore_id);
> > > > +               if (ret)
> > > > +                       return 0;
> > > > +
> > > > +               group->enabled = true;
> > > > +       }
> > >
> > > Why is the group not enabled in the setup function,
> > rte_pmu_add_event(), instead of here, in the
> > > hot path?
> > >
> >
> > When this is executed for the very first time then cpu will have
> > obviously more work to do
> > but afterwards setup path is not taken hence much less cpu cycles are
> > required.
> >
> > Setup is executed by main lcore solely, before lcores are executed
> > hence some info passed to
> > SYS_perf_event_open ioctl() is missing, pid (via rte_gettid()) being
> an
> > example here.
> 
> OK. Thank you for the explanation. Since impossible at setup, it has to
> be done at runtime.
> 
> @Mattias: Another good example of something that would belong in per-
> thread constructors, as my suggested feature creep in [1].
> 
> [1]:
> http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35D87553@smarts
> erver.smartshare.dk/

I just realized that this initialization is per-lcore (not per thread), so you 
can use rte_lcore_callback_register() to register a per-lcore initialization 
function, and move rte_pmu_enable_group(lcore_id) there.

-Morten

Reply via email to