> From: Tomasz Duszynski [mailto:tduszyn...@marvell.com]
> Sent: Tuesday, 29 November 2022 10.28
> 
> This series adds self-monitoring support, i.e. it allows configuring
> and reading performance measurement unit (PMU) counters at runtime
> without using the perf utility. This has certain advantages when an
> application runs on isolated cores with the nohz_full kernel parameter.
> 
> Events can be read directly using rte_pmu_read() or using the dedicated
> tracepoint rte_eal_trace_pmu_read(). The latter will cause events to be
> stored inside a CTF file.
> 
> By design, all enabled events are grouped together and the same group
> is attached to lcores that use the self-monitoring functionality.
> 
> Events are enabled by name. The names need to be read from the
> standard location under sysfs, i.e.
> 
> /sys/bus/event_source/devices/PMU/events
> 
> where PMU is a core PMU, i.e. one measuring CPU events. As of today
> raw events are not supported.

Hi Tomasz,

I am very interested in this patch series for fast path profiling purposes. 
(Not using EAL trace, but our proprietary profiler.)

However, it seems that rte_pmu_read() is quite long-winded compared to
rte_pmu_pmc_read().

But perhaps I am just worrying too much, so I will ask: What is the
performance cost of using rte_pmu_read(), compared to rte_pmu_pmc_read(),
in the fast path?

If there is a non-negligible difference, could you please provide an example of 
how to configure PMU events and use rte_pmu_pmc_read() in an application?

I would primarily be interested in data cache misses and branch mispredictions. 
But feel free to make your own choices for the example.
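
For reference, the event names available for such an example could be listed with plain POSIX directory calls. The following is a minimal sketch and not part of the patch series: the helper name list_pmu_events() is hypothetical, and it assumes that files containing a dot (such as "name.scale" or "name.unit") are companion files rather than event names.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a DPDK API): print every event name found in
 * a sysfs events directory, one per line, and return how many were
 * printed. Returns -1 if the directory cannot be opened.
 */
static int list_pmu_events(const char *events_dir, FILE *out)
{
    DIR *dir;
    struct dirent *entry;
    int count = 0;

    assert(events_dir != NULL && out != NULL);

    dir = opendir(events_dir);
    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        /*
         * Skip ".", ".." and hidden entries, plus anything with a dot
         * in its name (assumed to be ".scale"/".unit" companion files).
         */
        if (strchr(entry->d_name, '.') != NULL)
            continue;
        fprintf(out, "%s\n", entry->d_name);
        count++;
    }

    closedir(dir);
    return count;
}
```

On x86 Linux the core PMU directory is typically named cpu, so a call such as list_pmu_events("/sys/bus/event_source/devices/cpu/events", stdout) would be the natural starting point; on other architectures the directory name differs and has to be looked up under /sys/bus/event_source/devices first.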
