> From: Tomasz Duszynski [mailto:tduszyn...@marvell.com]
> Sent: Tuesday, 29 November 2022 10.28
>
> This series adds self monitoring support i.e. allows to configure and
> read performance measurement unit (PMU) counters in runtime without
> using perf utility. This has certain advantages when application runs
> on isolated cores with nohz_full kernel parameter.
>
> Events can be read directly using rte_pmu_read() or using dedicated
> tracepoint rte_eal_trace_pmu_read(). The latter will cause events to be
> stored inside CTF file.
>
> By design, all enabled events are grouped together and the same group
> is attached to lcores that use self monitoring functionality.
>
> Events are enabled by names, which need to be read from standard
> location under sysfs i.e.
>
> /sys/bus/event_source/devices/PMU/events
>
> where PMU is a core pmu i.e. one measuring cpu events. As of today
> raw events are not supported.

Hi Tomasz,

I am very interested in this patch series for fast path profiling purposes. (Not using EAL trace, but our proprietary profiler.)

However, rte_pmu_read() seems quite long-winded compared to rte_pmu_pmc_read(). Perhaps I am just worrying too much, so I will ask: What is the performance cost of using rte_pmu_read(), compared to rte_pmu_pmc_read(), in the fast path?

If there is a non-negligible difference, could you please provide an example of how to configure PMU events and use rte_pmu_pmc_read() in an application? I would primarily be interested in data cache misses and branch mispredictions, but feel free to make your own choices for the example.
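
For reference, the event names would have to be discovered from the sysfs directory quoted above. Below is a minimal sketch of that lookup in plain C. It makes no DPDK calls, and the helper names pmu_events_dir() and pmu_list_events() are hypothetical, made up for illustration and not part of this series.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<base>/<pmu>/events" into buf.
 * Returns 0 on success, -1 if the path does not fit into buf.
 */
int pmu_events_dir(char *buf, size_t len, const char *base, const char *pmu)
{
	int n;

	assert(buf != NULL && base != NULL && pmu != NULL);

	n = snprintf(buf, len, "%s/%s/events", base, pmu);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write one event name per line to out.
 * Returns the number of names written, or -1 if dir cannot be opened.
 */
int pmu_list_events(const char *dir, FILE *out)
{
	struct dirent *entry;
	int count = 0;
	DIR *d;

	assert(dir != NULL && out != NULL);

	d = opendir(dir);
	if (d == NULL)
		return -1;

	while ((entry = readdir(d)) != NULL) {
		/* Skip ".", ".." and any other dot-prefixed entries. */
		if (entry->d_name[0] == '.')
			continue;
		fprintf(out, "%s\n", entry->d_name);
		count++;
	}

	closedir(d);
	return count;
}
```

On x86 the core PMU is typically named cpu, so I would call pmu_events_dir(buf, sizeof(buf), "/sys/bus/event_source/devices", "cpu") and then pmu_list_events(buf, stdout). Other architectures use different PMU names, so that argument is an assumption on my side.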