Hi,

I'm trying to make the perf tool play better with PMUs in heterogeneous systems (e.g. big.LITTLE). These patches fix some brokenness that exists today, but they require the addition of a cpumask file to each CPU PMU sysfs directory, and this happens to break prior versions of perf-stat. Due to this, I have not yet added a cpumask attribute to the ARM PMU code.

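To make the role of such a cpumask file concrete, here is a minimal sketch of how a tool might restrict the CPUs it opens events on to those a PMU covers. It assumes the file holds a CPU list such as "0-3" or "0,4-5"; the helper names (parse_cpu_list, count_common) and the MAX_CPUS bound are hypothetical and are not perf's cpu_map API.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CPUS 64 /* arbitrary bound chosen for this sketch */

/*
 * Parse a CPU list such as "0-3" or "0,4-5" (optionally newline-terminated)
 * into a boolean array. Returns 0 on success, -1 on malformed input.
 * Hypothetical helper, for illustration only.
 */
static int parse_cpu_list(const char *s, bool cpus[MAX_CPUS])
{
	memset(cpus, 0, MAX_CPUS * sizeof(bool));

	while (*s && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s)
			return -1;	/* no digits where a CPU number was expected */
		if (*end == '-') {	/* a range "lo-hi" */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (lo < 0 || hi < lo || hi >= MAX_CPUS)
			return -1;
		for (long c = lo; c <= hi; c++)
			cpus[c] = true;

		if (*end == ',')
			end++;		/* move on to the next entry */
		else if (*end && *end != '\n')
			return -1;	/* unexpected trailing character */
		s = end;
	}
	return 0;
}

/*
 * Count the CPUs present in both maps, i.e. the CPUs on which an event
 * for this PMU could be opened out of the set the user asked for.
 */
static int count_common(const bool pmu[MAX_CPUS], const bool req[MAX_CPUS])
{
	int n = 0;

	for (int c = 0; c < MAX_CPUS; c++)
		if (pmu[c] && req[c])
			n++;
	return n;
}
```

For example, with a PMU cpumask of "0,3-5" and a requested set of "0-5", only the four CPUs 0, 3, 4 and 5 would be candidates for opening events; CPUs 1 and 2 would be skipped rather than causing the open to fail.
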
In these systems we have separate logical PMUs for discrete sets of CPUs. For example, on an ARM Juno system we have a logical PMU for all the Cortex-A53 CPUs, and a logical PMU for all the Cortex-A57 CPUs. The logical PMUs allow task-bound events, but reject CPU-bound events for CPUs they do not cover.

Currently perf-record doesn't work for these PMUs, unless forced to use per-thread mmaps. In the absence of a cpumask, it tries to open events on CPUs not supported by a PMU, and gives up. In the presence of a cpumask, it ends up failing to mmap, as the evlist->cpus map contains a different set of CPUs from the evsel->cpus map populated from the cpumask.

Today's perf-stat can profile a task in the absence of a cpumask file, but in the presence of one it ends up blocking after the profiled task exits. Due to an inconsistency between __perf_evsel__open and read_counter, it ends up treating some uninitialised memory as a file descriptor, and typically ends up blocked on stdin. That can be avoided as in patch 1, but existing binaries would be broken by the addition of the cpumask kernel-side.

I understand that we need a sysfs cpumask for the tools to work with uncore PMUs in system-wide mode, but it's not clear to me whether we expect/want a cpumask for the heterogeneous CPU PMU case, given the issue with perf-stat.

Does using a sysfs cpumask to handle (heterogeneous) CPU PMUs feel like the right approach?

If using a cpumask is the right approach, how can I avoid breaking existing perf-stat? Does it make sense to have a differently-named cpumask file that only new tools will look at?

Thanks,
Mark.

Mark Rutland (3):
  perf stat: balance opening and reading events
  perf: util: Add more cpu_map helpers
  perf: util: only open events on CPUs an evsel permits

 tools/perf/builtin-stat.c |  8 ++++++--
 tools/perf/util/cpumap.c  | 14 ++++++++++++--
 tools/perf/util/cpumap.h  |  2 ++
 tools/perf/util/evlist.c  |  9 ++++++++-
 4 files changed, 28 insertions(+), 5 deletions(-)

--
1.9.1