Hi,

I'm trying to make the perf tool play better with PMUs in heterogeneous systems
(e.g. big.LITTLE). These patches fix some brokenness that exists today, but
they require the addition of a cpumask file to each CPU PMU sysfs directory,
and this happens to break prior versions of perf-stat. Due to this, I have not
yet added a cpumask attribute to the ARM PMU code.

In these systems we have separate logical PMUs for discrete sets of CPUs. For
example, on an ARM Juno system we have a logical PMU for all Cortex-A53 CPUs,
and a logical PMU for all the Cortex-A57 CPUs. The logical PMUs allow
task-bound events, but reject CPU-bound events for CPUs they do not cover.

Currently perf-record doesn't work for these PMUs, unless forced to use
per-thread mmaps. In the absence of a cpumask, it tries to open events on CPUs
not supported by a PMU, and gives up. In the presence of a cpumask, it ends up
failing to mmap, as the evlist->cpus map contains a different set of cpus from
the evsel->cpus map populated from the cpumask.
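
As a rough illustration of the idea behind patch 3 (this is not the code in
the patches), the sketch below parses a cpulist string of the kind a sysfs
cpumask file would hold and opens events only on CPUs present in both maps.
The names parse_cpu_list() and cpus_to_open() are hypothetical, and the
64-bit mask assumes fewer than 64 CPUs purely to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a cpulist such as "0,3-5\n" into a bitmask.
 * Returns 0 on success, -1 on malformed input or a CPU number >= 64.
 */
static int parse_cpu_list(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	assert(s != NULL && mask != NULL);

	while (*s && *s != '\n') {
		char *end;
		unsigned long lo, hi;

		lo = strtoul(s, &end, 10);
		if (end == s)
			return -1;
		hi = lo;
		s = end;

		/* Optional "-hi" part of a range. */
		if (*s == '-') {
			s++;
			hi = strtoul(s, &end, 10);
			if (end == s)
				return -1;
			s = end;
		}

		if (hi < lo || hi >= 64)
			return -1;

		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			m |= UINT64_C(1) << cpu;

		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}

	*mask = m;
	return 0;
}

/*
 * Hypothetical helper: only CPUs present in both the evlist map and the
 * evsel (PMU cpumask) map should have events opened and mmapped.
 */
static uint64_t cpus_to_open(uint64_t evlist_cpus, uint64_t evsel_cpus)
{
	return evlist_cpus & evsel_cpus;
}
```

With an evlist covering CPUs 0-5 and a PMU cpumask of "1-2", cpus_to_open()
leaves only CPUs 1 and 2, so nothing is opened or mmapped for CPUs the PMU
does not cover.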

Today's perf-stat can profile a task in the absence of a cpumask file, but in
the presence of one ends up blocking after the profiled task exits. Due to an
inconsistency between __perf_evsel__open and read_counter, it ends up treating
some uninitialised memory as a file descriptor, and typically ends up blocked
on stdin. That can be avoided as in patch 1, but existing binaries would be
broken by the addition of the cpumask kernel-side.
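
To make the failure mode concrete, here is a simplified, self-contained model
of the open/read imbalance. It is not the perf code: struct counter_fds,
fds_init(), fds_open() and fds_readable() are hypothetical names, and the
"descriptors" are fake values rather than results of perf_event_open(). The
point is that the reader must only visit slots the opener filled; a stale
slot that happens to hold 0 would refer to stdin.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS	8
#define FD_UNOPENED	(-1)

/* Hypothetical per-counter table with one descriptor slot per CPU. */
struct counter_fds {
	int fd[MAX_CPUS];
};

/* Mark every slot as unopened so a stale slot is never mistaken for an fd. */
static void fds_init(struct counter_fds *c)
{
	assert(c != NULL);

	for (int cpu = 0; cpu < MAX_CPUS; cpu++)
		c->fd[cpu] = FD_UNOPENED;
}

/*
 * Pretend to open an event on each CPU in open_mask. Fake descriptors
 * (100 + cpu) stand in for real ones. Returns the number opened.
 */
static int fds_open(struct counter_fds *c, unsigned int open_mask)
{
	int opened = 0;

	for (int cpu = 0; cpu < MAX_CPUS; cpu++) {
		if (open_mask & (1u << cpu)) {
			c->fd[cpu] = 100 + cpu;
			opened++;
		}
	}

	return opened;
}

/*
 * Count the slots a reader may safely visit when walking read_mask.
 * Slots that were never opened are skipped instead of being read.
 */
static int fds_readable(const struct counter_fds *c, unsigned int read_mask)
{
	int n = 0;

	for (int cpu = 0; cpu < MAX_CPUS; cpu++) {
		if (!(read_mask & (1u << cpu)))
			continue;
		if (c->fd[cpu] == FD_UNOPENED)
			continue;
		n++;
	}

	return n;
}
```

If events are opened with mask 0x06 (CPUs 1-2) but read with mask 0x3f
(CPUs 0-5), fds_readable() still reports two slots, because the four slots
that were never opened are skipped rather than treated as descriptors.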

I understand that we need a sysfs cpumask for the tools to work with uncore
PMUs in system-wide mode, but it's not clear to me whether we expect/want a
cpumask for the heterogeneous CPU PMU case, given the issue with perf-stat.

Does using a sysfs cpumask to handle (heterogeneous) CPU PMUs feel like the
right approach?

If using a cpumask is the right approach, how can I avoid breaking existing
perf-stat? Does it make sense to have a differently-named cpumask file that
only new tools will look at?

Thanks,
Mark.

Mark Rutland (3):
  perf stat: balance opening and reading events
  perf: util: Add more cpu_map helpers
  perf: util: only open events on CPUs an evsel permits

 tools/perf/builtin-stat.c |  8 ++++++--
 tools/perf/util/cpumap.c  | 14 ++++++++++++--
 tools/perf/util/cpumap.h  |  2 ++
 tools/perf/util/evlist.c  |  9 ++++++++-
 4 files changed, 28 insertions(+), 5 deletions(-)

-- 
1.9.1
