Re: [PATCH v8 4/7] perf/tools: Enhance JSON/metric infrastructure to handle "?"

2020-05-01 Thread Ian Rogers
On Wed, Apr 1, 2020 at 1:35 PM Kajol Jain  wrote:
>
> The patch enhances the current metric infrastructure to handle "?" in the
> metric expression. The "?" can be used for parameters whose value is not
> known while creating metric events and which can be replaced later at
> runtime with the proper value. It also adds flexibility to create multiple
> events out of a single metric event added in the json file.
>
> The patch adds the function 'arch_get_runtimeparam', which is an arch-specific
> function that returns the count of metric events that need to be created.
> By default it returns 1.

Sorry for the slow response, I was trying to understand this patch in
relation to the PMU aliases to see if there was an overlap - I'm still
not sure. This is now merged so I'm just commenting wrt possible
future cleanup. I defer to the maintainers on how this should be
organized. At the metric level, this problem reminds me of both
#smt_on and LLC_MISSES.PCIE_WRITE on cascade lake. #smt_on adds a
degree of CPU specific behavior to an expression.
LLC_MISSES.PCIE_WRITE uses .part0 ... part3 to combine separate but
related counters.
The symbols that the metrics parse are then passed to parse-event. You
don't change parse-event, because metricgroup replaces the '?' with a
value derived from /devices/hv_24x7/interface/sockets; the values
actually passed run from 0 to that value - 1.

It seems unfortunate to overload the meaning of runtime with a value
read from /devices/hv_24x7/interface/sockets and plumbing this value
around is quite a bit of noise for everything but this use-case. I
kind of wish we could do something like:

for i in 0, read("/devices/hv_24x7/interface/sockets"):
  hv_24x7/pm_pb_cyc,chip=$i

in the metric code. I have some patches to send related to metric
groups and I think this will be an active area of development for a
while as I think there are some open questions on the organization of
the code.
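
To make the loop above concrete, here is a rough shell sketch of the
expansion. It is not the perf implementation: expand_events and its two
arguments are made up for illustration, and the default of 1 only mirrors
what arch_get_runtimeparam() does when the sysfs file can't be read.

```shell
# Sketch only: expand a '?' placeholder into one event per chip.
# expand_events, template and count_file are hypothetical names.
expand_events() {
	template=$1	# e.g. 'hv_24x7/pm_pb_cyc,chip=?/'
	count_file=$2	# e.g. /sys/devices/hv_24x7/interface/sockets

	# Read the socket count; fall back to 1 if missing or not a number.
	count=$(cat "$count_file" 2>/dev/null)
	case $count in
	'' | *[!0-9]*) count=1 ;;
	esac

	# Nothing to expand if the template has no '?'.
	case $template in
	*\?*) ;;
	*) printf '%s\n' "$template"; return 0 ;;
	esac

	prefix=${template%%\?*}	# text before the first '?'
	suffix=${template#*\?}	# text after the first '?'
	i=0
	while [ "$i" -lt "$count" ]; do
		printf '%s\n' "${prefix}${i}${suffix}"
		i=$((i + 1))
	done
}
```

Called as expand_events 'hv_24x7/pm_pb_cyc,chip=?/' <file>, with a file
containing 2, this would print the chip=0 and chip=1 events, one per line.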

Thanks,
Ian

> This infrastructure is needed for hv_24x7 socket/chip level events.
> "hv_24x7" chip level events need the specific chip-id for which the
> data is requested. The function 'arch_get_runtimeparam' is implemented
> in header.c and extracts the number of sockets from the sysfs file
> "sockets" under "/sys/devices/hv_24x7/interface/".
>
> With this patch we are basically trying to create as many metric events
> as defined by runtime_param.
>
> For that, a loop is added in the function 'metricgroup__add_metric',
> which creates multiple events at run time depending on the return value of
> 'arch_get_runtimeparam' and merges those events into 'group_list'.
>
> To achieve that we pass this parameter value as part of the
> `expr__find_other` function and replace the "?" present in the metric
> expression with this value.
>
> In our json file there is going to be a single metric event, out of
> which we create multiple events.
>
> To understand which data count belongs to which parameter value,
> we also print the param value in the generic_metric function.
>
> For example,
> command:# ./perf stat  -M PowerBUS_Frequency -C 0 -I 1000
>  1.000101867  9,356,933  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 GHz  PowerBUS_Frequency_0
>  1.000101867  9,366,134  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 GHz  PowerBUS_Frequency_1
>  2.000314878  9,365,868  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 GHz  PowerBUS_Frequency_0
>  2.000314878  9,366,092  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 GHz  PowerBUS_Frequency_1
>
> So, here _0 and _1 after PowerBUS_Frequency specify parameter value.
>
> Signed-off-by: Kajol Jain 
> ---
>  tools/perf/arch/powerpc/util/header.c |  8 
>  tools/perf/tests/expr.c   |  8 
>  tools/perf/util/expr.c| 11 ++-
>  tools/perf/util/expr.h|  5 +++--
>  tools/perf/util/expr.l| 27 +++---
>  tools/perf/util/metricgroup.c | 28 ---
>  tools/perf/util/metricgroup.h |  2 ++
>  tools/perf/util/stat-shadow.c | 17 ++--
>  8 files changed, 79 insertions(+), 27 deletions(-)
>
> diff --git a/tools/perf/arch/powerpc/util/header.c 
> b/tools/perf/arch/powerpc/util/header.c
> index 3b4cdfc5efd6..d4870074f14c 100644
> --- a/tools/perf/arch/powerpc/util/header.c
> +++ b/tools/perf/arch/powerpc/util/header.c
> @@ -7,6 +7,8 @@
>  #include 
>  #include 
>  #include "header.h"
> +#include "metricgroup.h"
> +#include 
>
>  #define mfspr(rn)   ({unsigned long rval; \
>  asm volatile("mfspr %0," __stringify(rn) \
> @@ -44,3 +46,9 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
>
> return bufp;
>  }
> +
> +int arch_get_runtimeparam(void)
> +{
> +   int count;
> +   return sysfs__read_int("/devices/hv_24x7/interface/sockets", &count) < 0 ? 1 : count;
> +}
> diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
> index ea10fc4412c4..516504cf0ea5 100644
> --- a/tools/perf/tests/expr.c
> +++ b/tools/perf/tests/expr.c
> @@ -10,7 +10,7 @@ s

Re: [PATCH V2] tools/perf: Add includes for detected configs in Makefile.perf

2023-09-08 Thread Ian Rogers
On Fri, Sep 8, 2023 at 7:51 AM Athira Rajeev
 wrote:
>
> Makefile.perf uses "CONFIG_*" checks in the code. Example the config
> for libtraceevent is used to set PYTHON_EXT_SRCS
>
> ifeq ($(CONFIG_LIBTRACEEVENT),y)
>   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
> else
>   PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' 
> util/python-ext-sources)
> endif
>
> But this is not picking the value for CONFIG_LIBTRACEEVENT that is
> set using the settings in Makefile.config. Include the file
> ".config-detected" so that make will use the system detected
> configuration in the CONFIG checks. This will fix issues that
> could arise when other "CONFIG_*" checks are added to Makefile.perf
> in future as well.
>
> Signed-off-by: Athira Rajeev 
> ---
> Changelog:
>  v1 -> v2:
>  Added $(OUTPUT) prefix to config-detected as pointed
>  out by Ian
>
>  tools/perf/Makefile.perf | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 37af6df7b978..66b9dc61c32f 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
>
>  python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) 
> $(OUTPUT)python/perf*.so
>
> +# Use the detected configuration
> +include $(OUTPUT).config-detected

The Makefile.build version also has a "-include" rather than "include"
in case the .config-detected file is missing. In Makefile.perf
including Makefile.config is optional:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/Makefile.perf?h=perf-tools-next#n253

and there are certain targets where we don't include it:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/Makefile.perf?h=perf-tools-next#n200

So playing devil's advocate, if we ran "make clean" we'd remove
.config-detected:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/Makefile.perf?h=perf-tools-next#n1131

If we then ran "make tags" then we wouldn't include Makefile.config
and so .config-detected wouldn't be generated and I think the build
would fail due to a missing include here. So I think this should be
-include or perhaps:

ifeq ($(config),1)
include $(OUTPUT).config-detected
endif
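
The difference between the two directives can be checked outside the perf
tree. A minimal sketch, assuming GNU make is installed; try_include is a
hypothetical helper that writes a throwaway Makefile whose included file
is deliberately never created:

```shell
# Sketch only: compare "include" with "-include" when the file is missing.
# try_include is a hypothetical helper; it needs make on PATH.
try_include() {
	directive=$1	# "include" or "-include"
	dir=$(mktemp -d) || return 1

	# $dir/.config-detected does not exist, as after "make clean".
	printf '%s %s/.config-detected\nall:\n\t@:\n' "$directive" "$dir" \
		> "$dir/Makefile"

	make -s -f "$dir/Makefile" all > /dev/null 2>&1
	status=$?
	rm -rf "$dir"
	return "$status"
}
```

With this, try_include -include should succeed, while try_include include
should return make's non-zero exit status because the file can't be found
and there is no rule to generate it.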

Thanks,
Ian

> +
>  ifeq ($(CONFIG_LIBTRACEEVENT),y)
>PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>  else
> --
> 2.31.1
>


Re: [PATCH V3] tools/perf: Add includes for detected configs in Makefile.perf

2023-09-12 Thread Ian Rogers
On Mon, Sep 11, 2023 at 11:38 PM Athira Rajeev
 wrote:
>
> Makefile.perf uses "CONFIG_*" checks in the code. Example the config
> for libtraceevent is used to set PYTHON_EXT_SRCS
>
> ifeq ($(CONFIG_LIBTRACEEVENT),y)
>   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
> else
>   PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' 
> util/python-ext-sources)
> endif
>
> But this is not picking the value for CONFIG_LIBTRACEEVENT that is
> set using the settings in Makefile.config. Include the file
> ".config-detected" so that make will use the system detected
> configuration in the CONFIG checks. This will fix issues that
> could arise when other "CONFIG_*" checks are added to Makefile.perf
> in future as well.
>
> Signed-off-by: Athira Rajeev 

Reviewed-by: Ian Rogers 

Thanks,
Ian

> ---
> Changelog:
> v2 -> v3:
> Added -include since in some cases make clean or make
> will fail when config is not included and if config-detected
> file is not present.
>
> v1 -> v2:
> Added $(OUTPUT) prefix to config-detected as pointed
> out by Ian
>
>  tools/perf/Makefile.perf | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 37af6df7b978..f6fdc2d5a92f 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
>
>  python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) 
> $(OUTPUT)python/perf*.so
>
> +# Use the detected configuration
> +-include $(OUTPUT).config-detected
> +
>  ifeq ($(CONFIG_LIBTRACEEVENT),y)
>PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>  else
> --
> 2.31.1
>


Re: [PATCH 17/17] perf tests task_analyzer: skip tests if no libtraceevent support

2023-07-12 Thread Ian Rogers
On Tue, Jun 13, 2023 at 10:04 AM Athira Rajeev
 wrote:
>
> From: Aditya Gupta 
>
> Test "perf script task-analyzer tests" fails in environment with missing
> libtraceevent support, as perf record fails to create the perf.data
> file, which further tests depend on.
>
> Instead, when perf is not compiled with libtraceevent support, skip those
> tests instead of failing them, by checking the output of `perf
> record --dry-run` to see if it prints the error "libtraceevent is
> necessary for tracepoint support"
>
> For the following output, perf compiled with: `make NO_LIBTRACEEVENT=1`
>
> Before the patch:
>
> 108: perf script task-analyzer tests :
> test child forked, pid 24105
> failed to open perf.data: No such file or directory  (try 'perf record' first)
> FAIL: "invokation of perf script report task-analyzer command failed" Error 
> message: ""
> FAIL: "test_basic" Error message: "Failed to find required string:'Comm'."
> failed to open perf.data: No such file or directory  (try 'perf record' first)
> FAIL: "invokation of perf script report task-analyzer --ns 
> --rename-comms-by-tids 0:random command failed" Error message: ""
> FAIL: "test_ns_rename" Error message: "Failed to find required string:'Comm'."
> failed to open perf.data: No such file or directory  (try 'perf record' first)
> <...>
> perf script task-analyzer tests: FAILED!
>
> With this patch, the script instead returns 2 signifying SKIP, and after
> the patch:
>
> 108: perf script task-analyzer tests :
> test child forked, pid 26010
> libtraceevent is necessary for tracepoint support
> WARN: Skipping tests. No libtraceevent support
> test child finished with -2
> perf script task-analyzer tests: Skip
>
> Fixes: e8478b84d6ba ("perf test: add new task-analyzer tests")
> Signed-off-by: Athira Rajeev 
> Signed-off-by: Kajol Jain 
> Signed-off-by: Aditya Gupta 
> ---
>  tools/perf/tests/shell/test_task_analyzer.sh | 18 ++
>  1 file changed, 18 insertions(+)
>
> diff --git a/tools/perf/tests/shell/test_task_analyzer.sh 
> b/tools/perf/tests/shell/test_task_analyzer.sh
> index b094eeb3bf66..59785dfc11f8 100755
> --- a/tools/perf/tests/shell/test_task_analyzer.sh
> +++ b/tools/perf/tests/shell/test_task_analyzer.sh
> @@ -44,9 +44,20 @@ find_str_or_fail() {
> fi
>  }
>
> +# check if perf is compiled with libtraceevent support
> +skip_no_probe_record_support() {
> +   perf record -e "sched:sched_switch" -a -- sleep 1 2>&1 | grep 
> "libtraceevent is necessary for tracepoint support" && return 2

Fwiw, another way to detect build options used in other shell tests is:
perf version --build-options | grep HAVE_LIBTRACEEVENT | grep -q OFF && return 2
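
Wrapped in a function, that check could look roughly like the sketch
below. libtraceevent_disabled is a made-up name, and it reads the build
options from stdin so it can be tried without a perf binary; it assumes
the feature line looks like "libtraceevent: [ OFF ]  # HAVE_LIBTRACEEVENT".

```shell
# Sketch only: libtraceevent_disabled is a hypothetical helper.
# It reads "perf version --build-options" style output on stdin and
# succeeds only when the HAVE_LIBTRACEEVENT line is reported as OFF.
libtraceevent_disabled() {
	grep HAVE_LIBTRACEEVENT | grep -q OFF
}

# Possible use in a shell test (2 signifies SKIP):
#   perf version --build-options | libtraceevent_disabled && exit 2
```

Note that a missing HAVE_LIBTRACEEVENT line is treated as "not disabled",
which matches the one-liner above.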

Thanks,
Ian

> +   return 0
> +}
> +
>  prepare_perf_data() {
> # 1s should be sufficient to catch at least some switches
> perf record -e sched:sched_switch -a -- sleep 1 > /dev/null 2>&1
> +   # check if perf data file got created in above step.
> +   if [ ! -e "perf.data" ]; then
> +   printf "FAIL: perf record failed to create \"perf.data\" \n"
> +   return 1
> +   fi
>  }
>
>  # check standard inkvokation with no arguments
> @@ -134,6 +145,13 @@ test_csvsummary_extended() {
> find_str_or_fail "Out-Out;" csvsummary "${FUNCNAME[0]}"
>  }
>
> +skip_no_probe_record_support
> +err=$?
> +if [ $err -ne 0 ]; then
> +   echo "WARN: Skipping tests. No libtraceevent support"
> +   cleanup
> +   exit $err
> +fi
>  prepare_perf_data
>  test_basic
>  test_ns_rename
> --
> 2.39.1
>


Re: [PATCH V2 00/26] tools/perf: Fix shellcheck coding/formatting issues of perf tool shell scripts

2023-07-19 Thread Ian Rogers
On Tue, Jul 18, 2023 at 11:17 PM kajoljain  wrote:
>
> Hi,
>
> Looking for review comments on this patchset.
>
> Thanks,
> Kajol Jain
>
>
> On 7/9/23 23:57, Athira Rajeev wrote:
> > Patchset covers a set of fixes for coding/formatting issues observed while
> > running shellcheck tool on the perf shell scripts.
> >
> > This cleanup is a pre-requisite to include a build option for shellcheck
> > discussed here: https://www.spinics.net/lists/linux-perf-users/msg25553.html
> > First set of patches were posted here:
> > https://lore.kernel.org/linux-perf-users/53b7d823-1570-4289-a632-2205ee2b5...@linux.vnet.ibm.com/T/#t
> >
> > This patchset covers remaining set of shell scripts which needs
> > fix. Patch 1 is resubmission of patch 6 from the initial series.
> > Patch 15, 16 and 22 touches code from tools/perf/trace/beauty.
> > Other patches are fixes for scripts from tools/perf/tests.
> >
> > The shellcheck is run for severity level for errors and warnings.
> > Command used:
> >
> > # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S 
> > warning $F; done
> > # echo $?
> > 0
> >

I don't see anything objectionable in the changes so for the series:
Acked-by: Ian Rogers 

Some thoughts:
 - Adding "#!/bin/bash" to scripts in tools/perf/tests/lib - I think
we didn't do this to avoid these being included as tests. There are
now extra checks when finding shell tests, so I can imagine doing this
isn't a regression but just a heads up.
 - I think James' comment was addressed:
https://lore.kernel.org/linux-perf-users/334989bf-5501-494c-f246-81878fd2f...@arm.com/
 - Why aren't these changes being mailed to LKML? The wider community
on LKML have thoughts on shell scripts, plus it makes the changes miss
my mail filters.
 - Can we automate this testing into the build? For example, following
a similar kernel build pattern we run a python test and make the log
output a requirement here:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/Build?h=perf-tools-next#n30
   I think we can translate:
for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
   into a rule in make for log files that are then a dependency on the
perf binary. We can then run shellcheck in parallel during the build and
avoid regressions. We probably need a CONFIG_SHELLCHECK feature check
in the build so that a missing shellcheck doesn't break the build.
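
Before any make integration, the "don't break when shellcheck is missing"
part can be sketched in shell. This is only an illustration, not the
proposed build rule: run_shellcheck is a hypothetical helper, and the
skip-with-a-message behaviour is an assumption about what a
CONFIG_SHELLCHECK check would do.

```shell
# Sketch only: run_shellcheck is a hypothetical helper.
# It checks every executable *.sh file under a directory, and skips
# quietly (exit 0) when shellcheck is not installed.
run_shellcheck() {
	dir=$1
	if ! command -v shellcheck > /dev/null 2>&1; then
		echo "shellcheck not found, skipping" >&2
		return 0
	fi
	# find returns non-zero if any shellcheck invocation fails.
	find "$dir" -perm -o=x -name '*.sh' -exec shellcheck -S warning {} +
}
```

Something like run_shellcheck tests/shell would then mirror the loop
quoted above, without aborting on systems that lack the tool.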

Thanks,
Ian

> > Changelog:
> > v1 -> v2:
> >   - Rebased on top of perf-tools-next from:
> >   
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf-tools-next
> >
> >   - Fixed shellcheck errors and warnings reported for newly
> > added changes from perf-tools-next branch
> >
> >   - Addressed review comment from James clark for patch
> > number 13 from V1. The changes in patch 13 were not necessary
> > since the file "tests/shell/lib/coresight.sh" is sourced from
> > other test files.
> >
> > Akanksha J N (1):
> >   tools/perf/tests: Fix shellcheck warnings for
> > trace+probe_vfs_getname.sh
> >
> > Athira Rajeev (14):
> >   tools/perf/tests: fix test_arm_spe_fork.sh signal case issues
> >   tools/perf/tests: Fix unused variable references in
> > stat+csv_summary.sh testcase
> >   tools/perf/tests: fix shellcheck warning for
> > test_perf_data_converter_json.sh testcase
> >   tools/perf/tests: Fix shellcheck issue for stat_bpf_counters.sh
> > testcase
> >   tools/perf/tests: Fix shellcheck issues in
> > tests/shell/stat+shadow_stat.sh tetscase
> >   tools/perf/tests: Fix shellcheck warnings for
> > thread_loop_check_tid_10.sh
> >   tools/perf/tests: Fix shellcheck warnings for unroll_loop_thread_10.sh
> >   tools/perf/tests: Fix shellcheck warnings for lib/probe_vfs_getname.sh
> >   tools/perf/tests: Fix the shellcheck warnings in lib/waiting.sh
> >   tools/perf/trace: Fix x86_arch_prctl.sh to address shellcheck warnings
> >   tools/perf/arch/x86: Fix syscalltbl.sh to address shellcheck warnings
> >   tools/perf/tests/shell: Fix the shellcheck warnings in
> > record+zstd_comp_decomp.sh
> >   tools/perf/tests/shell: Fix shellcheck warning for stat+std_output.sh
> > testcase
> >   tools/perf/tests: Fix shellcheck warning for stat+std_output.sh
> > testcase
> >
> > Kajol Jain (11):
> >   tools/perf/tests: Fix shellcheck warning for probe_vfs_getname.sh
> > testcase
> >   tools/perf/tests: Fix shellcheck warning for record_offcpu.sh test

Re: [PATCH 1/1] perf tests task_analyzer: Check perf build options for libtraceevent support

2023-07-28 Thread Ian Rogers
On Fri, Jul 28, 2023 at 7:54 AM Arnaldo Carvalho de Melo
 wrote:
>
> Em Tue, Jul 25, 2023 at 11:46:49AM +0530, Aditya Gupta escreveu:
> > Currently we depend on output of 'perf record -e "sched:sched_switch"', to
> > check whether perf was built with libtraceevent support.
> >
> > Instead, a more straightforward approach can be to check the build options,
> > using 'perf version --build-options', to check for libtraceevent support.
> >
> > When perf is compiled WITHOUT libtraceevent ('make NO_LIBTRACEEVENT=1'),
> > 'perf version --build-options' outputs (output trimmed):
> >
> >...
> >  libtraceevent: [ OFF ]  # HAVE_LIBTRACEEVENT
> >...
> >
> > While, when perf is compiled WITH libtraceevent,
> >
> > 'perf version --build-options' outputs:
> >
> > ...
> >  libtraceevent: [ on ]  # HAVE_LIBTRACEEVENT
> >...
> >
> > Suggested-by: Ian Rogers 
> > Signed-off-by: Aditya Gupta 
> > ---
> >
> >  tools/perf/tests/shell/test_task_analyzer.sh | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/tests/shell/test_task_analyzer.sh 
> > b/tools/perf/tests/shell/test_task_analyzer.sh
> > index 0095abbe20ca..a28d784987b4 100755
> > --- a/tools/perf/tests/shell/test_task_analyzer.sh
> > +++ b/tools/perf/tests/shell/test_task_analyzer.sh
> > @@ -52,7 +52,7 @@ find_str_or_fail() {
> >
> >  # check if perf is compiled with libtraceevent support
> >  skip_no_probe_record_support() {
> > - perf record -e "sched:sched_switch" -a -- sleep 1 2>&1 | grep 
> > "libtraceevent is necessary for tracepoint support" && return 2
> > + perf version --build-options | grep HAVE_LIBTRACEEVENT | grep -q OFF 
> > && return 2
> >   return 0
>
> I'll apply this, but please consider adding a:
>
> perf build --has libtraceevent

That's a nice idea. You mean add a script like perf-archive.sh?
Perhaps this flag should be supported by perf version instead.

Thanks,
Ian

> subcommand to have that query made more compact and to avoid the two
> extra grep.
>
> BTW, I'll change that to:
>
> [acme@quaco perf-tools-next]$ perf version --build-options | grep " on .* 
> HAVE_LIBTRACEEVENT"
>  libtraceevent: [ on  ]  # HAVE_LIBTRACEEVENT
> [acme@quaco perf-tools-next]$
>
> replacing "on" with OFF, so that we have just one grep.
>
> Thanks,
>
> - Arnaldo
>
> >  }
> >
> > --
> > 2.41.0
> >
>
> --
>
> - Arnaldo


Re: [PATCH] tools/perf: Fix bpf__probe to set bpf_prog_type type only if differs from the desired one

2023-08-17 Thread Ian Rogers
On Thu, Aug 17, 2023 at 10:35 AM Athira Rajeev
 wrote:
>
>
>
> > On 07-Aug-2023, at 11:07 AM, Sachin Sant  wrote:
> >
> >
> >
> >> On 07-Aug-2023, at 10:22 AM, Athira Rajeev  
> >> wrote:
> >>
> >> The test "BPF prologue generation" fails as below:
> >>
> >>  Writing event: p:perf_bpf_probe/func _text+10423200 f_mode=+20(%gpr3):x32 
> >> offset=%gpr4:s64 orig=%gpr5:s32
> >>  In map_prologue, ntevs=1
> >>  mapping[0]=0
> >>  libbpf: prog 'bpf_func__null_lseek': BPF program load failed: Permission 
> >> denied
> >>  libbpf: prog 'bpf_func__null_lseek': -- BEGIN PROG LOAD LOG --
> >>  btf_vmlinux is malformed
> >>  reg type unsupported for arg#0 function bpf_func__null_lseek#5
> >>  0: R1=ctx(off=0,imm=0) R10=fp0
> >>  ;
> >>  0: (57) r3 &= 2
> >>  R3 !read_ok
> >>  processed 1 insns (limit 100) max_states_per_insn 0 total_states 0 
> >> peak_states 0 mark_read 0
> >>  -- END PROG LOAD LOG --
> >>  libbpf: prog 'bpf_func__null_lseek': failed to load: -13
> >>  libbpf: failed to load object '[bpf_prologue_test]'
> >>  bpf: load objects failed: err=-13: (Permission denied)
> >>  Failed to add events selected by BPF
> >>
> >> This failure occurs after this commit:
> >> commit d6e6286a12e7 ("libbpf: disassociate section handler
> >> on explicit bpf_program__set_type() call")
> >>
> >> With this change, the SEC_DEF handler in libbpf, which is determined
> >> initially based on the program's SEC(), is set to NULL. The change
> >> is made because sec_def is not valid when user sets the program
> >> type with bpf_program__set_type function. This commit also fixed
> >> bpf_prog_test_load() helper in selftests/bpf to force-set program
> >> type only if it differs from the desired one.
> >>
> >> The "bpf__probe" function in util/bpf-loader.c, also calls
> >> bpf_program__set_type to set bpf_prog_type. Add similar fix in
> >> here as well to avoid setting sec_def to NULL.
> >>
> >> Reported-by: Sachin Sant 
> >> Signed-off-by: Athira Rajeev 
> >> ---
> >
> > Thanks Athira for the fix.
> > With this patch applied perf BPF prologue sub test works correctly.
> >
> > 42: BPF filter :
> > 42.1: Basic BPF filtering: Ok
> > 42.2: BPF pinning  : Ok
> > 42.3: BPF prologue generation  : Ok
> >
> > Tested-by: Sachin Sant 
> >
> > Can you please use the above mentioned id(without vnet) in the reported-by ?
> >
> > - Sachin
>
> Hi All,
>
> Looking for review comments on this patch
>
> Athira

Hi,

the patch set:
https://lore.kernel.org/lkml/20230810184853.2860737-1-irog...@google.com/
removes the affected code/test.

Thanks,
Ian


Re: [PATCH] perf test: Fix parse-events tests to skip parametrized events

2023-08-17 Thread Ian Rogers
On Sun, Aug 6, 2023 at 9:50 PM Athira Rajeev
 wrote:
>
> Testcase "Parsing of all PMU events from sysfs" parse events for
> all PMUs, and not just cpu. In case of powerpc, the PowerVM
> environment supports events from hv_24x7 and hv_gpci PMU which
> is of example format like below:
>
> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
> - hv_gpci/event,partition_id=?/
>
> The value for "?" needs to be filled in depending on system
> configuration. It is better to skip these parametrized events
> in this test as it is done in:
> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
> parametrized events")' which handled a similar instance with
> "all PMU test".

I'd say this is different, the "?" is really ugly. On other
architectures the problem is solved by having >1 PMU, domain and core
can be meta-data associated with the PMU. If we want to aggregate
based on domain and core in the perf tool, it will need a different
way of solving the problem for Power. Skipping the test is just
pushing this problem down the road.

> Fix parse-events test to skip parametrized events since
> it needs proper setup of the parameters.
>
> Signed-off-by: Athira Rajeev 
> ---
>  tools/perf/tests/parse-events.c | 32 
>  1 file changed, 32 insertions(+)
>
> diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
> index b2f82847e4c3..605373c7d005 100644
> --- a/tools/perf/tests/parse-events.c
> +++ b/tools/perf/tests/parse-events.c
> @@ -2504,7 +2504,11 @@ static int test__pmu_events(struct test_suite *test 
> __maybe_unused, int subtest
> while ((pmu = perf_pmus__scan(pmu)) != NULL) {
> struct stat st;
> char path[PATH_MAX];
> +   char pmu_event[PATH_MAX + 256];

By definition paths can't be longer than PATH_MAX.

> +   char *buf = NULL;
> +   FILE *file;
> struct dirent *ent;
> +   size_t len = 0;
> DIR *dir;
> int err;
>
> @@ -2528,11 +2532,39 @@ static int test__pmu_events(struct test_suite *test 
> __maybe_unused, int subtest
> struct evlist_test e = { .name = NULL, };
> char name[2 * NAME_MAX + 1 + 12 + 3];
> int test_ret;
> +   int skip = 0;

Prefer a boolean. Prefer is_event_parameterized over skip to make the
variable name more intention-revealing.

>
> /* Names containing . are special and cannot be used 
> directly */
> if (strchr(ent->d_name, '.'))
> continue;
>
> +   /* exclude parametrized ones (name contains '?') */
> +   snprintf(pmu_event, PATH_MAX + 256, "%s%s", path, 
> ent->d_name);

Use sizeof(pmu_event) rather than "PATH_MAX + 256".

Thanks,
Ian

> +   file = fopen(pmu_event, "r");
> +   if (!file) {
> +   pr_debug("can't open pmu event file for 
> '%s'\n", ent->d_name);
> +   ret = combine_test_results(ret, TEST_FAIL);
> +   continue;
> +   }
> +
> +   if (getline(&buf, &len, file) < 0) {
> +   pr_debug(" pmu event: %s is a null event\n", 
> ent->d_name);
> +   ret = combine_test_results(ret, TEST_FAIL);
> +   continue;
> +   }
> +
> +   if (strchr(buf, '?'))
> +   skip = 1;
> +
> +   free(buf);
> +   buf = NULL;
> +   fclose(file);
> +
> +   if (skip == 1) {
> +   pr_debug("skipping parametrized PMU event: %s 
> which contains ?\n", pmu_event);
> +   continue;
> +   }
> +
> snprintf(name, sizeof(name), "%s/event=%s/u", 
> pmu->name, ent->d_name);
>
> e.name  = name;
> --
> 2.31.1
>


Re: [PATCH] perf test: Skip perf bench breakpoint run if no breakpoints available

2023-08-29 Thread Ian Rogers
On Wed, Aug 23, 2023 at 4:00 AM Naveen N Rao  wrote:
>
> Hi Kajol,
>
> On Wed Aug 23, 2023 at 1:21 PM IST, Kajol Jain wrote:
> > Based on commit 7d54a4acd8c1 ("perf test: Skip watchpoint
> > tests if no watchpoints available"), hardware breakpoints
> > are not available for power9 platform and because of that
> > perf bench breakpoint run fails on power9 platform.
> > Add code to check for the return value of perf_event_open()
> > in breakpoint run and skip the perf bench breakpoint run,
> > if hardware breakpoints are not available.
> >
> > Result on power9 system before patch changes:
> > [command]# perf bench breakpoint thread
> > perf_event_open: No such device
> >
> > Result on power9 system after patch changes:
> > [command]# ./perf bench breakpoint thread
> > Skipping perf bench breakpoint thread: No hardware support
> >
> > Reported-by: Disha Goel 
> > Signed-off-by: Kajol Jain 
> > ---
> >  tools/perf/bench/breakpoint.c | 24 +---
> >  1 file changed, 21 insertions(+), 3 deletions(-)
>
> Thanks for fixing this to not report an error. A minor nit below, but
> otherwise:
> Acked-by: Naveen N Rao 
>
> >
> > diff --git a/tools/perf/bench/breakpoint.c b/tools/perf/bench/breakpoint.c
> > index 41385f89ffc7..dfd18f5db97d 100644
> > --- a/tools/perf/bench/breakpoint.c
> > +++ b/tools/perf/bench/breakpoint.c
> > @@ -47,6 +47,7 @@ struct breakpoint {
> >  static int breakpoint_setup(void *addr)
> >  {
> >   struct perf_event_attr attr = { .size = 0, };
> > + int fd;
> >
> >   attr.type = PERF_TYPE_BREAKPOINT;
> >   attr.size = sizeof(attr);
> > @@ -56,7 +57,12 @@ static int breakpoint_setup(void *addr)
> >   attr.bp_addr = (unsigned long)addr;
> >   attr.bp_type = HW_BREAKPOINT_RW;
> >   attr.bp_len = HW_BREAKPOINT_LEN_1;
> > - return syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
> > + fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
> > +
> > + if (fd < 0)
> > + fd = -errno;
> > +
> > + return fd;
> >  }
> >
> >  static void *passive_thread(void *arg)
> > @@ -122,8 +128,14 @@ int bench_breakpoint_thread(int argc, const char 
> > **argv)
> >
> >   for (i = 0; i < thread_params.nbreakpoints; i++) {
> >   breakpoints[i].fd = breakpoint_setup(&breakpoints[i].watched);
> > - if (breakpoints[i].fd == -1)
> > +
> > + if (breakpoints[i].fd < 0) {
> > + if (breakpoints[i].fd == -ENODEV) {
> > + printf("Skipping perf bench breakpoint 
> > thread: No hardware support\n");
> > + return 0;
>
> Should we instead do 'exit(0)' here to stop further benchmarks? Perhaps:
>   err(EXIT_SUCCESS, "Skipping perf bench breakpoint thread: No hardware 
> support");
>
> EXIT_SUCCESS looks weird, but should help document that this is not an
> error.

In tools/perf/tests/tests.h is:

enum {
   TEST_OK   =  0,
   TEST_FAIL = -1,
   TEST_SKIP = -2,
};

So I think the EXIT_SUCCESS/0 should really be TEST_OK, but I think it
would be clearer if these cases were TEST_SKIP.

Thanks,
Ian

> > + }
> >   exit((perror("perf_event_open"), EXIT_FAILURE));
> > + }
> >   }
> >   gettimeofday(&start, NULL);
> >   for (i = 0; i < thread_params.nparallel; i++) {
> > @@ -196,8 +208,14 @@ int bench_breakpoint_enable(int argc, const char 
> > **argv)
> >   exit(EXIT_FAILURE);
> >   }
> >   fd = breakpoint_setup(&watched);
> > - if (fd == -1)
> > +
> > + if (fd < 0) {
> > + if (fd == -ENODEV) {
> > + printf("Skipping perf bench breakpoint enable: No 
> > hardware support\n");
> > + return 0;
>
> Here too.
>
> - Naveen
>
> > + }
> >   exit((perror("perf_event_open"), EXIT_FAILURE));
> > + }
> >   nthreads = enable_params.npassive + enable_params.nactive;
> >   threads = calloc(nthreads, sizeof(threads[0]));
> >   if (!threads)
>


Re: [PATCH V2] perf test: Fix parse-events tests to skip parametrized events

2023-09-07 Thread Ian Rogers
On Thu, Sep 7, 2023 at 9:59 AM Athira Rajeev
 wrote:
>
> Testcase "Parsing of all PMU events from sysfs" parse events for
> all PMUs, and not just cpu. In case of powerpc, the PowerVM
> environment supports events from hv_24x7 and hv_gpci PMU which
> is of example format like below:
>
> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
> - hv_gpci/event,partition_id=?/
>
> The value for "?" needs to be filled in depending on system
> configuration. It is better to skip these parametrized events
> in this test as it is done in:
> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
> parametrized events")' which handled a similar instance with
> "all PMU test".
>
> Fix parse-events test to skip parametrized events since
> it needs proper setup of the parameters.
>
> Signed-off-by: Athira Rajeev 

Tested-by: Ian Rogers 

Thanks,
Ian

> ---
> Changelog:
> v1 -> v2:
>  Addressed review comments from Ian. Updated size of
>  pmu event name variable and changed bool name which is
>  used to skip the test.
>
>  tools/perf/tests/parse-events.c | 38 +
>  1 file changed, 38 insertions(+)
>
> diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
> index 658fb9599d95..1ecaeceb69f8 100644
> --- a/tools/perf/tests/parse-events.c
> +++ b/tools/perf/tests/parse-events.c
> @@ -2514,9 +2514,14 @@ static int test__pmu_events(struct test_suite *test 
> __maybe_unused, int subtest
> while ((pmu = perf_pmus__scan(pmu)) != NULL) {
> struct stat st;
> char path[PATH_MAX];
> +   char pmu_event[PATH_MAX];
> +   char *buf = NULL;
> +   FILE *file;
> struct dirent *ent;
> +   size_t len = 0;
> DIR *dir;
> int err;
> +   int n;
>
> snprintf(path, PATH_MAX, 
> "%s/bus/event_source/devices/%s/events/",
> sysfs__mountpoint(), pmu->name);
> @@ -2538,11 +2543,44 @@ static int test__pmu_events(struct test_suite *test 
> __maybe_unused, int subtest
> struct evlist_test e = { .name = NULL, };
> char name[2 * NAME_MAX + 1 + 12 + 3];
> int test_ret;
> +   bool is_event_parameterized = 0;
>
> /* Names containing . are special and cannot be used 
> directly */
> if (strchr(ent->d_name, '.'))
> continue;
>
> +   /* exclude parametrized ones (name contains '?') */
> +   n = snprintf(pmu_event, sizeof(pmu_event), "%s%s", 
> path, ent->d_name);
> +   if (n >= PATH_MAX) {
> +   pr_err("pmu event name crossed PATH_MAX(%d) 
> size\n", PATH_MAX);
> +   continue;
> +   }
> +
> +   file = fopen(pmu_event, "r");
> +   if (!file) {
> +   pr_debug("can't open pmu event file for 
> '%s'\n", ent->d_name);
> +   ret = combine_test_results(ret, TEST_FAIL);
> +   continue;
> +   }
> +
> +   if (getline(&buf, &len, file) < 0) {
> +   pr_debug(" pmu event: %s is a null event\n", 
> ent->d_name);
> +   ret = combine_test_results(ret, TEST_FAIL);
> +   continue;
> +   }
> +
> +   if (strchr(buf, '?'))
> +   is_event_parameterized = 1;
> +
> +   free(buf);
> +   buf = NULL;
> +   fclose(file);
> +
> +   if (is_event_parameterized == 1) {
> +   pr_debug("skipping parametrized PMU event: %s 
> which contains ?\n", pmu_event);
> +   continue;
> +   }
> +
> snprintf(name, sizeof(name), "%s/event=%s/u", 
> pmu->name, ent->d_name);
>
> e.name  = name;
> --
> 2.31.1
>


Re: [PATCH] tools/perf: Add includes for detected configs in Makefile.perf

2023-09-07 Thread Ian Rogers
On Thu, Sep 7, 2023 at 10:19 AM Athira Rajeev
 wrote:
>
> Makefile.perf uses "CONFIG_*" checks in the code. For example, the config
> for libtraceevent is used to set PYTHON_EXT_SRCS
>
> ifeq ($(CONFIG_LIBTRACEEVENT),y)
>   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
> else
>   PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' util/python-ext-sources)
> endif
>
> But this is not picking the value for CONFIG_LIBTRACEEVENT that is
> set using the settings in Makefile.config. Include the file
> ".config-detected" so that make will use the system detected
> configuration in the CONFIG checks. This will fix issues that
> could arise when other "CONFIG_*" checks are added to Makefile.perf
> in future as well.
>
> Signed-off-by: Athira Rajeev 
> ---
>  tools/perf/Makefile.perf | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 37af6df7b978..6764b0e156f4 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
>
>  python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) 
> $(OUTPUT)python/perf*.so
>
> +# Use the detected configuration
> +include .config-detected

Good catch! I think it should look like:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/build/Makefile.build?h=perf-tools-next#n40

Thanks,
Ian

> +
>  ifeq ($(CONFIG_LIBTRACEEVENT),y)
>PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>  else
> --
> 2.31.1
>


Re: [PATCH 0/3] Fix for shellcheck issues with version "0.6"

2023-09-07 Thread Ian Rogers
On Thu, Sep 7, 2023 at 10:17 AM Athira Rajeev
 wrote:
>
> From: root 
>
> shellcheck was run on perf tool shell scripts as a prerequisite
> to include a build option for shellcheck discussed here:
> https://www.spinics.net/lists/linux-perf-users/msg25553.html
>
> And fixes were added for the coding/formatting issues in
> two patchsets:
> https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
> https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/
>
> Three additional issues are observed with shellcheck "0.6" and
> this patchset covers those. With this patchset,
>
> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
> # echo $?
> 0
>
> Athira Rajeev (3):
>   tests/shell: Fix shellcheck SC1090 to handle the location of sourced
> files
>   tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh
> tetscase
>   tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts

Series:
Tested-by: Ian Rogers 

Thanks,
Ian

>  tools/perf/tests/shell/coresight/asm_pure_loop.sh| 4 
>  tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh | 4 
>  tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 4 
>  tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 4 
>  tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh| 4 
>  tools/perf/tests/shell/probe_vfs_getname.sh  | 2 ++
>  tools/perf/tests/shell/record+probe_libc_inet_pton.sh| 2 ++
>  tools/perf/tests/shell/record+script_probe_vfs_getname.sh| 2 ++
>  tools/perf/tests/shell/record.sh | 1 +
>  tools/perf/tests/shell/stat+csv_output.sh| 1 +
>  tools/perf/tests/shell/stat+csv_summary.sh   | 4 ++--
>  tools/perf/tests/shell/stat+shadow_stat.sh   | 4 ++--
>  tools/perf/tests/shell/stat+std_output.sh| 1 +
>  tools/perf/tests/shell/test_intel_pt.sh  | 1 +
>  tools/perf/tests/shell/trace+probe_vfs_getname.sh| 1 +
>  15 files changed, 35 insertions(+), 4 deletions(-)
>
> --
> 2.31.1
>


Re: [PATCH 1/2] tools/perf/tests: Fix string substitutions in build id test

2022-09-22 Thread Ian Rogers
On Thu, Sep 22, 2022 at 12:15 PM Arnaldo Carvalho de Melo
 wrote:
>
> Em Wed, Sep 21, 2022 at 10:38:38PM +0530, Athira Rajeev escreveu:
> > The perf test named “build id cache operations” skips with below
> > error on some distros:
>
> I wonder if we shouldn't instead state that bash is needed?
>
> ⬢[acme@toolbox perf-urgent]$ head -1 tools/perf/tests/shell/*.sh | grep ^#
> #!/bin/sh
> #!/bin/bash
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/bash
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/bash
> #!/bin/sh
> #!/bin/bash
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> #!/bin/sh
> ⬢[acme@toolbox perf-urgent]$
>
> Opinions?

+1 to bash. Perhaps python longer term?  The XML test output generated
by things like kunit is possible to generate from either bash or
python, but in my experience the python stuff feels better built.

Thanks,
Ian

> - Arnaldo
>
> > <<>>
> >  78: build id cache operations   :
> > test child forked, pid 01
> > WARNING: wine not found. PE binaries will not be run.
> > test binaries: /tmp/perf.ex.SHA1.PKz /tmp/perf.ex.MD5.Gt3 
> > ./tests/shell/../pe-file.exe
> > DEBUGINFOD_URLS=
> > Adding 4abd406f041feb4f10ecde3fc30fd0639e1a91cb /tmp/perf.ex.SHA1.PKz: Ok
> > build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
> > ./tests/shell/buildid.sh: 69: ./tests/shell/buildid.sh: Bad substitution
> > test child finished with -2
> > build id cache operations: Skip
> > <<>>
> >
> > The test script "tests/shell/buildid.sh" uses some of the
> > string substitution ways which are supported in bash, but not in
> > "sh" or other shells. Above error on line number 69 that reports
> > "Bad substitution" is:
> >
> > <<>>
> > link=${build_id_dir}/.build-id/${id:0:2}/${id:2}
> > <<>>
> >
> > Here the way of getting first two characters from id ie,
> > ${id:0:2} and similarly expressions like ${id:2} is not
> > recognised in "sh". So the line errors and instead of
> > hitting failure, the test gets skipped as shown in logs.
> > So the syntax issue causes test not to be executed in
> > such cases. Similarly usage : "${@: -1}" [ to pick last
> > argument passed to a function] in “test_record” doesn’t
> > work in all distros.
> >
> > Fix this by using alternative way with "cut" command
> > to pick "n" characters from the string. Also fix the usage
> > of “${@: -1}” to work in all cases.
> >
> > Another usage in “test_record” is:
> > <<>>
> > ${perf} record --buildid-all -o ${data} $@ &> ${log}
> > <<>>
> >
> > In a POSIX "sh" such as dash, "&>" is parsed as "&" followed by ">",
> > so perf record starts in the background and the data file is not
> > created by the time the "check" function is invoked. Below log shows
> > perf record result getting displayed after the call to "check" function.
> >
> > <<>>
> > running: perf record /tmp/perf.ex.SHA1.EAU
> > build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
> > link: 
> > /tmp/perf.debug.mLT/.build-id/4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb
> > failed: link 
> > /tmp/perf.debug.mLT/.build-id/4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb 
> > does not exist
> > test child finished with -1
> > build id cache operations: FAILED!
> > root@machine:~/athira/linux/tools/perf# Couldn't synthesize bpf events.
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.010 MB /tmp/perf.data.bFF ]
> > <<>>
> >
> > Fix this by redirecting output instead of using “&” which
> > starts the command in background.
> >
> > Signed-off-by: Athira Rajeev 
> > ---
> >  tools/perf/tests/shell/buildid.sh | 16 +---
> >  1 file changed, 9 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/perf/tests/shell/buildid.sh 
> > b/tools/perf/tests/shell/buildid.sh
> > index f05670d1e39e..3512c4423d48 100755
> > --- a/tools/perf/tests/shell/buildid.sh
> > +++ b/tools/perf/tests/shell/buildid.sh
> > @@ -66,7 +66,7 @@ check()
> >   esac
> >   echo "build id: ${id}"
> >
> > - link=${build_id_dir}/.build-id/${id:0:2}/${id:2}
> > + link=${build_id_dir}/.build-id/$(echo ${id}|cut -c 1-2)/$(echo 
> > ${id}|cut -c 3-)
> >   echo "link: ${link}"
> >
> >   if [ ! -h $link ]; then
> > @@ -74,7 +74,7 @@ check()
> >   exit 1
> >   fi
> >
> > - file=${build_id_dir}/.build-id/${id:0:2}/`readlink ${link}`/elf
> > + file=${build_id_dir}/.build-id/$(echo ${id}|cut -c 1-2)/`readlink 
> > ${link}`/elf
> >   echo "file: ${file}"
> >
> >   if [ ! -x $file ]; then
> > @@ -117,20 +117,22 @@ test_record()
> >  {
> >   data=$(mktemp /tmp/perf.data.XXX)
> >   build_id_dir=$(mktemp -d /tmp/perf.debug.XXX)
> > - log=$(mktemp /tmp/perf.log.XXX)
> > + log_out=$(mktemp /tmp/perf.log.out.XXX)
> > + log_err=$(mktemp /tmp/perf.log.err.XXX)
> >   perf="perf --buildid-dir ${build_id_dir}"
> > + eval last=\${$#}
> >
> >   echo "running: perf record $@"
> > - ${perf} record --buildid-all -o ${data} $@ &

Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value

2022-10-01 Thread Ian Rogers
On Thu, Sep 29, 2022 at 5:56 AM James Clark  wrote:
>
>
>
> On 29/09/2022 09:49, Athira Rajeev wrote:
> >
> >
> >> On 28-Sep-2022, at 9:05 PM, James Clark  wrote:
> >>
> >>
> >>
> >
> > Hi James,
> >
> > Thanks for looking at the patch and sharing review comments.
> >
> >> On 13/09/2022 12:57, Athira Rajeev wrote:
> >>> perf stat includes option to specify aggr_mode to display
> >>> per-socket, per-core, per-die, per-node counter details.
> >>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
> >>> counter values are displayed for each cpu along with "CPU"
> >>> value in one field of the output.
> >>>
> >>> Each of the aggregate mode uses the information fetched
> >>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
> >>
> >> I thought that this wouldn't apply to the cpu field because cpu is
> >> basically interchangeable as an index in cpumap, rather than anything
> >> being read from the topology file.
> >
> > The cpu value is filled in this function:
> >
> > Function : aggr_cpu_id__cpu
> > Code: util/cpumap.c
> >
> >>
> >>> physical_package_id. Utility functions in "cpumap.c" fetches
> >>> this information and populates the socket id, core id, cpu etc.
> >>> If the platform does not expose the topology information,
> >>> these values will be set to -1. Example, in case of powerpc,
> >>> details like physical_package_id is restricted to be exposed
> >>> in pSeries platform. So id.socket, id.core, id.cpu all will
> >>> be set as -1.
> >>>
> >>> In case of displaying socket or die value, there is no check
> >>> done in the "aggr_printout" function to see if it points to
> >>> valid socket id or die. But for displaying "cpu" value, there
> >>> is a check for "if (id.core > -1)". In case of powerpc pSeries
> >>> where detail like physical_package_id is restricted to be
> >>> exposed, id.core will be set to -1. Hence the column or field
> >>> itself for CPU won't be displayed in the output.
> >>>
> >>> Result for per-socket:
> >>>
> >>> <<>>
> >>> perf stat -e branches --per-socket -a true
> >>>
> >>> Performance counter stats for 'system wide':
> >>>
> >>> S-1  32416,851  branches
> >>> <<>>
> >>>
> >>> Here S has -1 in above result. But with -A option which also
> >>> expects CPU in one column in the result, below is observed.
> >>>
> >>> <<>>
> >>> /bin/perf stat -e instructions -A -a true
> >>>
> >>> Performance counter stats for 'system wide':
> >>>
> >>>47,146  instructions
> >>>45,226  instructions
> >>>43,354  instructions
> >>>45,184  instructions
> >>> <<>>
> >>>
> >>> If the cpu id value is pointing to -1 also, it makes sense
> >>> to display the column in the output to replicate the behaviour
> >>> or to be in precedence with other aggr options(like per-socket,
> >>> per-core). Remove the check "id.core" so that CPU field gets
> >>> displayed in the output.
> >>
> >> Why would you want to print -1 out? Seems like the if statement was a
> >> good one to me, otherwise the output looks a bit broken to users. Are
> >> the other aggregation modes even working if -1 is set for socket and
> >> die? Maybe we need to not print -1 in those cases or exit earlier with a
> >> failure.
> >>
> >> The -1 value has a specific internal meaning which is "to not
> >> aggregate". It doesn't mean "not set".
> >
> > Currently, this check is done only for printing cpu value.
> > For socket/die/core values, this check is not done. Pasting an
> > example snippet from a powerpc system ( specifically from pseries platform 
> > where
> > the value is set to -1 )
> >
> > ./perf stat --per-core -a -C 1 true
> >
> >  Performance counter stats for 'system wide':
> >
> > S-1-D-1-C-1  1   1.06 msec cpu-clock
> > #1.018 CPUs utilized
> > S-1-D-1-C-1  1  2  context-switches 
> > #1.879 K/sec
> > S-1-D-1-C-1  1  0  cpu-migrations   
> > #0.000 /sec
> >
> > Here, though the value is -1, we display it, whereas in the case of cpu
> > the first column is empty since we do a check before printing.
> >
> > Example:
> >
> > ./perf stat --per-core -A -C 1 true
> >
> >  Performance counter stats for 'CPU(s) 1':
> >
> >   0.88 msec cpu-clock#1.022 CPUs 
> > utilized
> >  2  context-switches
> >  0  cpu-migrations
> >
> >
> > Not sure whether there are scripts out there that consume the current
> > format, which not displaying -1 may break. That is why we tried removing
> > the check for cpu, matching the other modes like socket, die, core etc.
>
> I wouldn't worry about that because there are json and CSV modes which
> are machine readable, and -1 is already not always displayed. If
> anything this change here is also likely to break parsing by adding -1
> where it wasn't before.
>
> >
> > 

Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value

2022-10-03 Thread Ian Rogers
On Mon, Oct 3, 2022 at 7:03 AM atrajeev  wrote:
>
> On 2022-10-02 05:17, Ian Rogers wrote:
> > On Thu, Sep 29, 2022 at 5:56 AM James Clark 
> > wrote:
> >>
> >>
> >>
> >> On 29/09/2022 09:49, Athira Rajeev wrote:
> >> >
> >> >
> >> >> On 28-Sep-2022, at 9:05 PM, James Clark  wrote:
> >> >>
> >> >>
> >> >>
> >> >
> >> > Hi James,
> >> >
> >> > Thanks for looking at the patch and sharing review comments.
> >> >
> >> >> On 13/09/2022 12:57, Athira Rajeev wrote:
> >> >>> perf stat includes option to specify aggr_mode to display
> >> >>> per-socket, per-core, per-die, per-node counter details.
> >> >>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
> >> >>> counter values are displayed for each cpu along with "CPU"
> >> >>> value in one field of the output.
> >> >>>
> >> >>> Each of the aggregate mode uses the information fetched
> >> >>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
> >> >>
> >> >> I thought that this wouldn't apply to the cpu field because cpu is
> >> >> basically interchangeable as an index in cpumap, rather than anything
> >> >> being read from the topology file.
> >> >
> >> > The cpu value is filled in this function:
> >> >
> >> > Function : aggr_cpu_id__cpu
> >> > Code: util/cpumap.c
> >> >
> >> >>
> >> >>> physical_package_id. Utility functions in "cpumap.c" fetches
> >> >>> this information and populates the socket id, core id, cpu etc.
> >> >>> If the platform does not expose the topology information,
> >> >>> these values will be set to -1. Example, in case of powerpc,
> >> >>> details like physical_package_id is restricted to be exposed
> >> >>> in pSeries platform. So id.socket, id.core, id.cpu all will
> >> >>> be set as -1.
> >> >>>
> >> >>> In case of displaying socket or die value, there is no check
> >> >>> done in the "aggr_printout" function to see if it points to
> >> >>> valid socket id or die. But for displaying "cpu" value, there
> >> >>> is a check for "if (id.core > -1)". In case of powerpc pSeries
> >> >>> where detail like physical_package_id is restricted to be
> >> >>> exposed, id.core will be set to -1. Hence the column or field
> >> >>> itself for CPU won't be displayed in the output.
> >> >>>
> >> >>> Result for per-socket:
> >> >>>
> >> >>> <<>>
> >> >>> perf stat -e branches --per-socket -a true
> >> >>>
> >> >>> Performance counter stats for 'system wide':
> >> >>>
> >> >>> S-1  32416,851  branches
> >> >>> <<>>
> >> >>>
> >> >>> Here S has -1 in above result. But with -A option which also
> >> >>> expects CPU in one column in the result, below is observed.
> >> >>>
> >> >>> <<>>
> >> >>> /bin/perf stat -e instructions -A -a true
> >> >>>
> >> >>> Performance counter stats for 'system wide':
> >> >>>
> >> >>>47,146  instructions
> >> >>>45,226  instructions
> >> >>>43,354  instructions
> >> >>>45,184  instructions
> >> >>> <<>>
> >> >>>
> >> >>> If the cpu id value is pointing to -1 also, it makes sense
> >> >>> to display the column in the output to replicate the behaviour
> >> >>> or to be in precedence with other aggr options(like per-socket,
> >> >>> per-core). Remove the check "id.core" so that CPU field gets
> >> >>> displayed in the output.
> >> >>
> >> >> Why would you want to print -1 out? Seems like the if statement was a
> >> >> good one to me, otherwise the output looks a bit broken to users. Are
> >> >> the other aggregation modes even wor

Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value

2022-10-04 Thread Ian Rogers
On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev 
wrote:

>
>
> > On 04-Oct-2022, at 12:21 AM, Ian Rogers  wrote:
> >
> > On Mon, Oct 3, 2022 at 7:03 AM atrajeev 
> wrote:
> >>
> >> On 2022-10-02 05:17, Ian Rogers wrote:
> >>> On Thu, Sep 29, 2022 at 5:56 AM James Clark 
> >>> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 29/09/2022 09:49, Athira Rajeev wrote:
> >>>>>
> >>>>>
> >>>>>> On 28-Sep-2022, at 9:05 PM, James Clark 
> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> Hi James,
> >>>>>
> >>>>> Thanks for looking at the patch and sharing review comments.
> >>>>>
> >>>>>> On 13/09/2022 12:57, Athira Rajeev wrote:
> >>>>>>> perf stat includes option to specify aggr_mode to display
> >>>>>>> per-socket, per-core, per-die, per-node counter details.
> >>>>>>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
> >>>>>>> counter values are displayed for each cpu along with "CPU"
> >>>>>>> value in one field of the output.
> >>>>>>>
> >>>>>>> Each of the aggregate mode uses the information fetched
> >>>>>>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
> >>>>>>
> >>>>>> I thought that this wouldn't apply to the cpu field because cpu is
> >>>>>> basically interchangeable as an index in cpumap, rather than
> anything
> >>>>>> being read from the topology file.
> >>>>>
> >>>>> The cpu value is filled in this function:
> >>>>>
> >>>>> Function : aggr_cpu_id__cpu
> >>>>> Code: util/cpumap.c
> >>>>>
> >>>>>>
> >>>>>>> physical_package_id. Utility functions in "cpumap.c" fetches
> >>>>>>> this information and populates the socket id, core id, cpu etc.
> >>>>>>> If the platform does not expose the topology information,
> >>>>>>> these values will be set to -1. Example, in case of powerpc,
> >>>>>>> details like physical_package_id is restricted to be exposed
> >>>>>>> in pSeries platform. So id.socket, id.core, id.cpu all will
> >>>>>>> be set as -1.
> >>>>>>>
> >>>>>>> In case of displaying socket or die value, there is no check
> >>>>>>> done in the "aggr_printout" function to see if it points to
> >>>>>>> valid socket id or die. But for displaying "cpu" value, there
> >>>>>>> is a check for "if (id.core > -1)". In case of powerpc pSeries
> >>>>>>> where detail like physical_package_id is restricted to be
> >>>>>>> exposed, id.core will be set to -1. Hence the column or field
> >>>>>>> itself for CPU won't be displayed in the output.
> >>>>>>>
> >>>>>>> Result for per-socket:
> >>>>>>>
> >>>>>>> <<>>
> >>>>>>> perf stat -e branches --per-socket -a true
> >>>>>>>
> >>>>>>> Performance counter stats for 'system wide':
> >>>>>>>
> >>>>>>> S-1  32416,851  branches
> >>>>>>> <<>>
> >>>>>>>
> >>>>>>> Here S has -1 in above result. But with -A option which also
> >>>>>>> expects CPU in one column in the result, below is observed.
> >>>>>>>
> >>>>>>> <<>>
> >>>>>>> /bin/perf stat -e instructions -A -a true
> >>>>>>>
> >>>>>>> Performance counter stats for 'system wide':
> >>>>>>>
> >>>>>>>   47,146  instructions
> >>>>>>>   45,226  instructions
> >>>>>>>   43,354  instructions
> >>>>>>>   45,184  instructions
> >>>>>>> <<>>

Re: [PATCH] perf test record+probe_libc_inet_pton: Fix call chain match on powerpc

2023-11-28 Thread Ian Rogers
> probe libc's inet_pton & backtrace it with ping: Ok
> >
> > Signed-off-by: Likhitha Korrapati 
> > Reported-by: Disha Goel 
>
> Thanks for the fix patch.
> I have tested on a Power10 machine, "probe libc's inet_pton & backtrace it 
> with ping"
> perf test passes with the patch applied.
>
> Output where gaih_inet function is not present
>
> # perf test -v "probe libc's inet_pton & backtrace it with ping"
>  85: probe libc's inet_pton & backtrace it with ping :
> --- start ---
> test child forked, pid 4622
> ping 4652 [011] 58.987631: probe_libc:inet_pton: (7fff91b79a60)
> 7fff91b79a60 __GI___inet_pton+0x0 
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
> 7fff91b2a73c getaddrinfo+0x121c 
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
> 119e53534 [unknown] (/usr/bin/ping)
> test child finished with 0
>  end 
> probe libc's inet_pton & backtrace it with ping: Ok
>
> Output where gaih_inet function is present
>
> # ./perf test -v "probe libc's inet_pton & backtrace it with ping"
>  83: probe libc's inet_pton & backtrace it with ping :
> --- start ---
> test child forked, pid 84831
> ping 84861 [000] 79056.019971: probe_libc:inet_pton: (7fff957631e8)
> 7fff957631e8 __GI___inet_pton+0x8 
> (/usr/lib64/glibc-hwcaps/power9/libc-2.28.so)
> 7fff95718760 gaih_inet.constprop.6+0xa90 
> (/usr/lib64/glibc-hwcaps/power9/libc-2.28.so)
> 7fff95719974 getaddrinfo+0x164 
> (/usr/lib64/glibc-hwcaps/power9/libc-2.28.so)
> 122e732a4 [unknown] (/usr/bin/ping)
> test child finished with 0
>  end 
> probe libc's inet_pton & backtrace it with ping: Ok
>
> Tested-by: Disha Goel 

Reviewed-by: Ian Rogers 
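
For anyone reading the two outputs above side by side, the conditional in the patch amounts to the following sketch. It is written in Python purely for illustration; the function name is hypothetical, the symbol list stands in for the output of "nm $libc", and unlike the shell script it escapes the libc path before putting it in a regex.

```python
import re


def expected_ppc64_frames(libc_symbols, libc_path):
    """Build the expected libc frames of the ppc64 backtrace.

    gaih_inet is only expected when libc actually has such a symbol,
    which is what the 'nm $libc | grep -F -q gaih_inet.' check decides.
    """
    libc = re.escape(libc_path)
    frames = []
    if any("gaih_inet." in sym for sym in libc_symbols):
        frames.append(rf"gaih_inet.*\+0x[0-9a-fA-F]+\s\({libc}\)$")
    frames.append(rf"getaddrinfo\+0x[0-9a-fA-F]+\s\({libc}\)$")
    return frames
```

With a libc that has gaih_inet.constprop.6 this yields two patterns, and with one that does not (the Power10 output above) only the getaddrinfo pattern remains.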

Thanks,
Ian

>
> > ---
> >   tools/perf/tests/shell/record+probe_libc_inet_pton.sh | 5 -
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh 
> > b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
> > index eebeea6bdc76..72c65570db37 100755
> > --- a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
> > +++ b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
> > @@ -45,7 +45,10 @@ trace_libc_inet_pton_backtrace() {
> >   ;;
> >   ppc64|ppc64le)
> >   eventattr='max-stack=4'
> > - echo "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\($libc\)$" >> 
> > $expected
> > + # Add gaih_inet to expected backtrace only if it is part of 
> > libc.
> > + if nm $libc | grep -F -q gaih_inet.; then
> > + echo 
> > "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\($libc\)$" >> $expected
> > + fi
> >   echo "getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\($libc\)$" >> 
> > $expected
> >   echo 
> > ".*(\+0x[[:xdigit:]]+|\[unknown\])[[:space:]]\(.*/bin/ping.*\)$" >> 
> > $expected
> >   ;;


Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events

2023-12-04 Thread Ian Rogers
On Thu, Nov 23, 2023 at 8:01 AM Athira Rajeev
 wrote:
>
> Running "perf list" on powerpc fails with segfault
> as below:
>
>./perf list
>Segmentation fault (core dumped)
>
> This happens because of duplicate events in the json list.
> The powerpc Json event list contains some event with same
> event name, but different event code. They are:
> - PM_INST_FROM_L3MISS (Present in datasource and frontend)
> - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
> - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
> - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)
>
> pmu_events_table__num_events uses the value from
> table_pmu->num_entries which includes duplicate events as
> well. This causes issue during "perf list" and results in
> segmentation fault.
>
> Since both event codes are valid, append _DSRC to the Data
> Source events (datasource.json), so that they would have a
> unique name. Also add PM_DATA_FROM_L2MISS_DSRC and
> PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list
> works as expected.
>
> Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events")
> Signed-off-by: Athira Rajeev 

Given that duplicate events create a broken pmu-events.c, we should
capture that as an exception in jevents.py. That way a JEVENTS_ARCH=all
build will fail if any vendor/architecture would break in this way. We
should also add JEVENTS_ARCH=all to tools/perf/tests/make. Athira, do
you want to look at doing this?
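
A rough sketch of the kind of check meant here, assuming events are available as a list of dicts keyed like the JSON files. This is not the actual jevents.py code: the function name is made up, and the second event code in the usage example is a placeholder, not the real frontend code.

```python
from collections import Counter


def check_unique_event_names(events):
    """Raise ValueError if any EventName appears more than once.

    'events' is assumed to be the merged list of JSON event dicts for one
    model, e.g. datasource.json plus frontend.json.
    """
    counts = Counter(event["EventName"] for event in events)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError("duplicate event names: " + ", ".join(duplicates))


events = [
    {"EventName": "PM_INST_FROM_L3MISS", "EventCode": "0x00078000C040"},
    # Placeholder code: stands in for the same name defined in another file.
    {"EventName": "PM_INST_FROM_L3MISS", "EventCode": "0x0"},
]
try:
    check_unique_event_names(events)
except ValueError as err:
    print(err)
```

Raising at generation time turns the "perf list" segfault into a build failure that names the offending events.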

Thanks,
Ian

> ---
>  .../arch/powerpc/power10/datasource.json   | 18 ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json 
> b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
> index 6b0356f2d301..0eeaaf1a95b8 100644
> --- a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
> +++ b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
> @@ -99,6 +99,11 @@
>  "EventName": "PM_INST_FROM_L2MISS",
>  "BriefDescription": "The processor's instruction cache was reloaded from 
> a source beyond the local core's L2 due to a demand miss."
>},
> +  {
> +"EventCode": "0x0003C000C040",
> +"EventName": "PM_DATA_FROM_L2MISS_DSRC",
> +"BriefDescription": "The processor's L1 data cache was reloaded from a 
> source beyond the local core's L2 due to a demand miss."
> +  },
>{
>  "EventCode": "0x00038010C040",
>  "EventName": "PM_INST_FROM_L2MISS_ALL",
> @@ -161,9 +166,14 @@
>},
>{
>  "EventCode": "0x00078000C040",
> -"EventName": "PM_INST_FROM_L3MISS",
> +"EventName": "PM_INST_FROM_L3MISS_DSRC",
>  "BriefDescription": "The processor's instruction cache was reloaded from 
> beyond the local core's L3 due to a demand miss."
>},
> +  {
> +"EventCode": "0x0007C000C040",
> +"EventName": "PM_DATA_FROM_L3MISS_DSRC",
> +"BriefDescription": "The processor's L1 data cache was reloaded from 
> beyond the local core's L3 due to a demand miss."
> +  },
>{
>  "EventCode": "0x00078010C040",
>  "EventName": "PM_INST_FROM_L3MISS_ALL",
> @@ -981,7 +991,7 @@
>},
>{
>  "EventCode": "0x0003C000C142",
> -"EventName": "PM_MRK_DATA_FROM_L2MISS",
> +"EventName": "PM_MRK_DATA_FROM_L2MISS_DSRC",
>  "BriefDescription": "The processor's L1 data cache was reloaded from a 
> source beyond the local core's L2 due to a demand miss for a marked 
> instruction."
>},
>{
> @@ -1046,12 +1056,12 @@
>},
>{
>  "EventCode": "0x00078000C142",
> -"EventName": "PM_MRK_INST_FROM_L3MISS",
> +"EventName": "PM_MRK_INST_FROM_L3MISS_DSRC",
>  "BriefDescription": "The processor's instruction cache was reloaded from 
> beyond the local core's L3 due to a demand miss for a marked instruction."
>},
>{
>  "EventCode": "0x0007C000C142",
> -"EventName": "PM_MRK_DATA_FROM_L3MISS",
> +"EventName": "PM_MRK_DATA_FROM_L3MISS_DSRC",
>  "BriefDescription": "The processor's L1 data cache was reloaded from 
> beyond the local core's L3 due to a demand miss for a marked instruction."
>},
>{
> --
> 2.39.3
>


Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events

2023-12-04 Thread Ian Rogers
On Mon, Dec 4, 2023 at 12:22 PM Arnaldo Carvalho de Melo
 wrote:
>
> Em Mon, Dec 04, 2023 at 05:20:46PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Mon, Dec 04, 2023 at 12:12:54PM -0800, Ian Rogers escreveu:
> > > On Thu, Nov 23, 2023 at 8:01 AM Athira Rajeev
> > >  wrote:
> > > >
> > > > Running "perf list" on powerpc fails with segfault
> > > > as below:
> > > >
> > > >./perf list
> > > >Segmentation fault (core dumped)
> > > >
> > > > This happens because of duplicate events in the json list.
> > > > The powerpc Json event list contains some event with same
> > > > event name, but different event code. They are:
> > > > - PM_INST_FROM_L3MISS (Present in datasource and frontend)
> > > > - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
> > > > - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
> > > > - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)
> > > >
> > > > pmu_events_table__num_events uses the value from
> > > > table_pmu->num_entries which includes duplicate events as
> > > > well. This causes issue during "perf list" and results in
> > > > segmentation fault.
> > > >
> > > > Since both event codes are valid, append _DSRC to the Data
> > > > Source events (datasource.json), so that they would have a
> > > > unique name. Also add PM_DATA_FROM_L2MISS_DSRC and
> > > > PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list
> > > > works as expected.
> > > >
> > > > Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events")
> > > > Signed-off-by: Athira Rajeev 
> > >
> > > Given that duplicate events create a broken pmu-events.c, we should
> > > capture that as an exception in jevents.py. That way a JEVENTS_ARCH=all
> > > build will fail if any vendor/architecture would break in this way. We
> > > should also add JEVENTS_ARCH=all to tools/perf/tests/make. Athira, do
> > > you want to look at doing this?
> >
> > Should I go ahead and remove this patch till this is sorted out?
>
> I'll keep it, it's already in tmp.perf-tools-next, we can go from there
> and improve this with follow up patches,

Agreed. I could look to do the follow up but likely won't have a
chance for a while. If others could help out it would be great. I'd
like to have the jevents and json be robust enough that we don't trip
over problems like this and the somewhat similar AmpereOne issue.

Thanks,
Ian

> - Arnaldo


Re: [PATCH V4] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-12-05 Thread Ian Rogers
On Tue, Dec 5, 2023 at 1:50 PM Arnaldo Carvalho de Melo  wrote:
>
> Em Mon, Nov 27, 2023 at 11:12:57AM +, James Clark escreveu:
> > On 23/11/2023 16:02, Athira Rajeev wrote:
> > > --- a/tools/perf/Makefile.perf
> > > @@ -1134,6 +1152,7 @@ bpf-skel-clean:
> > > $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
> > >
> > >  clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean 
> > > $(LIBSYMBOL)-clean $(LIBPERF)-clean fixdep-clean python-clean 
> > > bpf-skel-clean tests-coresight-targets-clean
> > > +   $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean
> > > $(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) 
> > > $(OUTPUT)perf-archive $(OUTPUT)perf-iostat $(LANG_BINDINGS)
> > > $(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' 
> > > -delete -o -name '\.*.d' -delete
> > > $(Q)$(RM) $(OUTPUT).config-detected
>
> While merging perf-tools-next with torvalds/master I noticed that maybe
> we better have the above added line as:
>
> +   $(call QUIET_CLEAN, tests) $(Q)$(MAKE) -f 
> $(srctree)/tools/perf/tests/Makefile.tests clean
>
> No?
>
> Anyway I'm merging as-is, but it just hit my eye while merging,
>
> - Arnaldo

Makefile.tests was removed in these recent patches adding support for
the OUTPUT directory:
https://lore.kernel.org/lkml/9c33887f-8a88-4973-8593-7936e36af...@linux.vnet.ibm.com/

Thanks,
Ian


Re: [PATCH v2 0/9] jevents/pmu-events improvements

2023-01-19 Thread Ian Rogers
On Wed, Dec 21, 2022 at 2:34 PM Ian Rogers  wrote:
>
> Add an optimization to jevents using the metric code, rewrite metrics
> in terms of each other in order to minimize size and improve
> readability. For example, on Power8
> other_stall_cpi is rewritten from:
> "PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / 
> PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU 
> / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - 
> PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / 
> PM_RUN_INST_CMPL"
> to:
> "stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - 
> lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
> Which more closely matches the definition on Power9.
>
> A limitation of the substitutions are that they depend on strict
> equality and the shape of the tree. This means that for "a + b + c"
> then a substitution of "a + b" will succeed while "b + c" will fail
> (the LHS for "+ c" is "a + b" not just "b").
>
> Separate out the events and metrics in the pmu-events tables saving
> 14.8% in the table size while making it that metrics no longer need to
> iterate over all events and vice versa. These changes remove evsel's
> direct metric support as the pmu_event no longer has a metric to
> populate it. This is a minor issue as the code wasn't working
> properly, metrics for this are rare and can still be properly ran
> using '-M'.
>
> Add an ability to just build certain models into the jevents generated
> pmu-metrics.c code. This functionality is appropriate for operating
> systems like ChromeOS, that aim to minimize binary size and know all
> the target CPU models.
>
> v2. Rebase. Modify the code that skips rewriting a metric with the
> same name with itself, to make the name check case insensitive.
>
> Ian Rogers (9):
>   perf jevents metric: Correct Function equality
>   perf jevents metric: Add ability to rewrite metrics in terms of others
>   perf jevents: Rewrite metrics in the same file with each other
>   perf pmu-events: Separate metric out of pmu_event
>   perf stat: Remove evsel metric_name/expr
>   perf jevents: Combine table prefix and suffix writing
>   perf pmu-events: Introduce pmu_metrics_table
>   perf jevents: Generate metrics and events as separate tables
>   perf jevents: Add model list option

Ping. Looking for reviews.

Thanks,
Ian

>  tools/perf/arch/arm64/util/pmu.c |  23 +-
>  tools/perf/arch/powerpc/util/header.c|   4 +-
>  tools/perf/builtin-list.c|  20 +-
>  tools/perf/builtin-stat.c|   1 -
>  tools/perf/pmu-events/Build  |   3 +-
>  tools/perf/pmu-events/empty-pmu-events.c | 111 ++-
>  tools/perf/pmu-events/jevents.py | 353 ++-
>  tools/perf/pmu-events/metric.py  |  79 -
>  tools/perf/pmu-events/metric_test.py |  10 +
>  tools/perf/pmu-events/pmu-events.h   |  26 +-
>  tools/perf/tests/expand-cgroup.c |   4 +-
>  tools/perf/tests/parse-metric.c  |   4 +-
>  tools/perf/tests/pmu-events.c|  68 ++---
>  tools/perf/util/cgroup.c |   1 -
>  tools/perf/util/evsel.c  |   2 -
>  tools/perf/util/evsel.h  |   2 -
>  tools/perf/util/metricgroup.c| 203 +++--
>  tools/perf/util/metricgroup.h|   4 +-
>  tools/perf/util/parse-events.c   |   2 -
>  tools/perf/util/pmu.c|  44 +--
>  tools/perf/util/pmu.h|  10 +-
>  tools/perf/util/print-events.c   |  32 +-
>  tools/perf/util/print-events.h   |   3 +-
>  tools/perf/util/python.c |   7 -
>  tools/perf/util/stat-shadow.c| 112 ---
>  tools/perf/util/stat.h   |   1 -
>  26 files changed, 666 insertions(+), 463 deletions(-)
>
> --
> 2.39.0.314.g84b9a713c41-goog
>


Re: [PATCH v2 4/9] perf pmu-events: Separate metric out of pmu_event

2023-01-23 Thread Ian Rogers
On Mon, Jan 23, 2023 at 7:16 AM John Garry  wrote:
>
> On 21/12/2022 22:34, Ian Rogers wrote:
> > Previously both events and metrics were encoded in struct
> > pmu_event. Create a new pmu_metric that has the metric related
> > variables and remove these from pmu_event. Add iterators for
> > pmu_metric and use in places that metrics are desired rather than
> > events.
> >
> > Note, this change removes the setting of evsel's metric_name/expr as
> > these fields are no longer part of struct pmu_event. The metric
> > remains but is no longer implicitly requested when the event is. This
> > impacts a few Intel uncore events, however, as the ScaleUnit is shared
> > by the event and the metric this utility is questionable. Also the
> > MetricNames look broken (contain spaces) in some cases and when trying
> > to use the functionality with '-e' the metrics fail but regular
> > metrics with '-M' work. For example, on SkylakeX '-M' works:
> >
>
> I like this change. It's quite large for a single patch. Just some
> sparse comments below.
>
> BTW, is there a better name for metric struct variable than "pm"? To me
> and many other people, pm is power management.

Agreed. There are a few things like that in the code, I dislike the
overload on "core". I've left it as pm as pmu_event became pe, so it
is consistent for pmu_metric to become pm. We can do a global rename
as a follow up.

> > ```
> > $ perf stat -M LLC_MISSES.PCIE_WRITE -a sleep 1
> >
> >   Performance counter stats for 'system wide':
> >
> >   0  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 #  57896.0 
> > Bytes  LLC_MISSES.PCIE_WRITE  (49.84%)
> >   7,174  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1
> > (49.85%)
> >   0  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3
> > (50.16%)
> >  63  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0
> > (50.15%)
> >
> > 1.004576381 seconds time elapsed
> > ```
> >
> > whilst the event '-e' version is broken even with --group/-g (fwiw, we 
> > should also remove -g [1]):
> >
> > ```
> > $ perf stat -g -e LLC_MISSES.PCIE_WRITE -g -a sleep 1
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
> > expression for LLC_MISSES.PCIE_WRITE
> > Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART

Re: [PATCH v2 7/9] perf pmu-events: Introduce pmu_metrics_table

2023-01-23 Thread Ian Rogers
On Mon, Jan 23, 2023 at 7:36 AM John Garry  wrote:
>
> On 21/12/2022 22:34, Ian Rogers wrote:
> > Add a metrics table that is just a cast from pmu_events_table. This
> > changes the APIs so that event and metric usage of the underlying
> > table is different. Later changes will separate the tables.
> >
> > This introduction fixes a NO_JEVENTS=1 regression on:
> >   68: Parse and process metrics   : Ok
> >   70: Event expansion for cgroups : Ok
> > caused by the necessary test metrics not being found.
> >
>
> I have just checked some of this code so far...
>
> > Signed-off-by: Ian Rogers 
> > ---
> >   tools/perf/arch/arm64/util/pmu.c | 23 ++-
> >   tools/perf/pmu-events/empty-pmu-events.c | 52 
> >   tools/perf/pmu-events/jevents.py | 24 ---
> >   tools/perf/pmu-events/pmu-events.h   | 10 +++--
> >   tools/perf/tests/expand-cgroup.c |  4 +-
> >   tools/perf/tests/parse-metric.c  |  4 +-
> >   tools/perf/tests/pmu-events.c|  5 ++-
> >   tools/perf/util/metricgroup.c| 50 +++
> >   tools/perf/util/metricgroup.h|  2 +-
> >   tools/perf/util/pmu.c|  9 +++-
> >   tools/perf/util/pmu.h|  1 +
> >   11 files changed, 133 insertions(+), 51 deletions(-)
> >
> > diff --git a/tools/perf/arch/arm64/util/pmu.c 
> > b/tools/perf/arch/arm64/util/pmu.c
> > index 477e513972a4..f8ae479a06db 100644
> > --- a/tools/perf/arch/arm64/util/pmu.c
> > +++ b/tools/perf/arch/arm64/util/pmu.c
> > @@ -19,7 +19,28 @@ const struct pmu_events_table 
> > *pmu_events_table__find(void)
> >   if (pmu->cpus->nr != cpu__max_cpu().cpu)
> >   return NULL;
> >
> > - return perf_pmu__find_table(pmu);
> > + return perf_pmu__find_events_table(pmu);
> > + }
> > +
> > + return NULL;
> > +}
> > +
> > +const struct pmu_metrics_table *pmu_metrics_table__find(void)
> > +{
> > + struct perf_pmu *pmu = NULL;
> > +
> > + while ((pmu = perf_pmu__scan(pmu))) {
> > + if (!is_pmu_core(pmu->name))
> > + continue;
> > +
> > + /*
> > +  * The cpumap should cover all CPUs. Otherwise, some CPUs may
> > +  * not support some events or have different event IDs.
> > +  */
> > + if (pmu->cpus->nr != cpu__max_cpu().cpu)
> > + return NULL;
> > +
> > + return perf_pmu__find_metrics_table(pmu);
>
> I think that this code will be conflicting with the recent arm64 metric
> support. And now it seems even more scope for factoring out code.

v3 will rebase and fix.

> >   }
> >
> >   return NULL;
> > diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
> > b/tools/perf/pmu-events/empty-pmu-events.c
> > index 5572a4d1eddb..d50f60a571dd 100644
> > --- a/tools/perf/pmu-events/empty-pmu-events.c
> > +++ b/tools/perf/pmu-events/empty-pmu-events.c
> > @@ -278,14 +278,12 @@ int pmu_events_table_for_each_event(const struct 
> > pmu_events_table *table, pmu_ev
> >   return 0;
> >   }
> >
> > -int pmu_events_table_for_each_metric(const struct pmu_events_table 
> > *etable, pmu_metric_iter_fn fn,
> > -  void *data)
> > +int pmu_metrics_table_for_each_metric(const struct pmu_metrics_table 
> > *table, pmu_metric_iter_fn fn,
> > +   void *data)
> >   {
> > - struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
> > -
> >   for (const struct pmu_metric *pm = &table->entries[0]
>
> nit on coding style: do we normally declare local variables like this?
> It condenses the code but makes a bit less readable, IMHO

The main reason to do it is to reduce the scope of pm to just be the
loop body. There's some discussion relating to this to do with the
move to C11:
https://lwn.net/Articles/885941/

> > ; pm->metric_group || pm->metric_name;
> >pm++) {
> > - int ret = fn(pm, etable, data);
> > + int ret = fn(pm, table, data);
> >
> >   if (ret)
> >   return ret;
> > @@ -293,7 +291,7 @@ int pmu_events_table_for_each_metric(const struct 
> > pmu_events_table *etable, pmu_
> >   return 0;
> 

Re: [PATCH v2 8/9] perf jevents: Generate metrics and events as separate tables

2023-01-23 Thread Ian Rogers
On Mon, Jan 23, 2023 at 7:18 AM John Garry  wrote:
>
> On 21/12/2022 22:34, Ian Rogers wrote:
> > Turn a perf json event into an event, metric or both. This reduces the
> > number of events needed to scan to find an event or metric. As events
> > no longer need the relatively seldom used metric fields, 4 bytes is
> > saved per event. This reduces the big C string's size by 335kb (14.8%)
> > on x86.
> >
> > Signed-off-by: Ian Rogers
>
> It would have been good to show an example of how the output changes. I
> could not apply the series (see cover), and knowing what to expect makes
> reviewing the code easier...
>
> Thanks,
> John

Thanks, will add in v3.

Ian


Re: [PATCH v2 0/9] jevents/pmu-events improvements

2023-01-23 Thread Ian Rogers
On Mon, Jan 23, 2023 at 5:26 AM John Garry  wrote:
>
> On 21/12/2022 22:34, Ian Rogers wrote:
> > Add an optimization to jevents using the metric code, rewrite metrics
> > in terms of each other in order to minimize size and improve
> > readability. For example, on Power8
> > other_stall_cpi is rewritten from:
> > "PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / 
> > PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - 
> > PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / 
> > PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - 
> > PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
> > to:
> > "stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - 
> > lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
> > Which more closely matches the definition on Power9.
> >
> > A limitation of the substitutions are that they depend on strict
> > equality and the shape of the tree. This means that for "a + b + c"
> > then a substitution of "a + b" will succeed while "b + c" will fail
> > (the LHS for "+ c" is "a + b" not just "b").
> >
> > Separate out the events and metrics in the pmu-events tables saving
> > 14.8% in the table size while making it that metrics no longer need to
> > iterate over all events and vice versa. These changes remove evsel's
> > direct metric support as the pmu_event no longer has a metric to
> > populate it. This is a minor issue as the code wasn't working
> > properly, metrics for this are rare and can still be properly ran
> > using '-M'.
> >
> > Add an ability to just build certain models into the jevents generated
> > pmu-metrics.c code. This functionality is appropriate for operating
> > systems like ChromeOS, that aim to minimize binary size and know all
> > the target CPU models.
>
>  From a glance, this does not look like it would work for arm64. As I
> see in the code, we check the model in the arch folder for the test to
> see if built. For arm64, as it uses arch/implementator/model folder org,
> and not just arch/model (like x86)
>
> So on the assumption that it does not work for arm64 (or just any arch
> which uses arch/implementator/model folder org), it would be nice to
> have that feature also. Or maybe also support not just specifying model
> but also implementator.

Hmm.. this is tricky as x86 isn't following the implementor pattern. I
will tweak the comment for the ARM64 case where --model will select an
implementor.

> >
> > v2. Rebase. Modify the code that skips rewriting a metric with the
> >  same name with itself, to make the name check case insensitive.
> >
>
>
> Unfortunately you might need another rebase as this does not apply to
> acme perf/core (if that is what you want), now for me at:
>
> 5670ebf54bd2 (HEAD, origin/tmp.perf/core, origin/perf/core, perf/core)
> perf cs-etm: Ensure that Coresight timestamps don't go backwards

Will do, thanks!
Ian

> > Ian Rogers (9):
> >perf jevents metric: Correct Function equality
> >perf jevents metric: Add ability to rewrite metrics in terms of others
> >perf jevents: Rewrite metrics in the same file with each other
> >perf pmu-events: Separate metric out of pmu_event
> >perf stat: Remove evsel metric_name/expr
> >perf jevents: Combine table prefix and suffix writing
> >perf pmu-events: Introduce pmu_metrics_table
> >perf jevents: Generate metrics and events as separate tables
> >perf jevents: Add model list option
>
> Thanks,
> John


[PATCH v3 00/11] jevents/pmu-events improvements

2023-01-23 Thread Ian Rogers
Add an optimization to jevents using the metric code, rewrite metrics
in terms of each other in order to minimize size and improve
readability. For example, on Power8
other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / 
PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi 
- ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.

A limitation of the substitutions is that they depend on strict
equality and the shape of the tree. This means that for "a + b + c"
then a substitution of "a + b" will succeed while "b + c" will fail
(the LHS for "+ c" is "a + b" not just "b").
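
The tree-shape limitation can be illustrated with a minimal sketch. The
Event and Add classes and the substitute helper below are hypothetical
stand-ins written for this illustration, not the classes in metric.py;
the only assumption taken from the text above is that "a + b + c" parses
left-associatively as (a + b) + c.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Event:
  """Leaf node: a named event or metric (hypothetical stand-in)."""
  name: str


@dataclass(frozen=True)
class Add:
  """Binary '+' node (hypothetical stand-in)."""
  lhs: 'Node'
  rhs: 'Node'


Node = Union[Event, Add]


def substitute(tree: Node, name: str, pattern: Node) -> Node:
  """Replace every subtree strictly equal to pattern with Event(name)."""
  if tree == pattern:
    return Event(name)
  if isinstance(tree, Add):
    return Add(substitute(tree.lhs, name, pattern),
               substitute(tree.rhs, name, pattern))
  return tree


a, b, c = Event('a'), Event('b'), Event('c')
# "a + b + c" as a left-associative tree: (a + b) + c.
expr = Add(Add(a, b), c)

# "a + b" is a subtree of expr, so it is replaced by the name 'ab'.
ab_result = substitute(expr, 'ab', Add(a, b))
# "b + c" is not a subtree of expr, so expr comes back unchanged.
bc_result = substitute(expr, 'bc', Add(b, c))
```

Here ab_result is Add(Event('ab'), Event('c')) while bc_result equals
expr, because no node in the tree has "b" as its left child and "c" as
its right child.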

Separate out the events and metrics in the pmu-events tables saving
14.8% in the table size while making it that metrics no longer need to
iterate over all events and vice versa. These changes remove evsel's
direct metric support as the pmu_event no longer has a metric to
populate it. This is a minor issue as the code wasn't working
properly, metrics for this are rare and can still be properly run
using '-M'.

Add an ability to just build certain models into the jevents generated
pmu-metrics.c code. This functionality is appropriate for operating
systems like ChromeOS, that aim to minimize binary size and know all
the target CPU models.

v3. Rebase and incorporate review comments from John Garry, in
particular breaking apart patch 4 into 3 patches. The no jevents
breakage and then later fix is avoided in this series too.
v2. Rebase. Modify the code that skips rewriting a metric with the
same name with itself, to make the name check case insensitive.

Ian Rogers (11):
  perf jevents metric: Correct Function equality
  perf jevents metric: Add ability to rewrite metrics in terms of others
  perf jevents: Rewrite metrics in the same file with each other
  perf pmu-events: Add separate metric from pmu_event
  perf pmu-events: Separate the metrics from events for no jevents
  perf pmu-events: Remove now unused event and metric variables
  perf stat: Remove evsel metric_name/expr
  perf jevents: Combine table prefix and suffix writing
  perf pmu-events: Introduce pmu_metrics_table
  perf jevents: Generate metrics and events as separate tables
  perf jevents: Add model list option

 tools/perf/arch/arm64/util/pmu.c |  11 +-
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/builtin-list.c|  20 +-
 tools/perf/builtin-stat.c|   1 -
 tools/perf/pmu-events/Build  |   3 +-
 tools/perf/pmu-events/empty-pmu-events.c | 108 ++-
 tools/perf/pmu-events/jevents.py | 350 +++
 tools/perf/pmu-events/metric.py  |  79 -
 tools/perf/pmu-events/metric_test.py |  10 +
 tools/perf/pmu-events/pmu-events.h   |  26 +-
 tools/perf/tests/expand-cgroup.c |   4 +-
 tools/perf/tests/parse-metric.c  |   4 +-
 tools/perf/tests/pmu-events.c|  68 ++---
 tools/perf/util/cgroup.c |   1 -
 tools/perf/util/evsel.c  |   2 -
 tools/perf/util/evsel.h  |   2 -
 tools/perf/util/metricgroup.c| 203 +++--
 tools/perf/util/metricgroup.h|   4 +-
 tools/perf/util/parse-events.c   |   2 -
 tools/perf/util/pmu.c|  44 +--
 tools/perf/util/pmu.h|  10 +-
 tools/perf/util/print-events.c   |  32 +--
 tools/perf/util/print-events.h   |   3 +-
 tools/perf/util/python.c |   7 -
 tools/perf/util/stat-shadow.c| 112 
 tools/perf/util/stat.h   |   1 -
 26 files changed, 650 insertions(+), 461 deletions(-)

-- 
2.39.0.246.g2a6d74b583-goog



[PATCH v3 01/11] perf jevents metric: Correct Function equality

2023-01-23 Thread Ian Rogers
rhs may not be defined, say for source_count, so add a guard.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/metric.py | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 4797ed4fd817..2f2fd220e843 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -261,8 +261,10 @@ class Function(Expression):
 
   def Equals(self, other: Expression) -> bool:
 if isinstance(other, Function):
-  return self.fn == other.fn and self.lhs.Equals(
-  other.lhs) and self.rhs.Equals(other.rhs)
+  result = self.fn == other.fn and self.lhs.Equals(other.lhs)
+  if self.rhs:
+result = result and self.rhs.Equals(other.rhs)
+  return result
 return False
 
 
-- 
2.39.0.246.g2a6d74b583-goog



[PATCH v3 02/11] perf jevents metric: Add ability to rewrite metrics in terms of others

2023-01-23 Thread Ian Rogers
Add RewriteMetricsInTermsOfOthers that iterates over pairs of names
and expressions trying to replace an expression, within the current
expression, with its name.
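
As a rough sketch of the pairwise idea, and not the code in this patch,
the loop below models expressions as nested tuples such as
('/', 'A', 'B'). The tuple encoding, the function names and the choice
to skip bare-event patterns are all assumptions of the sketch; the
case-insensitive self-check follows the v2 note in the cover letter.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical encoding: a leaf is an event name, an inner node is
# (operator, operand, operand, ...).
Expr = Union[str, tuple]


def substitute(tree: Expr, name: str, pattern: Expr) -> Expr:
  """Replace every subtree equal to pattern with the metric name."""
  if tree == pattern:
    return name
  if isinstance(tree, tuple):
    op, *args = tree
    return (op, *(substitute(arg, name, pattern) for arg in args))
  return tree


def rewrite_in_terms_of_others(
    metrics: List[Tuple[str, Expr]]) -> Dict[str, Expr]:
  """Return only the metrics whose expression became shorter."""
  updates: Dict[str, Expr] = {}
  for outer_name, outer_expr in metrics:
    updated = outer_expr
    for inner_name, inner_expr in metrics:
      # Never rewrite a metric in terms of itself (case-insensitive).
      if inner_name.lower() == outer_name.lower():
        continue
      # Sketch-only choice: a bare event is not worth substituting.
      if not isinstance(inner_expr, tuple):
        continue
      updated = substitute(updated, inner_name, inner_expr)
    if updated != outer_expr:
      updates[outer_name] = updated
  return updates


metrics = [
    ('stall_cpi', ('/', 'PM_CMPLU_STALL', 'PM_RUN_INST_CMPL')),
    ('fxu_stall_cpi', ('/', 'PM_CMPLU_STALL_FXU', 'PM_RUN_INST_CMPL')),
    ('other_stall_cpi',
     ('-', ('/', 'PM_CMPLU_STALL', 'PM_RUN_INST_CMPL'),
      ('/', 'PM_CMPLU_STALL_FXU', 'PM_RUN_INST_CMPL'))),
]
updates = rewrite_in_terms_of_others(metrics)
```

With this input only other_stall_cpi changes, becoming
('-', 'stall_cpi', 'fxu_stall_cpi'); the two simpler metrics contain no
other metric as a subtree and are left out of the result.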

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/metric.py  | 73 +++-
 tools/perf/pmu-events/metric_test.py | 10 
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 2f2fd220e843..ed13efac7389 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -4,7 +4,7 @@ import ast
 import decimal
 import json
 import re
-from typing import Dict, List, Optional, Set, Union
+from typing import Dict, List, Optional, Set, Tuple, Union
 
 
 class Expression:
@@ -26,6 +26,9 @@ class Expression:
 """Returns true when two expressions are the same."""
 raise NotImplementedError()
 
+  def Substitute(self, name: str, expression: 'Expression') -> 'Expression':
+raise NotImplementedError()
+
   def __str__(self) -> str:
 return self.ToPerfJson()
 
@@ -186,6 +189,15 @@ class Operator(Expression):
   other.lhs) and self.rhs.Equals(other.rhs)
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+lhs = self.lhs.Substitute(name, expression)
+rhs = None
+if self.rhs:
+  rhs = self.rhs.Substitute(name, expression)
+return Operator(self.operator, lhs, rhs)
+
 
 class Select(Expression):
   """Represents a select ternary in the parse tree."""
@@ -225,6 +237,14 @@ class Select(Expression):
   other.false_val) and self.true_val.Equals(other.true_val)
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+true_val = self.true_val.Substitute(name, expression)
+cond = self.cond.Substitute(name, expression)
+false_val = self.false_val.Substitute(name, expression)
+return Select(true_val, cond, false_val)
+
 
 class Function(Expression):
   """A function in an expression like min, max, d_ratio."""
@@ -267,6 +287,15 @@ class Function(Expression):
   return result
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+lhs = self.lhs.Substitute(name, expression)
+rhs = None
+if self.rhs:
+  rhs = self.rhs.Substitute(name, expression)
+return Function(self.fn, lhs, rhs)
+
 
 def _FixEscapes(s: str) -> str:
   s = re.sub(r'([^\\]),', r'\1\\,', s)
@@ -293,6 +322,9 @@ class Event(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Event) and self.name == other.name
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 class Constant(Expression):
   """A constant within the expression tree."""
@@ -317,6 +349,9 @@ class Constant(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Constant) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 class Literal(Expression):
   """A runtime literal within the expression tree."""
@@ -336,6 +371,9 @@ class Literal(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Literal) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 def min(lhs: Union[int, float, Expression], rhs: Union[int, float,
                                                        Expression]) -> Function:
@@ -461,6 +499,7 @@ class MetricGroup:
 
 
 class _RewriteIfExpToSelect(ast.NodeTransformer):
+  """Transformer to convert if-else nodes to Select expressions."""
 
   def visit_IfExp(self, node):
 # pylint: disable=invalid-name
@@ -498,7 +537,37 @@ def ParsePerfJson(orig: str) -> Expression:
   for kw in keywords:
 py = re.sub(rf'Event\(r"{kw}"\)', kw, py)
 
-  parsed = ast.parse(py, mode='eval')
+  try:
+parsed = ast.parse(py, mode='eval')
+  except SyntaxError as e:
+raise SyntaxError(f'Parsing expression:\n{orig}') from e
   _RewriteIfExpToSelect().visit(parsed)
   parsed = ast.fix_missing_locations(parsed)
   return _Constify(eval(compile(parsed, orig, 'eval')))
+
+
+def RewriteMetricsInTermsOfOthers(metrics: list[Tuple[str, Expression]]
+  )-> Dict[str, Expression]:
+  """Shorten metrics by rewriting in terms of others.
+
+  Args:
+metrics 

[PATCH v3 03/11] perf jevents: Rewrite metrics in the same file with each other

2023-01-23 Thread Ian Rogers
Rewrite metrics within the same file in terms of each other. For example, on 
Power8
other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / 
PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi 
- ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.

To avoid recomputation decorate the function with a cache.
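
For readers unfamiliar with the decorator, here is a small standalone
sketch of how functools.lru_cache avoids re-reading a file. The
read_events function and the temporary events.json file are
hypothetical and exist only for this illustration.

```python
from functools import lru_cache
import json
import os
import tempfile


@lru_cache(maxsize=None)
def read_events(path: str) -> tuple:
  """Parse a JSON file once per distinct path (hypothetical helper)."""
  with open(path) as f:
    return tuple(json.load(f))


with tempfile.TemporaryDirectory() as tmp:
  path = os.path.join(tmp, 'events.json')
  with open(path, 'w') as f:
    json.dump([{'EventName': 'l3_cache_rd'}], f)
  first = read_events(path)   # Cache miss: the file is opened and parsed.
  second = read_events(path)  # Cache hit: the stored result is returned.

info = read_events.cache_info()
```

Two properties matter when applying this: the arguments form the cache
key and so must be hashable, and every caller receives the same cached
object, so a mutation made by one caller is visible to the others.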

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 0416b7442171..15a1671740cc 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -3,6 +3,7 @@
 """Convert directories of JSON events to C code."""
 import argparse
 import csv
+from functools import lru_cache
 import json
 import metric
 import os
@@ -337,18 +338,28 @@ class JsonEvent:
 s = self.build_c_string()
 return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
-
+@lru_cache(maxsize=None)
 def read_json_events(path: str, topic: str) -> Sequence[JsonEvent]:
   """Read json events from the specified file."""
-
   try:
-result = json.load(open(path), object_hook=JsonEvent)
+events = json.load(open(path), object_hook=JsonEvent)
   except BaseException as err:
 print(f"Exception processing {path}")
 raise
-  for event in result:
+  metrics: list[Tuple[str, metric.Expression]] = []
+  for event in events:
 event.topic = topic
-  return result
+if event.metric_name and '-' not in event.metric_name:
+  metrics.append((event.metric_name, event.metric_expr))
+  updates = metric.RewriteMetricsInTermsOfOthers(metrics)
+  if updates:
+for event in events:
+  if event.metric_name in updates:
+# print(f'Updated {event.metric_name} from\n"{event.metric_expr}"\n'
+#   f'to\n"{updates[event.metric_name]}"')
+event.metric_expr = updates[event.metric_name]
+
+  return events
 
 def preprocess_arch_std_files(archpath: str) -> None:
   """Read in all architecture standard events."""
-- 
2.39.0.246.g2a6d74b583-goog



[PATCH v3 04/11] perf pmu-events: Add separate metric from pmu_event

2023-01-23 Thread Ian Rogers
Create a new pmu_metric for the metric related variables from
pmu_event but that is initially just a clone of pmu_event. Add
iterators for pmu_metric and use in places that metrics are desired
rather than events. Make the event iterator skip metric only events,
and the metric iterator skip event only events.
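
The skipping behaviour can be modelled in a few lines. This is a Python
model for illustration only, not the C iterators in the diff below: the
Entry type, the list-based table and the empty sentinel entry are
assumptions that mirror the "name || metric_expr" loop condition.

```python
from typing import Callable, List, NamedTuple, Optional


class Entry(NamedTuple):
  """One shared table row; either field may be unset (hypothetical)."""
  name: Optional[str] = None
  metric_expr: Optional[str] = None


def for_each_event(table: List[Entry], fn: Callable[[Entry], int]) -> int:
  for entry in table:
    if not entry.name and not entry.metric_expr:
      break  # Sentinel: an all-empty entry terminates the table.
    if not entry.name:
      continue  # Metric-only entry, not an event.
    ret = fn(entry)
    if ret:
      return ret  # A non-zero return stops the iteration early.
  return 0


def for_each_metric(table: List[Entry], fn: Callable[[Entry], int]) -> int:
  for entry in table:
    if not entry.name and not entry.metric_expr:
      break
    if not entry.metric_expr:
      continue  # Event-only entry, not a metric.
    ret = fn(entry)
    if ret:
      return ret
  return 0


table = [
    Entry(name='l3_cache_rd'),
    Entry(metric_expr='1 / IPC'),
    Entry(name='both', metric_expr='a / b'),
    Entry(),  # Sentinel.
]

events: List[str] = []
metric_exprs: List[str] = []


def collect_event(entry: Entry) -> int:
  events.append(entry.name)
  return 0


def collect_metric(entry: Entry) -> int:
  metric_exprs.append(entry.metric_expr)
  return 0


for_each_event(table, collect_event)
for_each_metric(table, collect_metric)
```

An entry carrying both a name and a metric expression is visited by
both iterators, which matches the table still being shared at this
point in the series.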

Signed-off-by: Ian Rogers 
---
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/pmu-events/empty-pmu-events.c |  49 ++-
 tools/perf/pmu-events/jevents.py |  62 -
 tools/perf/pmu-events/pmu-events.h   |  26 
 tools/perf/tests/pmu-events.c|  35 +++--
 tools/perf/util/metricgroup.c| 161 +++
 tools/perf/util/metricgroup.h|   2 +-
 7 files changed, 228 insertions(+), 111 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c 
b/tools/perf/arch/powerpc/util/header.c
index e8fe36b10d20..78eef77d8a8d 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -40,11 +40,11 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
return bufp;
 }
 
-int arch_get_runtimeparam(const struct pmu_event *pe)
+int arch_get_runtimeparam(const struct pmu_metric *pm)
 {
int count;
char path[PATH_MAX] = "/devices/hv_24x7/interface/";
 
-   atoi(pe->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, "coresperchip");
+   atoi(pm->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, "coresperchip");
return sysfs__read_int(path, &count) < 0 ? 1 : count;
 }
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 480e8f0d30c8..4e39d1a8d6d6 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -181,6 +181,11 @@ struct pmu_events_table {
const struct pmu_event *entries;
 };
 
+/* Struct used to make the PMU metric table implementation opaque to callers. */
+struct pmu_metrics_table {
+   const struct pmu_metric *entries;
+};
+
 /*
  * Map a CPU to its table of PMU events. The CPU is identified by the
  * cpuid field, which is an arch-specific identifier for the CPU.
@@ -254,11 +259,29 @@ static const struct pmu_sys_events pmu_sys_event_tables[] 
= {
 int pmu_events_table_for_each_event(const struct pmu_events_table *table, 
pmu_event_iter_fn fn,
void *data)
 {
-   for (const struct pmu_event *pe = &table->entries[0];
-pe->name || pe->metric_group || pe->metric_name;
-pe++) {
-   int ret = fn(pe, table, data);
+   for (const struct pmu_event *pe = &table->entries[0]; pe->name || 
pe->metric_expr; pe++) {
+   int ret;
 
+   if (!pe->name)
+   continue;
+   ret = fn(pe, table, data);
+   if (ret)
+   return ret;
+   }
+   return 0;
+}
+
+int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, 
pmu_metric_iter_fn fn,
+void *data)
+{
+   struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
+
+   for (const struct pmu_metric *pm = &table->entries[0]; pm->name || 
pm->metric_expr; pm++) {
+   int ret;
+
+   if (!pm->metric_expr)
+   continue;
+   ret = fn(pm, etable, data);
if (ret)
return ret;
}
@@ -305,11 +328,22 @@ const struct pmu_events_table 
*find_core_events_table(const char *arch, const ch
 }
 
 int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data)
+{
+   for (const struct pmu_events_map *tables = &pmu_events_map[0]; 
tables->arch; tables++) {
+   int ret = pmu_events_table_for_each_event(&tables->table, fn, 
data);
+
+   if (ret)
+   return ret;
+   }
+   return 0;
+}
+
+int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data)
 {
for (const struct pmu_events_map *tables = &pmu_events_map[0];
 tables->arch;
 tables++) {
-   int ret = pmu_events_table_for_each_event(&tables->table, fn, 
data);
+   int ret = pmu_events_table_for_each_metric(&tables->table, fn, 
data);
 
if (ret)
return ret;
@@ -340,3 +374,8 @@ int pmu_for_each_sys_event(pmu_event_iter_fn fn, void *data)
}
return 0;
 }
+
+int pmu_for_each_sys_metric(pmu_metric_iter_fn fn __maybe_unused, void *data 
__maybe_unused)
+{
+   return 0;
+}
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 15a1671740cc..858787a12302 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -564,7 +564,19 @@ static const struct 

[PATCH v3 05/11] perf pmu-events: Separate the metrics from events for no jevents

2023-01-23 Thread Ian Rogers
Separate the event and metric table when building without jevents. Add
find_core_metrics_table and perf_pmu__find_metrics_table while
renaming existing utilities to be event specific, so that users can
find the right table for their need.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/empty-pmu-events.c | 88 ++--
 tools/perf/pmu-events/jevents.py |  7 +-
 tools/perf/pmu-events/pmu-events.h   |  4 +-
 tools/perf/tests/expand-cgroup.c |  2 +-
 tools/perf/tests/parse-metric.c  |  2 +-
 tools/perf/util/pmu.c|  4 +-
 6 files changed, 79 insertions(+), 28 deletions(-)

diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 4e39d1a8d6d6..10bd4943ebf8 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-static const struct pmu_event pme_test_soc_cpu[] = {
+static const struct pmu_event pmu_events__test_soc_cpu[] = {
{
.name = "l3_cache_rd",
.event = "event=0x40",
@@ -105,6 +105,14 @@ static const struct pmu_event pme_test_soc_cpu[] = {
.desc = "L2 BTB Correction",
.topic = "branch",
},
+   {
+   .name = 0,
+   .event = 0,
+   .desc = 0,
+   },
+};
+
+static const struct pmu_metric pmu_metrics__test_soc_cpu[] = {
{
.metric_expr= "1 / IPC",
.metric_name= "CPI",
@@ -170,9 +178,8 @@ static const struct pmu_event pme_test_soc_cpu[] = {
.metric_name= "L1D_Cache_Fill_BW",
},
{
-   .name = 0,
-   .event = 0,
-   .desc = 0,
+   .metric_expr = 0,
+   .metric_name = 0,
},
 };
 
@@ -197,7 +204,8 @@ struct pmu_metrics_table {
 struct pmu_events_map {
const char *arch;
const char *cpuid;
-   const struct pmu_events_table table;
+   const struct pmu_events_table event_table;
+   const struct pmu_metrics_table metric_table;
 };
 
 /*
@@ -208,12 +216,14 @@ static const struct pmu_events_map pmu_events_map[] = {
{
.arch = "testarch",
.cpuid = "testcpu",
-   .table = { pme_test_soc_cpu },
+   .event_table = { pmu_events__test_soc_cpu },
+   .metric_table = { pmu_metrics__test_soc_cpu },
},
{
.arch = 0,
.cpuid = 0,
-   .table = { 0 },
+   .event_table = { 0 },
+   .metric_table = { 0 },
},
 };
 
@@ -259,12 +269,9 @@ static const struct pmu_sys_events pmu_sys_event_tables[] 
= {
 int pmu_events_table_for_each_event(const struct pmu_events_table *table, 
pmu_event_iter_fn fn,
void *data)
 {
-   for (const struct pmu_event *pe = &table->entries[0]; pe->name || 
pe->metric_expr; pe++) {
-   int ret;
+   for (const struct pmu_event *pe = &table->entries[0]; pe->name; pe++) {
+   int ret = fn(pe, table, data);
 
-   if (!pe->name)
-   continue;
-   ret = fn(pe, table, data);
if (ret)
return ret;
}
@@ -276,19 +283,44 @@ int pmu_events_table_for_each_metric(const struct 
pmu_events_table *etable, pmu_
 {
struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
 
-   for (const struct pmu_metric *pm = &table->entries[0]; pm->name || 
pm->metric_expr; pm++) {
-   int ret;
+   for (const struct pmu_metric *pm = &table->entries[0]; pm->metric_expr; 
pm++) {
+   int ret = fn(pm, etable, data);
 
-   if (!pm->metric_expr)
-   continue;
-   ret = fn(pm, etable, data);
if (ret)
return ret;
}
return 0;
 }
 
-const struct pmu_events_table *perf_pmu__find_table(struct perf_pmu *pmu)
+const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu 
*pmu)
+{
+   const struct pmu_events_table *table = NULL;
+   char *cpuid = perf_pmu__getcpuid(pmu);
+   int i;
+
+   /* on some platforms which uses cpus map, cpuid can be NULL for
+* PMUs other than CORE PMUs.
+*/
+   if (!cpuid)
+   return NULL;
+
+   i = 0;
+   for (;;) {
+   const struct pmu_events_map *map = &pmu_events_map[i++];
+
+   if (!map->cpuid)
+   break;
+
+   if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
+   table = &map->event_table;
+   break;
+   }
+   }
+   free(cpuid);

[PATCH v3 06/11] perf pmu-events: Remove now unused event and metric variables

2023-01-23 Thread Ian Rogers
Previous changes separated the uses of pmu_event and pmu_metric,
however, both structures contained all the variables of event and
metric. This change removes the event variables from metric and the
metric variables from event.

Note, this change removes the setting of evsel's metric_name/expr as
these fields are no longer part of struct pmu_event. The metric
remains but is no longer implicitly requested when the event is. This
impacts a few Intel uncore events; however, as the ScaleUnit is shared
by the event and the metric, the utility of this is questionable. Also the
MetricNames look broken (contain spaces) in some cases and when trying
to use the functionality with '-e' the metrics fail but regular
metrics with '-M' work. For example, on SkylakeX '-M' works:

```
$ perf stat -M LLC_MISSES.PCIE_WRITE -a sleep 1

 Performance counter stats for 'system wide':

 0  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 #  57896.0 
Bytes  LLC_MISSES.PCIE_WRITE  (49.84%)
 7,174  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 
   (49.85%)
 0  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 
   (50.16%)
63  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 
   (50.15%)

   1.004576381 seconds time elapsed
```

whilst the event '-e' version is broken even with --group/-g (fwiw, we should 
also remove -g [1]):

```
$ perf stat -g -e LLC_MISSES.PCIE_WRITE -g -a sleep 1
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE

 Performance counter stats for 'system wide':

27,316 Bytes LLC_MISSES.PCIE_WRITE

   1.004505469 seconds time elapsed
```

The code also carries warnings where the user is supposed to select
events for metrics [2] but given the lack of use of such a feature,
let's clean up the code and just remove it.

[1] https://lore.kernel.org/lkml/20220707195610.303254-1-irog...@google.com/
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/stat-shadow.c?id=01b8957b738f42f96a130079bc951b3cc78c5b8a#n425

Signed-off-by: Ian Rogers 
---
 tools/perf/builtin-list.c  | 20 ++---
 tools/perf/pmu-events/jevents.py   | 20 +
 tools/perf/

[PATCH v3 07/11] perf stat: Remove evsel metric_name/expr

2023-01-23 Thread Ian Rogers
Metrics are their own unit and these variables held broken metrics
previously and now just hold the value NULL. Remove code that used
these variables.

Reviewed-by: John Garry 
Signed-off-by: Ian Rogers 
---
 tools/perf/builtin-stat.c |   1 -
 tools/perf/util/cgroup.c  |   1 -
 tools/perf/util/evsel.c   |   2 -
 tools/perf/util/evsel.h   |   2 -
 tools/perf/util/python.c  |   7 ---
 tools/perf/util/stat-shadow.c | 112 --
 tools/perf/util/stat.h|   1 -
 7 files changed, 126 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9f3e4b257516..5d18a5a6f662 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2524,7 +2524,6 @@ int cmd_stat(int argc, const char **argv)
&stat_config.metric_events);
zfree(&metrics);
}
-   perf_stat__collect_metric_expr(evsel_list);
perf_stat__init_shadow_stats();
 
if (add_default_attributes())
diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index cd978c240e0d..bfb13306d82c 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -481,7 +481,6 @@ int evlist__expand_cgroup(struct evlist *evlist, const char 
*str,
nr_cgroups++;
 
if (metric_events) {
-   perf_stat__collect_metric_expr(tmp_list);
if (metricgroup__copy_metric_events(tmp_list, cgrp,
metric_events,

&orig_metric_events) < 0)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8550638587e5..a90e998826e0 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -285,8 +285,6 @@ void evsel__init(struct evsel *evsel,
evsel->sample_size = __evsel__sample_size(attr->sample_type);
evsel__calc_id_pos(evsel);
evsel->cmdline_group_boundary = false;
-   evsel->metric_expr   = NULL;
-   evsel->metric_name   = NULL;
evsel->metric_events = NULL;
evsel->per_pkg_mask  = NULL;
evsel->collect_stat  = false;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d572be41b960..24cb807ef6ce 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -105,8 +105,6 @@ struct evsel {
 * metric fields are similar, but needs more care as they can have
 * references to other metric (evsel).
 */
-   const char *metric_expr;
-   const char *metric_name;
struct evsel**metric_events;
struct evsel*metric_leader;
 
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 9e5d881b0987..42e8b813d010 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -76,13 +76,6 @@ const char *perf_env__arch(struct perf_env *env 
__maybe_unused)
return NULL;
 }
 
-/*
- * Add this one here not to drag util/stat-shadow.c
- */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-}
-
 /*
  * These ones are needed not to drag the PMU bandwagon, jevents generated
  * pmu_sys_event_tables, etc and evsel__find_pmu() is used so far just for
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index cadb2df23c87..35ea4813f468 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -346,114 +346,6 @@ static const char *get_ratio_color(enum grc_type type, 
double ratio)
return color;
 }
 
-static struct evsel *perf_stat__find_event(struct evlist *evsel_list,
-   const char *name)
-{
-   struct evsel *c2;
-
-   evlist__for_each_entry (evsel_list, c2) {
-   if (!strcasecmp(c2->name, name) && !c2->collect_stat)
-   return c2;
-   }
-   return NULL;
-}
-
-/* Mark MetricExpr target events and link events using them to them. */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-   struct evsel *counter, *leader, **metric_events, *oc;
-   bool found;
-   struct expr_parse_ctx *ctx;
-   struct hashmap_entry *cur;
-   size_t bkt;
-   int i;
-
-   ctx = expr__ctx_new();
-   if (!ctx) {
-   pr_debug("expr__ctx_new failed");
-   return;
-   }
-   evlist__for_each_entry(evsel_list, counter) {
-   bool invalid = false;
-
-   leader = evsel__leader(counter);
-   if (!counter->metric_expr)
-   continue;
-
-   expr__ctx_clear(ctx);
-   metric_events = counter->metric_events;
-   if (!metric_events) {
-   if (expr__find_ids(counter->metric_expr,
-

[PATCH v3 08/11] perf jevents: Combine table prefix and suffix writing

2023-01-23 Thread Ian Rogers
Combine into a single function to simplify, in a later change, writing
metrics separately.
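
A minimal sketch of the resulting shape, assuming a hypothetical
TableWriter class in place of the module-level globals that jevents.py
actually uses. Entries accumulate under a pending table name, and a
single function writes the prefix, the sorted entries and the suffix
together, so there is no separate "table is open" flag to keep in sync:

```python
import io
from typing import List, Optional, TextIO


class TableWriter:
  """Hypothetical stand-in for jevents.py's module-level pending state."""

  def __init__(self, out: TextIO) -> None:
    self.out = out
    self.pending: List[str] = []
    self.pending_tblname: Optional[str] = None

  def start_table(self, tblname: str) -> None:
    # Flush whatever the previous table collected, then remember the new name.
    self.print_pending()
    self.pending_tblname = tblname

  def add_entry(self, entry: str) -> None:
    self.pending.append(entry)

  def print_pending(self) -> None:
    # Nothing collected: write nothing, so no empty table is emitted.
    if not self.pending:
      return
    self.out.write(
        f'static const struct compact_pmu_event {self.pending_tblname}[] = {{\n')
    for entry in sorted(self.pending):
      self.out.write(f'{entry},\n')
    self.pending = []
    self.out.write('};\n\n')


out = io.StringIO()
writer = TableWriter(out)
writer.start_table('pme_test_soc_cpu')
writer.add_entry('{ 8 }')
writer.add_entry('{ 0 }')
writer.start_table('pme_empty_model')  # flushes the first table
writer.print_pending()                 # second table has no entries
generated = out.getvalue()
```

The table names and entry strings here are placeholders chosen for the
sketch, not output of the real generator.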

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 36 +---
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 4cdbf34b7298..5f8d490c7269 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -19,10 +19,10 @@ _sys_event_tables = []
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
-# Track whether an events table is currently being defined and needs closing.
-_close_table = False
 # Events to write out when the table is closed
 _pending_events = []
+# Name of table to be written out
+_pending_events_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
@@ -378,24 +378,13 @@ def preprocess_arch_std_files(archpath: str) -> None:
   _arch_std_events[event.metric_name.lower()] = event
 
 
-def print_events_table_prefix(tblname: str) -> None:
-  """Called when a new events table is started."""
-  global _close_table
-  if _close_table:
-raise IOError('Printing table prefix but last table has no suffix')
-  _args.output_file.write(f'static const struct compact_pmu_event {tblname}[] 
= {{\n')
-  _close_table = True
-
-
 def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
   """Add contents of file to _pending_events table."""
-  if not _close_table:
-raise IOError('Table entries missing prefix')
   for e in read_json_events(item.path, topic):
 _pending_events.append(e)
 
 
-def print_events_table_suffix() -> None:
+def print_pending_events() -> None:
   """Optionally close events table."""
 
   def event_cmp_key(j: JsonEvent) -> Tuple[bool, str, str, str, str]:
@@ -407,17 +396,19 @@ def print_events_table_suffix() -> None:
 return (j.desc is not None, fix_none(j.topic), fix_none(j.name), 
fix_none(j.pmu),
 fix_none(j.metric_name))
 
-  global _close_table
-  if not _close_table:
+  global _pending_events
+  if not _pending_events:
 return
 
-  global _pending_events
+  global _pending_events_tblname
+  _args.output_file.write(
+  f'static const struct compact_pmu_event {_pending_events_tblname}[] = 
{{\n')
+
   for event in sorted(_pending_events, key=event_cmp_key):
 _args.output_file.write(event.to_c_string())
-_pending_events = []
+  _pending_events = []
 
   _args.output_file.write('};\n\n')
-  _close_table = False
 
 def get_topic(topic: str) -> str:
   if topic.endswith('metrics.json'):
@@ -455,12 +446,13 @@ def process_one_file(parents: Sequence[str], item: 
os.DirEntry) -> None:
 
   # model directory, reset topic
   if item.is_dir() and is_leaf_dir(item.path):
-print_events_table_suffix()
+print_pending_events()
 
 tblname = file_name_to_table_name(parents, item.name)
 if item.name == 'sys':
   _sys_event_tables.append(tblname)
-print_events_table_prefix(tblname)
+global _pending_events_tblname
+_pending_events_tblname = tblname
 return
 
   # base dir or too deep
@@ -809,7 +801,7 @@ struct compact_pmu_event {
   for arch in archs:
 arch_path = f'{_args.starting_dir}/{arch}'
 ftw(arch_path, [], process_one_file)
-print_events_table_suffix()
+print_pending_events()
 
   print_mapping_table(archs)
   print_system_mapping_table()
-- 
2.39.0.246.g2a6d74b583-goog



[PATCH v3 09/11] perf pmu-events: Introduce pmu_metrics_table

2023-01-23 Thread Ian Rogers
Add a metrics table that is just a cast from pmu_events_table. This
changes the APIs so that event and metric usage of the underlying
table is different. For the no jevents case the tables are already
separate; later changes will separate the tables for the jevents case.

Signed-off-by: Ian Rogers 
---
 tools/perf/arch/arm64/util/pmu.c | 11 +-
 tools/perf/pmu-events/empty-pmu-events.c | 21 +-
 tools/perf/pmu-events/jevents.py | 21 +++---
 tools/perf/pmu-events/pmu-events.h   | 10 +++--
 tools/perf/tests/expand-cgroup.c |  2 +-
 tools/perf/tests/parse-metric.c  |  2 +-
 tools/perf/tests/pmu-events.c|  5 ++-
 tools/perf/util/metricgroup.c| 50 
 tools/perf/util/metricgroup.h|  2 +-
 tools/perf/util/pmu.c|  5 +++
 tools/perf/util/pmu.h|  1 +
 11 files changed, 76 insertions(+), 54 deletions(-)

diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
index 801bf52e2ea6..2779840d8896 100644
--- a/tools/perf/arch/arm64/util/pmu.c
+++ b/tools/perf/arch/arm64/util/pmu.c
@@ -22,7 +22,14 @@ static struct perf_pmu *pmu__find_core_pmu(void)
return NULL;
 
return pmu;
-   }
+}
+
+const struct pmu_metrics_table *pmu_metrics_table__find(void)
+{
+   struct perf_pmu *pmu = pmu__find_core_pmu();
+
+   if (pmu)
+   return perf_pmu__find_metrics_table(pmu);
 
return NULL;
 }
@@ -32,7 +39,7 @@ const struct pmu_events_table *pmu_events_table__find(void)
struct perf_pmu *pmu = pmu__find_core_pmu();
 
if (pmu)
-   return perf_pmu__find_table(pmu);
+   return perf_pmu__find_events_table(pmu);
 
return NULL;
 }
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 10bd4943ebf8..a938b74cf487 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -278,13 +278,11 @@ int pmu_events_table_for_each_event(const struct 
pmu_events_table *table, pmu_ev
return 0;
 }
 
-int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, 
pmu_metric_iter_fn fn,
-void *data)
+int pmu_metrics_table_for_each_metric(const struct pmu_metrics_table *table, 
pmu_metric_iter_fn fn,
+ void *data)
 {
-   struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
-
for (const struct pmu_metric *pm = &table->entries[0]; pm->metric_expr; 
pm++) {
-   int ret = fn(pm, etable, data);
+   int ret = fn(pm, table, data);
 
if (ret)
return ret;
@@ -320,9 +318,9 @@ const struct pmu_events_table 
*perf_pmu__find_events_table(struct perf_pmu *pmu)
return table;
 }
 
-const struct pmu_events_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
+const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
 {
-   const struct pmu_events_table *table = NULL;
+   const struct pmu_metrics_table *table = NULL;
char *cpuid = perf_pmu__getcpuid(pmu);
int i;
 
@@ -340,7 +338,7 @@ const struct pmu_events_table 
*perf_pmu__find_metrics_table(struct perf_pmu *pmu
break;
 
if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
-   table = (const struct pmu_events_table 
*)&map->metric_table;
+   table = &map->metric_table;
break;
}
}
@@ -359,13 +357,13 @@ const struct pmu_events_table 
*find_core_events_table(const char *arch, const ch
return NULL;
 }
 
-const struct pmu_events_table *find_core_metrics_table(const char *arch, const 
char *cpuid)
+const struct pmu_metrics_table *find_core_metrics_table(const char *arch, 
const char *cpuid)
 {
for (const struct pmu_events_map *tables = &pmu_events_map[0];
 tables->arch;
 tables++) {
if (!strcmp(tables->arch, arch) && 
!strcmp_cpuid_str(tables->cpuid, cpuid))
-   return (const struct pmu_events_table 
*)&tables->metric_table;
+   return &tables->metric_table;
}
return NULL;
 }
@@ -386,8 +384,7 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void 
*data)
for (const struct pmu_events_map *tables = &pmu_events_map[0];
 tables->arch;
 tables++) {
-   int ret = pmu_events_table_for_each_metric(
-   (const struct pmu_events_table *)&tables->metric_table, 
fn, data);
+   int ret = 
pmu_metrics_table_for_each_metric(&tables->metric_table, fn, data);
 
if (ret)
return ret;
diff --git a/too

[PATCH v3 10/11] perf jevents: Generate metrics and events as separate tables

2023-01-23 Thread Ian Rogers
Turn a perf json event into an event, metric or both. This reduces the
number of events needed to scan to find an event or metric. As events
no longer need the relatively seldom used metric fields, 4 bytes are
saved per event. This reduces the big C string's size by 335kb (14.8%)
on x86.
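
The "event, metric or both" split can be sketched as below. This is an
illustration only: split_entries is a hypothetical helper, the third
input entry is made up to show the "both" case, and the real generator
works on JsonEvent objects rather than plain dictionaries.

```python
from typing import Dict, List, Optional, Tuple

# Keys follow perf JSON naming (EventName, EventCode, MetricName,
# MetricExpr); the helper itself is hypothetical.
JsonDict = Dict[str, Optional[str]]


def split_entries(
    entries: List[JsonDict]) -> Tuple[List[JsonDict], List[JsonDict]]:
  """Return (events, metrics); an entry with both kinds of field lands in both."""
  events = []
  metrics = []
  for jd in entries:
    if jd.get('EventName'):
      # Event records carry no metric fields at all.
      events.append({'name': jd['EventName'], 'event': jd.get('EventCode')})
    if jd.get('MetricName'):
      metrics.append({'metric_name': jd['MetricName'],
                      'metric_expr': jd.get('MetricExpr')})
  return events, metrics


events, metrics = split_entries([
    {'EventName': 'l3_cache_rd', 'EventCode': '0x40'},
    {'MetricName': 'CPI', 'MetricExpr': '1 / IPC'},
    {'EventName': 'both', 'EventCode': '0x1',
     'MetricName': 'both_metric', 'MetricExpr': 'both * 2'},
])
```

Because each list holds only one kind of record, a lookup for a metric
never has to step over events, and the reverse.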

Note, for the test PMU architecture pme_test_soc_cpu is renamed
pmu_events__test_soc_cpu for consistency with the event vs metric
naming convention.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 244 +++
 tools/perf/tests/pmu-events.c|   3 +-
 2 files changed, 189 insertions(+), 58 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index d83cc94af51f..627ee817f57f 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -13,28 +13,40 @@ import collections
 
 # Global command line arguments.
 _args = None
+# List of regular event tables.
+_event_tables = []
 # List of event tables generated from "/sys" directories.
 _sys_event_tables = []
+# List of regular metric tables.
+_metric_tables = []
+# List of metric tables generated from "/sys" directories.
+_sys_metric_tables = []
+# Mapping between sys event table names and sys metric table names.
+_sys_event_table_to_metric_table_mapping = {}
 # Map from an event name to an architecture standard
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
 # Events to write out when the table is closed
 _pending_events = []
-# Name of table to be written out
+# Name of events table to be written out
 _pending_events_tblname = None
+# Metrics to write out when the table is closed
+_pending_metrics = []
+# Name of metrics table to be written out
+_pending_metrics_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
 _json_event_attributes = [
 # cmp_sevent related attributes.
-'name', 'pmu', 'topic', 'desc', 'metric_name', 'metric_group',
+'name', 'pmu', 'topic', 'desc',
 # Seems useful, put it early.
 'event',
 # Short things in alphabetical order.
 'aggr_mode', 'compat', 'deprecated', 'perpkg', 'unit',
 # Longer things (the last won't be iterated over during decompress).
-'metric_constraint', 'metric_expr', 'long_desc'
+'long_desc'
 ]
 
 # Attributes that are in pmu_metric rather than pmu_event.
@@ -52,14 +64,16 @@ def removesuffix(s: str, suffix: str) -> str:
   return s[0:-len(suffix)] if s.endswith(suffix) else s
 
 
-def file_name_to_table_name(parents: Sequence[str], dirname: str) -> str:
+def file_name_to_table_name(prefix: str, parents: Sequence[str],
+dirname: str) -> str:
   """Generate a C table name from directory names."""
-  tblname = 'pme'
+  tblname = prefix
   for p in parents:
 tblname += '_' + p
   tblname += '_' + dirname
   return tblname.replace('-', '_')
 
+
 def c_len(s: str) -> int:
   """Return the length of s a C string
 
@@ -277,7 +291,7 @@ class JsonEvent:
 self.metric_constraint = jd.get('MetricConstraint')
 self.metric_expr = None
 if 'MetricExpr' in jd:
-   self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
+  self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
 
 arch_std = jd.get('ArchStdEvent')
 if precise and self.desc and '(Precise Event)' not in self.desc:
@@ -326,23 +340,24 @@ class JsonEvent:
 s += f'\t{attr} = {value},\n'
 return s + '}'
 
-  def build_c_string(self) -> str:
+  def build_c_string(self, metric: bool) -> str:
 s = ''
-for attr in _json_event_attributes:
+for attr in _json_metric_attributes if metric else _json_event_attributes:
   x = getattr(self, attr)
-  if x and attr == 'metric_expr':
+  if metric and x and attr == 'metric_expr':
 # Convert parsed metric expressions into a string. Slashes
 # must be doubled in the file.
 x = x.ToPerfJson().replace('\\', '')
   s += f'{x}\\000' if x else '\\000'
 return s
 
-  def to_c_string(self) -> str:
+  def to_c_string(self, metric: bool) -> str:
 """Representation of the event as a C struct initializer."""
 
-s = self.build_c_string()
+s = self.build_c_string(metric)
 return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
+
 @lru_cache(

[PATCH v3 11/11] perf jevents: Add model list option

2023-01-23 Thread Ian Rogers
This allows the set of generated jevents events and metrics to be limited
to a subset of the model names. Appropriate if trying to minimize the
binary size where only a set of models are possible. On ARM64 the
--model selects the implementor rather than model.
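
The filter condition can be sketched as a pure function. The helper
names and the directory list are hypothetical; only the condition itself
(applies to top-level directories, 'all' keeps everything, names
containing 'test' are always kept, otherwise the name must appear in the
comma-separated list) is taken from the patch:

```python
from typing import List


def keep_top_level_dir(name: str, model: str) -> bool:
  """Mirror of the patch's condition for one top-level model directory."""
  if model == 'all':
    return True
  # Directories with 'test' in the name are always kept.
  if 'test' in name:
    return True
  # Exact match against the comma-separated list, not a prefix match.
  return name in model.split(',')


def select_models(dirs: List[str], model: str) -> List[str]:
  return [d for d in dirs if keep_top_level_dir(d, model)]


dirs = ['skylake', 'skylakex', 'icelake', 'test']
only_skylake = select_models(dirs, 'skylake')
two_models = select_models(dirs, 'skylake,icelake')
```

Note that the match is on whole names, so selecting skylake does not
also pull in skylakex.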

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/Build  | 3 ++-
 tools/perf/pmu-events/jevents.py | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index 15b9e8fdbffa..a14de24ecb69 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -10,6 +10,7 @@ JEVENTS_PY=  pmu-events/jevents.py
 ifeq ($(JEVENTS_ARCH),)
 JEVENTS_ARCH=$(SRCARCH)
 endif
+JEVENTS_MODEL ?= all
 
 #
 # Locate/process JSON files in pmu-events/arch/
@@ -23,5 +24,5 @@ $(OUTPUT)pmu-events/pmu-events.c: 
pmu-events/empty-pmu-events.c
 else
 $(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JSON_TEST) $(JEVENTS_PY) 
pmu-events/metric.py
$(call rule_mkdir)
-   $(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) 
pmu-events/arch $@
+   $(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) 
$(JEVENTS_MODEL) pmu-events/arch $@
 endif
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 627ee817f57f..764720858950 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -888,12 +888,19 @@ def main() -> None:
   action: Callable[[Sequence[str], os.DirEntry], None]) -> None:
 """Replicate the directory/file walking behavior of C's file tree walk."""
 for item in os.scandir(path):
+  if (len(parents) == 0 and item.is_dir() and _args.model != 'all' and
+  'test' not in item.name and item.name not in _args.model.split(',')):
+continue
   action(parents, item)
   if item.is_dir():
 ftw(item.path, parents + [item.name], action)
 
   ap = argparse.ArgumentParser()
   ap.add_argument('arch', help='Architecture name like x86')
+  ap.add_argument('model', help='''Select a model such as skylake to
+reduce the code size.  Normally set to "all". For architectures like
+ARM64 with an implementor/model, this selects the implementor.''',
+  default='all')
   ap.add_argument(
   'starting_dir',
   type=dir_path,
-- 
2.39.0.246.g2a6d74b583-goog



[PATCH v4 00/12] jevents/pmu-events improvements

2023-01-25 Thread Ian Rogers
Add an optimization to jevents using the metric code: rewrite metrics
in terms of each other in order to minimize size and improve
readability. For example, on Power8 other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / 
PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi 
- ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.

A limitation of the substitutions is that they depend on strict
equality and the shape of the tree. This means that for "a + b + c"
then a substitution of "a + b" will succeed while "b + c" will fail
(the LHS for "+ c" is "a + b" not just "b").
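
The tree-shape dependence can be seen in a toy model. The Event and Add
classes and the substitute function below are hypothetical stand-ins
written for this illustration, not the classes in metric.py:

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Event:
  name: str


@dataclass(frozen=True)
class Add:
  lhs: 'Node'
  rhs: 'Node'


Node = Union[Event, Add]


def substitute(tree: Node, name: str, target: Node) -> Node:
  """Replace subtrees strictly equal to target with Event(name)."""
  if tree == target:
    return Event(name)
  if isinstance(tree, Add):
    return Add(substitute(tree.lhs, name, target),
               substitute(tree.rhs, name, target))
  return tree


a, b, c = Event('a'), Event('b'), Event('c')
# "a + b + c" parses left-associatively as (a + b) + c.
expr = Add(Add(a, b), c)

hit = substitute(expr, 'ab', Add(a, b))   # (a + b) is a real subtree
miss = substitute(expr, 'bc', Add(b, c))  # (b + c) never appears as a subtree
```

Here hit becomes "ab + c" while miss is left unchanged, because no node
of the tree has b as its left operand and c as its right operand.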

Separate out the events and metrics in the pmu-events tables, saving
14.8% in the table size while making it so that metrics no longer need
to iterate over all events and vice versa. These changes remove evsel's
direct metric support as the pmu_event no longer has a metric to
populate it. This is a minor issue as the code wasn't working
properly; metrics for this are rare and can still be properly run
using '-M'.

Add an ability to just build certain models into the jevents generated
pmu-events.c code. This functionality is appropriate for operating
systems like ChromeOS, that aim to minimize binary size and know all
the target CPU models.

v4. Better support the implementor/model style --model argument for
jevents.py. Add #slots test fix. On some patches add Reviewed-by
from John Garry and Kajol Jain.
v3. Rebase and incorporate review comments from John Garry, in
particular breaking apart patch 4 into 3 patches. The no jevents
breakage and then later fix is avoided in this series too.
v2. Rebase. Modify the code that skips rewriting a metric with the
same name with itself, to make the name check case insensitive.

Ian Rogers (12):
  perf jevents metric: Correct Function equality
  perf jevents metric: Add ability to rewrite metrics in terms of others
  perf jevents: Rewrite metrics in the same file with each other
  perf pmu-events: Add separate metric from pmu_event
  perf pmu-events: Separate the metrics from events for no jevents
  perf pmu-events: Remove now unused event and metric variables
  perf stat: Remove evsel metric_name/expr
  perf jevents: Combine table prefix and suffix writing
  perf pmu-events: Introduce pmu_metrics_table
  perf jevents: Generate metrics and events as separate tables
  perf jevents: Add model list option
  perf pmu-events: Fix testing with JEVENTS_ARCH=all

 tools/perf/arch/arm64/util/pmu.c |  11 +-
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/builtin-list.c|  20 +-
 tools/perf/builtin-stat.c|   1 -
 tools/perf/pmu-events/Build  |   3 +-
 tools/perf/pmu-events/empty-pmu-events.c | 108 ++-
 tools/perf/pmu-events/jevents.py | 357 +++
 tools/perf/pmu-events/metric.py  |  79 -
 tools/perf/pmu-events/metric_test.py |  10 +
 tools/perf/pmu-events/pmu-events.h   |  26 +-
 tools/perf/tests/expand-cgroup.c |   4 +-
 tools/perf/tests/parse-metric.c  |   4 +-
 tools/perf/tests/pmu-events.c|  69 ++---
 tools/perf/util/cgroup.c |   1 -
 tools/perf/util/evsel.c  |   2 -
 tools/perf/util/evsel.h  |   2 -
 tools/perf/util/expr.h   |   1 +
 tools/perf/util/expr.l   |   8 +-
 tools/perf/util/metricgroup.c| 207 +++--
 tools/perf/util/metricgroup.h|   4 +-
 tools/perf/util/parse-events.c   |   2 -
 tools/perf/util/pmu.c|  44 +--
 tools/perf/util/pmu.h|  10 +-
 tools/perf/util/print-events.c   |  32 +-
 tools/perf/util/print-events.h   |   3 +-
 tools/perf/util/python.c |   7 -
 tools/perf/util/stat-shadow.c| 112 ---
 tools/perf/util/stat.h   |   1 -
 28 files changed, 666 insertions(+), 466 deletions(-)

-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v4 01/12] perf jevents metric: Correct Function equality

2023-01-25 Thread Ian Rogers
rhs may not be defined, say for source_count, so add a guard.
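
A self-contained sketch of the guarded comparison, using simplified
stand-ins for the metric.py classes (the Expr base class and the Event
constructor here are assumptions made for the example). Without the
guard, a one-argument function would call Equals on None and raise an
AttributeError:

```python
from typing import Optional


class Expr:
  def Equals(self, other: 'Expr') -> bool:
    raise NotImplementedError()


class Event(Expr):
  def __init__(self, name: str) -> None:
    self.name = name

  def Equals(self, other: Expr) -> bool:
    return isinstance(other, Event) and self.name == other.name


class Function(Expr):
  def __init__(self, fn: str, lhs: Expr, rhs: Optional[Expr] = None) -> None:
    self.fn = fn
    self.lhs = lhs
    self.rhs = rhs

  def Equals(self, other: Expr) -> bool:
    if isinstance(other, Function):
      result = self.fn == other.fn and self.lhs.Equals(other.lhs)
      # Only compare rhs when this function actually has one.
      if self.rhs:
        result = result and self.rhs.Equals(other.rhs)
      return result
    return False


one_arg = Function('source_count', Event('cycles'))
same = Function('source_count', Event('cycles'))
two_arg = Function('min', Event('a'), Event('b'))
```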

Reviewed-by: Kajol Jain
Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/metric.py | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 4797ed4fd817..2f2fd220e843 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -261,8 +261,10 @@ class Function(Expression):
 
   def Equals(self, other: Expression) -> bool:
     if isinstance(other, Function):
-      return self.fn == other.fn and self.lhs.Equals(
-          other.lhs) and self.rhs.Equals(other.rhs)
+      result = self.fn == other.fn and self.lhs.Equals(other.lhs)
+      if self.rhs:
+        result = result and self.rhs.Equals(other.rhs)
+      return result
     return False
 
 
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v4 02/12] perf jevents metric: Add ability to rewrite metrics in terms of others

2023-01-25 Thread Ian Rogers
Add RewriteMetricsInTermsOfOthers that iterates over pairs of names
and expressions trying to replace an expression, within the current
expression, with its name.
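
The idea can be sketched on a toy tree. This is an illustration under stated assumptions, not the code in metric.py: here a str is an event name, a tuple is (operator, lhs, rhs), and substitute and rewrite_in_terms_of_others are hypothetical helper names.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical miniature tree: str = event name, tuple = (op, lhs, rhs).
Expr = Union[str, tuple]


def substitute(tree: Expr, name: str, expression: Expr) -> Expr:
  """Return tree with each subtree equal to expression replaced by name."""
  if isinstance(tree, str):
    return tree  # leaves are never replaced
  if tree == expression:
    return name
  op, lhs, rhs = tree
  return (op, substitute(lhs, name, expression),
          substitute(rhs, name, expression))


def rewrite_in_terms_of_others(
    metrics: List[Tuple[str, Expr]]) -> Dict[str, Expr]:
  """Try to shorten every metric using the names of the other metrics."""
  updates: Dict[str, Expr] = {}
  for name, expr in metrics:
    new_expr = expr
    for other_name, other_expr in metrics:
      if other_name != name:
        new_expr = substitute(new_expr, other_name, other_expr)
    if new_expr != expr:
      updates[name] = new_expr
  return updates


stall = ('/', 'PM_CMPLU_STALL', 'PM_RUN_INST_CMPL')
fxu = ('/', 'PM_CMPLU_STALL_FXU', 'PM_RUN_INST_CMPL')
updates = rewrite_in_terms_of_others([
    ('stall_cpi', stall),
    ('fxu_stall_cpi', fxu),
    ('other_stall_cpi', ('-', stall, fxu)),
])
print(updates)
```

Only metrics whose expression actually changed appear in the returned dictionary, so callers can skip the update step when it is empty.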

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/metric.py  | 73 +++-
 tools/perf/pmu-events/metric_test.py | 10 
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 2f2fd220e843..ed13efac7389 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -4,7 +4,7 @@ import ast
 import decimal
 import json
 import re
-from typing import Dict, List, Optional, Set, Union
+from typing import Dict, List, Optional, Set, Tuple, Union
 
 
 class Expression:
@@ -26,6 +26,9 @@ class Expression:
 """Returns true when two expressions are the same."""
 raise NotImplementedError()
 
+  def Substitute(self, name: str, expression: 'Expression') -> 'Expression':
+raise NotImplementedError()
+
   def __str__(self) -> str:
 return self.ToPerfJson()
 
@@ -186,6 +189,15 @@ class Operator(Expression):
   other.lhs) and self.rhs.Equals(other.rhs)
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+    if self.Equals(expression):
+      return Event(name)
+    lhs = self.lhs.Substitute(name, expression)
+    rhs = None
+    if self.rhs:
+      rhs = self.rhs.Substitute(name, expression)
+    return Operator(self.operator, lhs, rhs)
+
 
 class Select(Expression):
   """Represents a select ternary in the parse tree."""
@@ -225,6 +237,14 @@ class Select(Expression):
   other.false_val) and self.true_val.Equals(other.true_val)
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+    if self.Equals(expression):
+      return Event(name)
+    true_val = self.true_val.Substitute(name, expression)
+    cond = self.cond.Substitute(name, expression)
+    false_val = self.false_val.Substitute(name, expression)
+    return Select(true_val, cond, false_val)
+
 
 class Function(Expression):
   """A function in an expression like min, max, d_ratio."""
@@ -267,6 +287,15 @@ class Function(Expression):
   return result
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+    if self.Equals(expression):
+      return Event(name)
+    lhs = self.lhs.Substitute(name, expression)
+    rhs = None
+    if self.rhs:
+      rhs = self.rhs.Substitute(name, expression)
+    return Function(self.fn, lhs, rhs)
+
 
 def _FixEscapes(s: str) -> str:
   s = re.sub(r'([^\\]),', r'\1\\,', s)
@@ -293,6 +322,9 @@ class Event(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Event) and self.name == other.name
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 class Constant(Expression):
   """A constant within the expression tree."""
@@ -317,6 +349,9 @@ class Constant(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Constant) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 class Literal(Expression):
   """A runtime literal within the expression tree."""
@@ -336,6 +371,9 @@ class Literal(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Literal) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 def min(lhs: Union[int, float, Expression], rhs: Union[int, float,
Expression]) -> 
Function:
@@ -461,6 +499,7 @@ class MetricGroup:
 
 
 class _RewriteIfExpToSelect(ast.NodeTransformer):
+  """Transformer to convert if-else nodes to Select expressions."""
 
   def visit_IfExp(self, node):
 # pylint: disable=invalid-name
@@ -498,7 +537,37 @@ def ParsePerfJson(orig: str) -> Expression:
   for kw in keywords:
 py = re.sub(rf'Event\(r"{kw}"\)', kw, py)
 
-  parsed = ast.parse(py, mode='eval')
+  try:
+parsed = ast.parse(py, mode='eval')
+  except SyntaxError as e:
+raise SyntaxError(f'Parsing expression:\n{orig}') from e
   _RewriteIfExpToSelect().visit(parsed)
   parsed = ast.fix_missing_locations(parsed)
   return _Constify(eval(compile(parsed, orig, 'eval')))
+
+
+def RewriteMetricsInTermsOfOthers(metrics: list[Tuple[str, Expression]]
+  )-> Dict[str, Expression]:
+  """Shorten metrics by rewriting in terms of others.
+
+  Args:
+metrics 

[PATCH v4 03/12] perf jevents: Rewrite metrics in the same file with each other

2023-01-25 Thread Ian Rogers
Rewrite metrics within the same file in terms of each other. For example, on
Power8 other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL
- PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL
- PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH /
PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi
- ntcg_flush_cpi - no_ntf_stall_cpi"
which more closely matches the definition on Power9.

To avoid recomputation, decorate read_json_events with a cache.
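
A small sketch of what the decorator provides, assuming a hypothetical read_events function standing in for the real reader. functools.lru_cache keys on the call arguments, so a repeated (path, topic) pair returns the stored result without running the body again:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def read_events(path: str, topic: str):
  """Hypothetical stand-in for an expensive parse of one JSON file."""
  return (path, topic)


first = read_events('cache.json', 'cache')
second = read_events('cache.json', 'cache')  # served from the cache
info = read_events.cache_info()
print(first is second, info.hits, info.misses)
```

One consequence worth keeping in mind: every caller receives the same cached object, so any in-place modification made after the first call is visible to later callers.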

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 0416b7442171..15a1671740cc 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -3,6 +3,7 @@
 """Convert directories of JSON events to C code."""
 import argparse
 import csv
+from functools import lru_cache
 import json
 import metric
 import os
@@ -337,18 +338,28 @@ class JsonEvent:
 s = self.build_c_string()
 return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
-
+@lru_cache(maxsize=None)
 def read_json_events(path: str, topic: str) -> Sequence[JsonEvent]:
   """Read json events from the specified file."""
-
   try:
-    result = json.load(open(path), object_hook=JsonEvent)
+    events = json.load(open(path), object_hook=JsonEvent)
   except BaseException as err:
     print(f"Exception processing {path}")
     raise
-  for event in result:
+  metrics: list[Tuple[str, metric.Expression]] = []
+  for event in events:
     event.topic = topic
-  return result
+    if event.metric_name and '-' not in event.metric_name:
+      metrics.append((event.metric_name, event.metric_expr))
+  updates = metric.RewriteMetricsInTermsOfOthers(metrics)
+  if updates:
+    for event in events:
+      if event.metric_name in updates:
+        # print(f'Updated {event.metric_name} from\n"{event.metric_expr}"\n'
+        #       f'to\n"{updates[event.metric_name]}"')
+        event.metric_expr = updates[event.metric_name]
+
+  return events
 
 def preprocess_arch_std_files(archpath: str) -> None:
   """Read in all architecture standard events."""
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v4 04/12] perf pmu-events: Add separate metric from pmu_event

2023-01-25 Thread Ian Rogers
Create a new struct pmu_metric to hold the metric related variables
from pmu_event; initially it is just a clone of pmu_event. Add
iterators for pmu_metric and use them in places where metrics are
wanted rather than events. Make the event iterator skip metric-only
entries, and the metric iterator skip event-only entries.
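
The skipping behaviour can be modelled as below. This is a Python sketch of the C loops under stated assumptions: Entry, for_each_event and for_each_metric are hypothetical names, and the table ends with an all-empty sentinel entry as in the C arrays.

```python
from typing import Callable, List, NamedTuple, Optional


class Entry(NamedTuple):
  """Hypothetical combined table entry; either field may be unset."""
  name: Optional[str] = None
  metric_expr: Optional[str] = None


def for_each_event(entries: List[Entry], fn: Callable[[Entry], int]) -> int:
  for e in entries:
    if not e.name and not e.metric_expr:
      break  # all-empty sentinel terminates the table
    if not e.name:
      continue  # metric-only entry, not an event
    ret = fn(e)
    if ret:
      return ret  # a non-zero return stops the iteration early
  return 0


def for_each_metric(entries: List[Entry], fn: Callable[[Entry], int]) -> int:
  for e in entries:
    if not e.name and not e.metric_expr:
      break
    if not e.metric_expr:
      continue  # event-only entry, not a metric
    ret = fn(e)
    if ret:
      return ret
  return 0


table = [Entry(name='l3_cache_rd'), Entry(metric_expr='1 / IPC'), Entry()]
events: List[str] = []
metrics: List[str] = []
for_each_event(table, lambda e: events.append(e.name) or 0)
for_each_metric(table, lambda e: metrics.append(e.metric_expr) or 0)
print(events, metrics)
```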

Reviewed-by: John Garry 
Signed-off-by: Ian Rogers 
---
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/pmu-events/empty-pmu-events.c |  49 ++-
 tools/perf/pmu-events/jevents.py |  62 -
 tools/perf/pmu-events/pmu-events.h   |  26 
 tools/perf/tests/pmu-events.c|  35 +++--
 tools/perf/util/metricgroup.c| 161 +++
 tools/perf/util/metricgroup.h|   2 +-
 7 files changed, 228 insertions(+), 111 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c 
b/tools/perf/arch/powerpc/util/header.c
index e8fe36b10d20..78eef77d8a8d 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -40,11 +40,11 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
return bufp;
 }
 
-int arch_get_runtimeparam(const struct pmu_event *pe)
+int arch_get_runtimeparam(const struct pmu_metric *pm)
 {
int count;
char path[PATH_MAX] = "/devices/hv_24x7/interface/";
 
-   atoi(pe->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, 
"coresperchip");
+   atoi(pm->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, 
"coresperchip");
return sysfs__read_int(path, &count) < 0 ? 1 : count;
 }
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 480e8f0d30c8..4e39d1a8d6d6 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -181,6 +181,11 @@ struct pmu_events_table {
const struct pmu_event *entries;
 };
 
+/* Struct used to make the PMU metric table implementation opaque to callers. 
*/
+struct pmu_metrics_table {
+   const struct pmu_metric *entries;
+};
+
 /*
  * Map a CPU to its table of PMU events. The CPU is identified by the
  * cpuid field, which is an arch-specific identifier for the CPU.
@@ -254,11 +259,29 @@ static const struct pmu_sys_events pmu_sys_event_tables[] 
= {
 int pmu_events_table_for_each_event(const struct pmu_events_table *table, 
pmu_event_iter_fn fn,
void *data)
 {
-   for (const struct pmu_event *pe = &table->entries[0];
-pe->name || pe->metric_group || pe->metric_name;
-pe++) {
-   int ret = fn(pe, table, data);
+   for (const struct pmu_event *pe = &table->entries[0]; pe->name || 
pe->metric_expr; pe++) {
+   int ret;
 
+   if (!pe->name)
+   continue;
+   ret = fn(pe, table, data);
+   if (ret)
+   return ret;
+   }
+   return 0;
+}
+
+int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, 
pmu_metric_iter_fn fn,
+void *data)
+{
+   struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
+
+   for (const struct pmu_metric *pm = &table->entries[0]; pm->name || 
pm->metric_expr; pm++) {
+   int ret;
+
+   if (!pm->metric_expr)
+   continue;
+   ret = fn(pm, etable, data);
if (ret)
return ret;
}
@@ -305,11 +328,22 @@ const struct pmu_events_table 
*find_core_events_table(const char *arch, const ch
 }
 
 int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data)
+{
+   for (const struct pmu_events_map *tables = &pmu_events_map[0]; 
tables->arch; tables++) {
+   int ret = pmu_events_table_for_each_event(&tables->table, fn, 
data);
+
+   if (ret)
+   return ret;
+   }
+   return 0;
+}
+
+int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data)
 {
for (const struct pmu_events_map *tables = &pmu_events_map[0];
 tables->arch;
 tables++) {
-   int ret = pmu_events_table_for_each_event(&tables->table, fn, 
data);
+   int ret = pmu_events_table_for_each_metric(&tables->table, fn, 
data);
 
if (ret)
return ret;
@@ -340,3 +374,8 @@ int pmu_for_each_sys_event(pmu_event_iter_fn fn, void *data)
}
return 0;
 }
+
+int pmu_for_each_sys_metric(pmu_metric_iter_fn fn __maybe_unused, void *data 
__maybe_unused)
+{
+   return 0;
+}
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 15a1671740cc..858787a12302 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -564,7 +564,

[PATCH v4 05/12] perf pmu-events: Separate the metrics from events for no jevents

2023-01-25 Thread Ian Rogers
Separate the event and metric tables when building without jevents. Add
find_core_metrics_table and perf_pmu__find_metrics_table while
renaming existing utilities to be event specific, so that users can
find the right table for their needs.

Reviewed-by: John Garry 
Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/empty-pmu-events.c | 88 ++--
 tools/perf/pmu-events/jevents.py |  7 +-
 tools/perf/pmu-events/pmu-events.h   |  4 +-
 tools/perf/tests/expand-cgroup.c |  2 +-
 tools/perf/tests/parse-metric.c  |  2 +-
 tools/perf/util/pmu.c|  4 +-
 6 files changed, 79 insertions(+), 28 deletions(-)

diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 4e39d1a8d6d6..10bd4943ebf8 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-static const struct pmu_event pme_test_soc_cpu[] = {
+static const struct pmu_event pmu_events__test_soc_cpu[] = {
{
.name = "l3_cache_rd",
.event = "event=0x40",
@@ -105,6 +105,14 @@ static const struct pmu_event pme_test_soc_cpu[] = {
.desc = "L2 BTB Correction",
.topic = "branch",
},
+   {
+   .name = 0,
+   .event = 0,
+   .desc = 0,
+   },
+};
+
+static const struct pmu_metric pmu_metrics__test_soc_cpu[] = {
{
.metric_expr= "1 / IPC",
.metric_name= "CPI",
@@ -170,9 +178,8 @@ static const struct pmu_event pme_test_soc_cpu[] = {
.metric_name= "L1D_Cache_Fill_BW",
},
{
-   .name = 0,
-   .event = 0,
-   .desc = 0,
+   .metric_expr = 0,
+   .metric_name = 0,
},
 };
 
@@ -197,7 +204,8 @@ struct pmu_metrics_table {
 struct pmu_events_map {
const char *arch;
const char *cpuid;
-   const struct pmu_events_table table;
+   const struct pmu_events_table event_table;
+   const struct pmu_metrics_table metric_table;
 };
 
 /*
@@ -208,12 +216,14 @@ static const struct pmu_events_map pmu_events_map[] = {
{
.arch = "testarch",
.cpuid = "testcpu",
-   .table = { pme_test_soc_cpu },
+   .event_table = { pmu_events__test_soc_cpu },
+   .metric_table = { pmu_metrics__test_soc_cpu },
},
{
.arch = 0,
.cpuid = 0,
-   .table = { 0 },
+   .event_table = { 0 },
+   .metric_table = { 0 },
},
 };
 
@@ -259,12 +269,9 @@ static const struct pmu_sys_events pmu_sys_event_tables[] 
= {
 int pmu_events_table_for_each_event(const struct pmu_events_table *table, 
pmu_event_iter_fn fn,
void *data)
 {
-   for (const struct pmu_event *pe = &table->entries[0]; pe->name || 
pe->metric_expr; pe++) {
-   int ret;
+   for (const struct pmu_event *pe = &table->entries[0]; pe->name; pe++) {
+   int ret = fn(pe, table, data);
 
-   if (!pe->name)
-   continue;
-   ret = fn(pe, table, data);
if (ret)
return ret;
}
@@ -276,19 +283,44 @@ int pmu_events_table_for_each_metric(const struct 
pmu_events_table *etable, pmu_
 {
struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
 
-   for (const struct pmu_metric *pm = &table->entries[0]; pm->name || 
pm->metric_expr; pm++) {
-   int ret;
+   for (const struct pmu_metric *pm = &table->entries[0]; pm->metric_expr; 
pm++) {
+   int ret = fn(pm, etable, data);
 
-   if (!pm->metric_expr)
-   continue;
-   ret = fn(pm, etable, data);
if (ret)
return ret;
}
return 0;
 }
 
-const struct pmu_events_table *perf_pmu__find_table(struct perf_pmu *pmu)
+const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu 
*pmu)
+{
+   const struct pmu_events_table *table = NULL;
+   char *cpuid = perf_pmu__getcpuid(pmu);
+   int i;
+
+   /* on some platforms which uses cpus map, cpuid can be NULL for
+* PMUs other than CORE PMUs.
+*/
+   if (!cpuid)
+   return NULL;
+
+   i = 0;
+   for (;;) {
+   const struct pmu_events_map *map = &pmu_events_map[i++];
+
+   if (!map->cpuid)
+   break;
+
+   if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
+   table = &map->event_table;
+   break;
+   }
+ 

[PATCH v4 06/12] perf pmu-events: Remove now unused event and metric variables

2023-01-25 Thread Ian Rogers
Previous changes separated the uses of pmu_event and pmu_metric,
however, both structures contained all the variables of event and
metric. This change removes the event variables from metric and the
metric variables from event.

Note, this change removes the setting of evsel's metric_name/expr as
these fields are no longer part of struct pmu_event. The metric
remains but is no longer implicitly requested when the event is. This
impacts a few Intel uncore events, however, as the ScaleUnit is shared
by the event and the metric this utility is questionable. Also the
MetricNames look broken (contain spaces) in some cases and when trying
to use the functionality with '-e' the metrics fail but regular
metrics with '-M' work. For example, on SkylakeX '-M' works:

```
$ perf stat -M LLC_MISSES.PCIE_WRITE -a sleep 1

 Performance counter stats for 'system wide':

 0  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 #  57896.0 
Bytes  LLC_MISSES.PCIE_WRITE  (49.84%)
 7,174  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 
   (49.85%)
 0  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 
   (50.16%)
63  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 
   (50.15%)

   1.004576381 seconds time elapsed
```

whilst the event '-e' version is broken even with --group/-g (fwiw, we
should also remove -g [1]):

```
$ perf stat -g -e LLC_MISSES.PCIE_WRITE -g -a sleep 1
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE

 Performance counter stats for 'system wide':

27,316 Bytes LLC_MISSES.PCIE_WRITE

   1.004505469 seconds time elapsed
```

The code also carries warnings where the user is supposed to select
events for metrics [2] but, given the lack of use of such a feature,
let's clean up the code and just remove them.

[1] https://lore.kernel.org/lkml/20220707195610.303254-1-irog...@google.com/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/stat-shadow.c?id=01b8957b738f42f96a130079bc951b3cc78c5b8a#n425

Reviewed-by: John Garry 
Signed-off-by: Ian Rogers 
---
 tools/perf/builtin-list.c  | 20 ++---
 tools/perf/pmu-events/jevent

[PATCH v4 07/12] perf stat: Remove evsel metric_name/expr

2023-01-25 Thread Ian Rogers
Metrics are their own unit. These variables previously held broken
metrics and now just hold the value NULL. Remove the code that used
these variables.

Reviewed-by: John Garry 
Signed-off-by: Ian Rogers 
---
 tools/perf/builtin-stat.c |   1 -
 tools/perf/util/cgroup.c  |   1 -
 tools/perf/util/evsel.c   |   2 -
 tools/perf/util/evsel.h   |   2 -
 tools/perf/util/python.c  |   7 ---
 tools/perf/util/stat-shadow.c | 112 --
 tools/perf/util/stat.h|   1 -
 7 files changed, 126 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9f3e4b257516..5d18a5a6f662 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2524,7 +2524,6 @@ int cmd_stat(int argc, const char **argv)
&stat_config.metric_events);
zfree(&metrics);
}
-   perf_stat__collect_metric_expr(evsel_list);
perf_stat__init_shadow_stats();
 
if (add_default_attributes())
diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index cd978c240e0d..bfb13306d82c 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -481,7 +481,6 @@ int evlist__expand_cgroup(struct evlist *evlist, const char 
*str,
nr_cgroups++;
 
if (metric_events) {
-   perf_stat__collect_metric_expr(tmp_list);
if (metricgroup__copy_metric_events(tmp_list, cgrp,
metric_events,

&orig_metric_events) < 0)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8550638587e5..a90e998826e0 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -285,8 +285,6 @@ void evsel__init(struct evsel *evsel,
evsel->sample_size = __evsel__sample_size(attr->sample_type);
evsel__calc_id_pos(evsel);
evsel->cmdline_group_boundary = false;
-   evsel->metric_expr   = NULL;
-   evsel->metric_name   = NULL;
evsel->metric_events = NULL;
evsel->per_pkg_mask  = NULL;
evsel->collect_stat  = false;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d572be41b960..24cb807ef6ce 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -105,8 +105,6 @@ struct evsel {
 * metric fields are similar, but needs more care as they can have
 * references to other metric (evsel).
 */
-   const char *metric_expr;
-   const char *metric_name;
struct evsel**metric_events;
struct evsel*metric_leader;
 
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 9e5d881b0987..42e8b813d010 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -76,13 +76,6 @@ const char *perf_env__arch(struct perf_env *env 
__maybe_unused)
return NULL;
 }
 
-/*
- * Add this one here not to drag util/stat-shadow.c
- */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-}
-
 /*
  * These ones are needed not to drag the PMU bandwagon, jevents generated
  * pmu_sys_event_tables, etc and evsel__find_pmu() is used so far just for
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index cadb2df23c87..35ea4813f468 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -346,114 +346,6 @@ static const char *get_ratio_color(enum grc_type type, 
double ratio)
return color;
 }
 
-static struct evsel *perf_stat__find_event(struct evlist *evsel_list,
-   const char *name)
-{
-   struct evsel *c2;
-
-   evlist__for_each_entry (evsel_list, c2) {
-   if (!strcasecmp(c2->name, name) && !c2->collect_stat)
-   return c2;
-   }
-   return NULL;
-}
-
-/* Mark MetricExpr target events and link events using them to them. */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-   struct evsel *counter, *leader, **metric_events, *oc;
-   bool found;
-   struct expr_parse_ctx *ctx;
-   struct hashmap_entry *cur;
-   size_t bkt;
-   int i;
-
-   ctx = expr__ctx_new();
-   if (!ctx) {
-   pr_debug("expr__ctx_new failed");
-   return;
-   }
-   evlist__for_each_entry(evsel_list, counter) {
-   bool invalid = false;
-
-   leader = evsel__leader(counter);
-   if (!counter->metric_expr)
-   continue;
-
-   expr__ctx_clear(ctx);
-   metric_events = counter->metric_events;
-   if (!metric_events) {
-   if (expr__find_ids(counter->metric_expr,
-

[PATCH v4 08/12] perf jevents: Combine table prefix and suffix writing

2023-01-25 Thread Ian Rogers
Combine the table prefix and suffix writing into a single function.
This simplifies, in a later change, writing metrics separately.
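
A minimal sketch of the combined approach, assuming a hypothetical TableWriter class rather than the module-level globals jevents.py uses. Entries are buffered, and the opening line, sorted entries and closing line are all written in one place, so no separate "table is open" flag is needed:

```python
import io
from typing import List, Optional, TextIO


class TableWriter:
  """Hypothetical helper that buffers entries and writes a table at once."""

  def __init__(self, out: TextIO):
    self.out = out
    self.pending: List[str] = []
    self.tblname: Optional[str] = None

  def start_table(self, tblname: str) -> None:
    self.flush()  # emit whatever the previous table collected
    self.tblname = tblname

  def add(self, entry: str) -> None:
    self.pending.append(entry)

  def flush(self) -> None:
    if not self.pending:
      return  # nothing buffered, so no table is written
    self.out.write(
        f'static const struct compact_pmu_event {self.tblname}[] = {{\n')
    for entry in sorted(self.pending):
      self.out.write(f'{entry},\n')
    self.pending = []
    self.out.write('};\n\n')


out = io.StringIO()
writer = TableWriter(out)
writer.start_table('t1')
writer.add('b')
writer.add('a')
writer.start_table('t2')  # flushes t1
writer.flush()            # t2 is empty, so nothing more is written
print(out.getvalue())
```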

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 36 +---
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 4cdbf34b7298..5f8d490c7269 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -19,10 +19,10 @@ _sys_event_tables = []
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
-# Track whether an events table is currently being defined and needs closing.
-_close_table = False
 # Events to write out when the table is closed
 _pending_events = []
+# Name of table to be written out
+_pending_events_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
@@ -378,24 +378,13 @@ def preprocess_arch_std_files(archpath: str) -> None:
   _arch_std_events[event.metric_name.lower()] = event
 
 
-def print_events_table_prefix(tblname: str) -> None:
-  """Called when a new events table is started."""
-  global _close_table
-  if _close_table:
-raise IOError('Printing table prefix but last table has no suffix')
-  _args.output_file.write(f'static const struct compact_pmu_event {tblname}[] 
= {{\n')
-  _close_table = True
-
-
 def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
   """Add contents of file to _pending_events table."""
-  if not _close_table:
-raise IOError('Table entries missing prefix')
   for e in read_json_events(item.path, topic):
 _pending_events.append(e)
 
 
-def print_events_table_suffix() -> None:
+def print_pending_events() -> None:
   """Optionally close events table."""
 
   def event_cmp_key(j: JsonEvent) -> Tuple[bool, str, str, str, str]:
@@ -407,17 +396,19 @@ def print_events_table_suffix() -> None:
 return (j.desc is not None, fix_none(j.topic), fix_none(j.name), 
fix_none(j.pmu),
 fix_none(j.metric_name))
 
-  global _close_table
-  if not _close_table:
+  global _pending_events
+  if not _pending_events:
 return
 
-  global _pending_events
+  global _pending_events_tblname
+  _args.output_file.write(
+  f'static const struct compact_pmu_event {_pending_events_tblname}[] = 
{{\n')
+
   for event in sorted(_pending_events, key=event_cmp_key):
 _args.output_file.write(event.to_c_string())
-_pending_events = []
+  _pending_events = []
 
   _args.output_file.write('};\n\n')
-  _close_table = False
 
 def get_topic(topic: str) -> str:
   if topic.endswith('metrics.json'):
@@ -455,12 +446,13 @@ def process_one_file(parents: Sequence[str], item: 
os.DirEntry) -> None:
 
   # model directory, reset topic
   if item.is_dir() and is_leaf_dir(item.path):
-print_events_table_suffix()
+print_pending_events()
 
 tblname = file_name_to_table_name(parents, item.name)
 if item.name == 'sys':
   _sys_event_tables.append(tblname)
-print_events_table_prefix(tblname)
+global _pending_events_tblname
+_pending_events_tblname = tblname
 return
 
   # base dir or too deep
@@ -809,7 +801,7 @@ struct compact_pmu_event {
   for arch in archs:
 arch_path = f'{_args.starting_dir}/{arch}'
 ftw(arch_path, [], process_one_file)
-print_events_table_suffix()
+print_pending_events()
 
   print_mapping_table(archs)
   print_system_mapping_table()
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v4 09/12] perf pmu-events: Introduce pmu_metrics_table

2023-01-25 Thread Ian Rogers
Add a metrics table that is just a cast from pmu_events_table. This
changes the APIs so that event and metric usage of the underlying
table is different. For the no jevents case the tables are already
separate; later changes will separate the tables for the jevents case.

Signed-off-by: Ian Rogers 
---
 tools/perf/arch/arm64/util/pmu.c | 11 -
 tools/perf/pmu-events/empty-pmu-events.c | 21 -
 tools/perf/pmu-events/jevents.py | 21 ++---
 tools/perf/pmu-events/pmu-events.h   | 10 +++--
 tools/perf/tests/expand-cgroup.c |  2 +-
 tools/perf/tests/parse-metric.c  |  2 +-
 tools/perf/tests/pmu-events.c|  5 ++-
 tools/perf/util/metricgroup.c| 54 
 tools/perf/util/metricgroup.h|  2 +-
 tools/perf/util/pmu.c|  5 +++
 tools/perf/util/pmu.h|  1 +
 11 files changed, 78 insertions(+), 56 deletions(-)

diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
index 801bf52e2ea6..2779840d8896 100644
--- a/tools/perf/arch/arm64/util/pmu.c
+++ b/tools/perf/arch/arm64/util/pmu.c
@@ -22,7 +22,14 @@ static struct perf_pmu *pmu__find_core_pmu(void)
return NULL;
 
return pmu;
-   }
+}
+
+const struct pmu_metrics_table *pmu_metrics_table__find(void)
+{
+   struct perf_pmu *pmu = pmu__find_core_pmu();
+
+   if (pmu)
+   return perf_pmu__find_metrics_table(pmu);
 
return NULL;
 }
@@ -32,7 +39,7 @@ const struct pmu_events_table *pmu_events_table__find(void)
struct perf_pmu *pmu = pmu__find_core_pmu();
 
if (pmu)
-   return perf_pmu__find_table(pmu);
+   return perf_pmu__find_events_table(pmu);
 
return NULL;
 }
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 10bd4943ebf8..a938b74cf487 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -278,13 +278,11 @@ int pmu_events_table_for_each_event(const struct 
pmu_events_table *table, pmu_ev
return 0;
 }
 
-int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, 
pmu_metric_iter_fn fn,
-void *data)
+int pmu_metrics_table_for_each_metric(const struct pmu_metrics_table *table, 
pmu_metric_iter_fn fn,
+ void *data)
 {
-   struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
-
for (const struct pmu_metric *pm = &table->entries[0]; pm->metric_expr; 
pm++) {
-   int ret = fn(pm, etable, data);
+   int ret = fn(pm, table, data);
 
if (ret)
return ret;
@@ -320,9 +318,9 @@ const struct pmu_events_table 
*perf_pmu__find_events_table(struct perf_pmu *pmu)
return table;
 }
 
-const struct pmu_events_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
+const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
 {
-   const struct pmu_events_table *table = NULL;
+   const struct pmu_metrics_table *table = NULL;
char *cpuid = perf_pmu__getcpuid(pmu);
int i;
 
@@ -340,7 +338,7 @@ const struct pmu_events_table 
*perf_pmu__find_metrics_table(struct perf_pmu *pmu
break;
 
if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
-   table = (const struct pmu_events_table 
*)&map->metric_table;
+   table = &map->metric_table;
break;
}
}
@@ -359,13 +357,13 @@ const struct pmu_events_table 
*find_core_events_table(const char *arch, const ch
return NULL;
 }
 
-const struct pmu_events_table *find_core_metrics_table(const char *arch, const 
char *cpuid)
+const struct pmu_metrics_table *find_core_metrics_table(const char *arch, 
const char *cpuid)
 {
for (const struct pmu_events_map *tables = &pmu_events_map[0];
 tables->arch;
 tables++) {
if (!strcmp(tables->arch, arch) && 
!strcmp_cpuid_str(tables->cpuid, cpuid))
-   return (const struct pmu_events_table 
*)&tables->metric_table;
+   return &tables->metric_table;
}
return NULL;
 }
@@ -386,8 +384,7 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void 
*data)
for (const struct pmu_events_map *tables = &pmu_events_map[0];
 tables->arch;
 tables++) {
-   int ret = pmu_events_table_for_each_metric(
-   (const struct pmu_events_table *)&tables->metric_table, 
fn, data);
+   int ret = 
pmu_metrics_table_for_each_metric(&tables->metric_table, fn, data);
 
if (ret)
return ret;
diff --git a/too

[PATCH v4 10/12] perf jevents: Generate metrics and events as separate tables

2023-01-25 Thread Ian Rogers
Turn a perf json event into an event, metric or both. This reduces the
number of events needed to scan to find an event or metric. As events
no longer need the relatively seldom used metric fields, 4 bytes is
saved per event. This reduces the big C string's size by 335kb (14.8%)
on x86.

Note, for the test PMU architecture pme_test_soc_cpu is renamed
pmu_events__test_soc_cpu for consistency with the event vs metric
naming convention.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 244 +++
 tools/perf/tests/pmu-events.c|   3 +-
 2 files changed, 189 insertions(+), 58 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index d83cc94af51f..627ee817f57f 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -13,28 +13,40 @@ import collections
 
 # Global command line arguments.
 _args = None
+# List of regular event tables.
+_event_tables = []
 # List of event tables generated from "/sys" directories.
 _sys_event_tables = []
+# List of regular metric tables.
+_metric_tables = []
+# List of metric tables generated from "/sys" directories.
+_sys_metric_tables = []
+# Mapping between sys event table names and sys metric table names.
+_sys_event_table_to_metric_table_mapping = {}
 # Map from an event name to an architecture standard
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
 # Events to write out when the table is closed
 _pending_events = []
-# Name of table to be written out
+# Name of events table to be written out
 _pending_events_tblname = None
+# Metrics to write out when the table is closed
+_pending_metrics = []
+# Name of metrics table to be written out
+_pending_metrics_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
 _json_event_attributes = [
 # cmp_sevent related attributes.
-'name', 'pmu', 'topic', 'desc', 'metric_name', 'metric_group',
+'name', 'pmu', 'topic', 'desc',
 # Seems useful, put it early.
 'event',
 # Short things in alphabetical order.
 'aggr_mode', 'compat', 'deprecated', 'perpkg', 'unit',
 # Longer things (the last won't be iterated over during decompress).
-'metric_constraint', 'metric_expr', 'long_desc'
+'long_desc'
 ]
 
 # Attributes that are in pmu_metric rather than pmu_event.
@@ -52,14 +64,16 @@ def removesuffix(s: str, suffix: str) -> str:
   return s[0:-len(suffix)] if s.endswith(suffix) else s
 
 
-def file_name_to_table_name(parents: Sequence[str], dirname: str) -> str:
+def file_name_to_table_name(prefix: str, parents: Sequence[str],
+dirname: str) -> str:
   """Generate a C table name from directory names."""
-  tblname = 'pme'
+  tblname = prefix
   for p in parents:
 tblname += '_' + p
   tblname += '_' + dirname
   return tblname.replace('-', '_')
 
+
 def c_len(s: str) -> int:
   """Return the length of s a C string
 
@@ -277,7 +291,7 @@ class JsonEvent:
 self.metric_constraint = jd.get('MetricConstraint')
 self.metric_expr = None
 if 'MetricExpr' in jd:
-   self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
+  self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
 
 arch_std = jd.get('ArchStdEvent')
 if precise and self.desc and '(Precise Event)' not in self.desc:
@@ -326,23 +340,24 @@ class JsonEvent:
 s += f'\t{attr} = {value},\n'
 return s + '}'
 
-  def build_c_string(self) -> str:
+  def build_c_string(self, metric: bool) -> str:
 s = ''
-for attr in _json_event_attributes:
+for attr in _json_metric_attributes if metric else _json_event_attributes:
   x = getattr(self, attr)
-  if x and attr == 'metric_expr':
+  if metric and x and attr == 'metric_expr':
 # Convert parsed metric expressions into a string. Slashes
 # must be doubled in the file.
 x = x.ToPerfJson().replace('\\', '')
   s += f'{x}\\000' if x else '\\000'
 return s
 
-  def to_c_string(self) -> str:
+  def to_c_string(self, metric: bool) -> str:
 """Representation of the event as a C struct initializer."""
 
-s = self.build_c_string()
+s = self.build_c_string(metric)
 return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
+
 @lru_cache(

[PATCH v4 11/12] perf jevents: Add model list option

2023-01-25 Thread Ian Rogers
This allows the set of generated jevents events and metrics to be limited
to a subset of the model names. This is appropriate when trying to minimize
the binary size and only a known set of models is possible.
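
As a rough illustration of the directory filter this patch adds to ftw(),
here is a standalone sketch. The helper name keep_model_dir and its
signature are made up for this example; only the comparison logic follows
the diff below.

```python
from typing import List


def keep_model_dir(model_arg: str, parents: List[str], name: str) -> bool:
  """Return True if the directory parents/name should still be walked."""
  if model_arg == 'all':
    return True
  models = model_arg.split(',')
  # Only filter at the directory depth that the model names refer to, so
  # "arm/cortex-a34" filters one level below the implementor directory.
  if len(parents) != models[0].count('/'):
    return True
  item_path = '/'.join(parents + [name])
  # Test directories are always kept so the perf tests keep working.
  return 'test' in item_path or item_path in models
```

With a model argument of "skylake" a top-level "haswell" directory would be
skipped, while "arm/cortex-a34" leaves the "arm" directory itself alone and
only filters its children.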

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/Build  |  3 ++-
 tools/perf/pmu-events/jevents.py | 14 ++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index 15b9e8fdbffa..a14de24ecb69 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -10,6 +10,7 @@ JEVENTS_PY=  pmu-events/jevents.py
 ifeq ($(JEVENTS_ARCH),)
 JEVENTS_ARCH=$(SRCARCH)
 endif
+JEVENTS_MODEL ?= all
 
 #
 # Locate/process JSON files in pmu-events/arch/
@@ -23,5 +24,5 @@ $(OUTPUT)pmu-events/pmu-events.c: 
pmu-events/empty-pmu-events.c
 else
 $(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JSON_TEST) $(JEVENTS_PY) 
pmu-events/metric.py
$(call rule_mkdir)
-   $(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) 
pmu-events/arch $@
+   $(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) 
$(JEVENTS_MODEL) pmu-events/arch $@
 endif
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 627ee817f57f..2bcd07ce609f 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -599,6 +599,8 @@ const struct pmu_events_map pmu_events_map[] = {
 else:
   metric_tblname = 'NULL'
   metric_size = '0'
+if event_size == '0' and metric_size == '0':
+  continue
 cpuid = row[0].replace('\\', '')
 _args.output_file.write(f"""{{
 \t.arch = "{arch}",
@@ -888,12 +890,24 @@ def main() -> None:
   action: Callable[[Sequence[str], os.DirEntry], None]) -> None:
 """Replicate the directory/file walking behavior of C's file tree walk."""
 for item in os.scandir(path):
+  if _args.model != 'all' and item.is_dir():
+# Check if the model matches one in _args.model.
+if len(parents) == _args.model.split(',')[0].count('/'):
+  # We're testing the correct directory.
+  item_path = '/'.join(parents) + ('/' if len(parents) > 0 else '') + 
item.name
+  if 'test' not in item_path and item_path not in 
_args.model.split(','):
+continue
   action(parents, item)
   if item.is_dir():
 ftw(item.path, parents + [item.name], action)
 
   ap = argparse.ArgumentParser()
   ap.add_argument('arch', help='Architecture name like x86')
+  ap.add_argument('model', help='''Select a model such as skylake to
+reduce the code size.  Normally set to "all". For architectures like
+ARM64 with an implementor/model, the model must include the implementor
+such as "arm/cortex-a34".''',
+  default='all')
   ap.add_argument(
   'starting_dir',
   type=dir_path,
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v4 12/12] perf pmu-events: Fix testing with JEVENTS_ARCH=all

2023-01-25 Thread Ian Rogers
The #slots literal returns NAN when not on ARM64, which causes a perf
test failure on other architectures for a JEVENTS_ARCH=all build:
..
 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
..
Add an is_test boolean so that the failure can be avoided when running
as a test.
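
The intended behavior can be sketched outside the lexer as follows. This
is an illustration in Python rather than the C in expr.l, and the name
resolve_literal is hypothetical; the placeholder value 1 is taken from the
diff below.

```python
import math


def resolve_literal(value: float, is_test: bool) -> float:
  """Mimic the NaN handling: an error normally, a placeholder in tests."""
  if math.isnan(value):
    if not is_test:
      # Corresponds to returning EXPR_ERROR from the scanner.
      raise ValueError('literal has no value on this architecture')
    # In test mode substitute 1 so that parsing can continue.
    return 1.0
  return value
```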

Fixes: acef233b7ca7 ("perf pmu: Add #slots literal support for arm64")
Signed-off-by: Ian Rogers 
---
 tools/perf/tests/pmu-events.c | 1 +
 tools/perf/util/expr.h| 1 +
 tools/perf/util/expr.l| 8 +---
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/tests/pmu-events.c b/tools/perf/tests/pmu-events.c
index 962c3c0d53ba..accf44b3d968 100644
--- a/tools/perf/tests/pmu-events.c
+++ b/tools/perf/tests/pmu-events.c
@@ -950,6 +950,7 @@ static int metric_parse_fake(const char *metric_name, const 
char *str)
pr_debug("expr__ctx_new failed");
return TEST_FAIL;
}
+   ctx->sctx.is_test = true;
if (expr__find_ids(str, NULL, ctx) < 0) {
pr_err("expr__find_ids failed\n");
return -1;
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index 029271540fb0..eaa44b24c555 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -9,6 +9,7 @@ struct expr_scanner_ctx {
char *user_requested_cpu_list;
int runtime;
bool system_wide;
+   bool is_test;
 };
 
 struct expr_parse_ctx {
diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
index 0168a9637330..72ff4f3d6d4b 100644
--- a/tools/perf/util/expr.l
+++ b/tools/perf/util/expr.l
@@ -84,9 +84,11 @@ static int literal(yyscan_t scanner, const struct 
expr_scanner_ctx *sctx)
YYSTYPE *yylval = expr_get_lval(scanner);
 
yylval->num = expr__get_literal(expr_get_text(scanner), sctx);
-   if (isnan(yylval->num))
-   return EXPR_ERROR;
-
+   if (isnan(yylval->num)) {
+   if (!sctx->is_test)
+   return EXPR_ERROR;
+   yylval->num = 1;
+   }
return LITERAL;
 }
 %}
-- 
2.39.1.456.gfc5497dd1b-goog



Re: [PATCH v4 02/12] perf jevents metric: Add ability to rewrite metrics in terms of others

2023-01-26 Thread Ian Rogers
On Thu, Jan 26, 2023 at 7:59 AM John Garry  wrote:
>
> On 26/01/2023 01:18, Ian Rogers wrote:
> > Add RewriteMetricsInTermsOfOthers that iterates over pairs of names
> > and expressions trying to replace an expression, within the current
> > expression, with its name.
> >
> > Signed-off-by: Ian Rogers 
>
> hmmm ... did you test this for many python versions?
>
> Maybe this patch causes this error:
>
> Traceback (most recent call last):
>   File "pmu-events/jevents.py", line 7, in 
> import metric
>   File "/home/john/acme/tools/perf/pmu-events/metric.py", line 549, in
> 
> def RewriteMetricsInTermsOfOthers(metrics: list[Tuple[str, Expression]]
> TypeError: 'type' object is not subscriptable
> make[3]: *** [pmu-events/Build:26: pmu-events/pmu-events.c] Error 1
> make[2]: *** [Makefile.perf:676: pmu-events/pmu-events-in.o] Error 2
> make[2]: *** Waiting for unfinished jobs
>
> I have python 3.6.15
>
> Thanks,
> John

Apologies, I have to test python3.6 with docker, and so I skip that when
I think the change is small enough. My error, will spin v5.
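
For reference, the traceback comes from subscripting the builtin list in a
function signature, which only works from Python 3.9 onward. A minimal
sketch of the portable spelling, with hypothetical names (MetricPairs,
metric_names) and plain strings standing in for Expression objects:

```python
from typing import List, Tuple

# typing.List is subscriptable on Python 3.6, unlike the builtin list.
MetricPairs = List[Tuple[str, str]]


def metric_names(metrics: MetricPairs) -> List[str]:
  """Return just the names from (name, expression) pairs."""
  return [name for name, _ in metrics]
```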

Thanks,
Ian

>


[PATCH v5 00/15] jevents/pmu-events improvements

2023-01-26 Thread Ian Rogers
Add an optimization to jevents using the metric code, rewrite metrics
in terms of each other in order to minimize size and improve
readability. For example, on Power8
other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / 
PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi 
- ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.

A limitation of the substitutions is that they depend on strict
equality and the shape of the tree. This means that for "a + b + c"
then a substitution of "a + b" will succeed while "b + c" will fail
(the LHS for "+ c" is "a + b" not just "b").
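
The tree-shape limitation can be reproduced with a tiny stand-in for the
expression tree. The tuple representation and the substitute helper here
are assumptions made for illustration, not the metric.py classes:

```python
from typing import Tuple, Union

# Hypothetical miniature tree: a leaf is a str, an inner node is
# (operator, lhs, rhs).
Node = Union[str, Tuple[str, 'Node', 'Node']]


def substitute(tree: Node, name: str, pattern: Node) -> Node:
  """Replace every subtree strictly equal to pattern with name."""
  if tree == pattern:
    return name
  if isinstance(tree, str):
    return tree
  op, lhs, rhs = tree
  return (op, substitute(lhs, name, pattern), substitute(rhs, name, pattern))


# "a + b + c" parses left-associatively as ((a + b) + c).
expr = ('+', ('+', 'a', 'b'), 'c')
ab = substitute(expr, 'm', ('+', 'a', 'b'))  # matches the left subtree
bc = substitute(expr, 'm', ('+', 'b', 'c'))  # no subtree has this shape
```

Here ab becomes ('+', 'm', 'c') while bc is left identical to expr, because
"b + c" never appears as a node of the parsed tree.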

Separate out the events and metrics in the pmu-events tables, saving
14.8% in the table size, so that metrics no longer need to
iterate over all events and vice versa. These changes remove evsel's
direct metric support as the pmu_event no longer has a metric to
populate it. This is a minor issue as the code wasn't working
properly, metrics for this are rare and can still be run properly
using '-M'.

Add an ability to build just certain models into the jevents generated
pmu-events.c code. This functionality is appropriate for operating
systems like ChromeOS, that aim to minimize binary size and know all
the target CPU models.

v5. s/list/List/ in a type annotation to fix Python 3.6 as reported by
John Garry. Fix a bug in metric_test.py
where a bad character was imported. To avoid similar regressions,
run metric_test.py before generating pmu-events.c.
v4. Better support the implementor/model style --model argument for
jevents.py. Add #slots test fix. On some patches add reviewed-by
John Garry and Kajol Jain.
v3. Rebase and incorporate review comments from John Garry,
in particular breaking apart patch 4
into 3 patches. The no jevents breakage and then later fix is
avoided in this series too.
v2. Rebase. Modify the code that skips rewriting a metric in terms of
itself, to make the name check case insensitive.

Ian Rogers (15):
  perf jevents metric: Correct Function equality
  perf jevents metric: Add ability to rewrite metrics in terms of others
  perf jevents: Rewrite metrics in the same file with each other
  perf pmu-events: Add separate metric from pmu_event
  perf pmu-events: Separate the metrics from events for no jevents
  perf pmu-events: Remove now unused event and metric variables
  perf stat: Remove evsel metric_name/expr
  perf jevents: Combine table prefix and suffix writing
  perf pmu-events: Introduce pmu_metrics_table
  perf jevents: Generate metrics and events as separate tables
  perf jevents: Add model list option
  perf pmu-events: Fix testing with JEVENTS_ARCH=all
  perf jevents: Correct bad character encoding
  tools build: Add test echo-cmd
  perf jevents: Run metric_test.py at compile-time

 tools/build/Makefile.build   |   1 +
 tools/perf/arch/arm64/util/pmu.c |  11 +-
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/builtin-list.c|  20 +-
 tools/perf/builtin-stat.c|   1 -
 tools/perf/pmu-events/Build  |  16 +-
 tools/perf/pmu-events/empty-pmu-events.c | 108 ++-
 tools/perf/pmu-events/jevents.py | 357 +++
 tools/perf/pmu-events/metric.py  |  79 -
 tools/perf/pmu-events/metric_test.py |  15 +-
 tools/perf/pmu-events/pmu-events.h   |  26 +-
 tools/perf/tests/expand-cgroup.c |   4 +-
 tools/perf/tests/parse-metric.c  |   4 +-
 tools/perf/tests/pmu-events.c|  69 ++---
 tools/perf/util/cgroup.c |   1 -
 tools/perf/util/evsel.c  |   2 -
 tools/perf/util/evsel.h  |   2 -
 tools/perf/util/expr.h   |   1 +
 tools/perf/util/expr.l   |   8 +-
 tools/perf/util/metricgroup.c| 207 +++--
 tools/perf/util/metricgroup.h|   4 +-
 tools/perf/util/parse-events.c   |   2 -
 tools/perf/util/pmu.c|  44 +--
 tools/perf/util/pmu.h|  10 +-
 tools/perf/util/print-events.c   |  32 +-
 tools/perf/util/print-events.h   |   3 +-
 tools/perf/util/python.c |   7 -
 tools/perf/util/stat-shadow.c| 112 ---
 tools/perf/util/stat.h   |   1 -
 29 files changed, 681 insertions(+), 470 deletions(-)
 mode change 100644 => 100755 tools/perf/pmu-events/metric_test.py

-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v5 01/15] perf jevents metric: Correct Function equality

2023-01-26 Thread Ian Rogers
rhs may not be defined, say for source_count, so add a guard.
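
A standalone sketch of the guarded comparison follows. Leaf and Func are
hypothetical stand-ins for the metric.py classes, kept only as large as
needed to show the optional second argument:

```python
from typing import Optional


class Leaf:
  """Hypothetical stand-in for an expression operand."""

  def __init__(self, name: str):
    self.name = name

  def Equals(self, other: object) -> bool:
    return isinstance(other, Leaf) and self.name == other.name


class Func:
  """Hypothetical function node whose second argument is optional."""

  def __init__(self, fn: str, lhs: Leaf, rhs: Optional[Leaf] = None):
    self.fn, self.lhs, self.rhs = fn, lhs, rhs

  def Equals(self, other: object) -> bool:
    if not isinstance(other, Func):
      return False
    result = self.fn == other.fn and self.lhs.Equals(other.lhs)
    if self.rhs:
      # Only compare rhs when this node actually has one.
      result = result and self.rhs.Equals(other.rhs)
    return result
```

Without the guard, a single-argument function such as source_count would
call Equals on a missing rhs.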

Reviewed-by: Kajol Jain
---
 tools/perf/pmu-events/metric.py | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 4797ed4fd817..2f2fd220e843 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -261,8 +261,10 @@ class Function(Expression):
 
   def Equals(self, other: Expression) -> bool:
 if isinstance(other, Function):
-  return self.fn == other.fn and self.lhs.Equals(
-  other.lhs) and self.rhs.Equals(other.rhs)
+  result = self.fn == other.fn and self.lhs.Equals(other.lhs)
+  if self.rhs:
+result = result and self.rhs.Equals(other.rhs)
+  return result
 return False
 
 
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v5 02/15] perf jevents metric: Add ability to rewrite metrics in terms of others

2023-01-26 Thread Ian Rogers
Add RewriteMetricsInTermsOfOthers that iterates over pairs of names
and expressions trying to replace an expression, within the current
expression, with its name.
---
 tools/perf/pmu-events/metric.py  | 73 +++-
 tools/perf/pmu-events/metric_test.py | 10 
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 2f2fd220e843..77ea6ff98538 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -4,7 +4,7 @@ import ast
 import decimal
 import json
 import re
-from typing import Dict, List, Optional, Set, Union
+from typing import Dict, List, Optional, Set, Tuple, Union
 
 
 class Expression:
@@ -26,6 +26,9 @@ class Expression:
 """Returns true when two expressions are the same."""
 raise NotImplementedError()
 
+  def Substitute(self, name: str, expression: 'Expression') -> 'Expression':
+raise NotImplementedError()
+
   def __str__(self) -> str:
 return self.ToPerfJson()
 
@@ -186,6 +189,15 @@ class Operator(Expression):
   other.lhs) and self.rhs.Equals(other.rhs)
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+lhs = self.lhs.Substitute(name, expression)
+rhs = None
+if self.rhs:
+  rhs = self.rhs.Substitute(name, expression)
+return Operator(self.operator, lhs, rhs)
+
 
 class Select(Expression):
   """Represents a select ternary in the parse tree."""
@@ -225,6 +237,14 @@ class Select(Expression):
   other.false_val) and self.true_val.Equals(other.true_val)
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+true_val = self.true_val.Substitute(name, expression)
+cond = self.cond.Substitute(name, expression)
+false_val = self.false_val.Substitute(name, expression)
+return Select(true_val, cond, false_val)
+
 
 class Function(Expression):
   """A function in an expression like min, max, d_ratio."""
@@ -267,6 +287,15 @@ class Function(Expression):
   return result
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+lhs = self.lhs.Substitute(name, expression)
+rhs = None
+if self.rhs:
+  rhs = self.rhs.Substitute(name, expression)
+return Function(self.fn, lhs, rhs)
+
 
 def _FixEscapes(s: str) -> str:
   s = re.sub(r'([^\\]),', r'\1\\,', s)
@@ -293,6 +322,9 @@ class Event(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Event) and self.name == other.name
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 class Constant(Expression):
   """A constant within the expression tree."""
@@ -317,6 +349,9 @@ class Constant(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Constant) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 class Literal(Expression):
   """A runtime literal within the expression tree."""
@@ -336,6 +371,9 @@ class Literal(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Literal) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 def min(lhs: Union[int, float, Expression], rhs: Union[int, float,
Expression]) -> 
Function:
@@ -461,6 +499,7 @@ class MetricGroup:
 
 
 class _RewriteIfExpToSelect(ast.NodeTransformer):
+  """Transformer to convert if-else nodes to Select expressions."""
 
   def visit_IfExp(self, node):
 # pylint: disable=invalid-name
@@ -498,7 +537,37 @@ def ParsePerfJson(orig: str) -> Expression:
   for kw in keywords:
 py = re.sub(rf'Event\(r"{kw}"\)', kw, py)
 
-  parsed = ast.parse(py, mode='eval')
+  try:
+parsed = ast.parse(py, mode='eval')
+  except SyntaxError as e:
+raise SyntaxError(f'Parsing expression:\n{orig}') from e
   _RewriteIfExpToSelect().visit(parsed)
   parsed = ast.fix_missing_locations(parsed)
   return _Constify(eval(compile(parsed, orig, 'eval')))
+
+
+def RewriteMetricsInTermsOfOthers(metrics: List[Tuple[str, Expression]]
+  )-> Dict[str, Expression]:
+  """Shorten metrics by rewriting in terms of others.
+
+  Args:
+metrics (list): pairs of metric names and their expressions.
+  Returns:
+Dict: mapping from a metric name to a shortened expression.
+  """
+  updates: Dict[str, Expression] = dict()
+  for outer_name, outer_expression in metrics:
+updated = outer_expression
+while True:
+  for inner_name, inner_expression in metrics:
+if inner_name.lower() == outer_name.lower():

[PATCH v5 03/15] perf jevents: Rewrite metrics in the same file with each other

2023-01-26 Thread Ian Rogers
Rewrite metrics within the same file in terms of each other. For
example, on Power8 other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / 
PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi 
- ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.

To avoid recomputation, decorate read_json_events with a cache.
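
A minimal sketch of the caching effect, using a hypothetical read_events
function and a calls list that exist only to make the behavior visible:

```python
from functools import lru_cache

calls = []


@lru_cache(maxsize=None)
def read_events(path: str, topic: str) -> tuple:
  """Hypothetical reader; records each real invocation in calls."""
  calls.append((path, topic))
  return (f'{topic}:{path}',)


first = read_events('cache.json', 'cache')
second = read_events('cache.json', 'cache')  # served from the cache
```

The second call returns the very same object without running the body
again, so callers should treat a cached result as shared rather than
mutating it per call.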
---
 tools/perf/pmu-events/jevents.py | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 0416b7442171..15a1671740cc 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -3,6 +3,7 @@
 """Convert directories of JSON events to C code."""
 import argparse
 import csv
+from functools import lru_cache
 import json
 import metric
 import os
@@ -337,18 +338,28 @@ class JsonEvent:
 s = self.build_c_string()
 return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
-
+@lru_cache(maxsize=None)
 def read_json_events(path: str, topic: str) -> Sequence[JsonEvent]:
   """Read json events from the specified file."""
-
   try:
-result = json.load(open(path), object_hook=JsonEvent)
+events = json.load(open(path), object_hook=JsonEvent)
   except BaseException as err:
 print(f"Exception processing {path}")
 raise
-  for event in result:
+  metrics: list[Tuple[str, metric.Expression]] = []
+  for event in events:
 event.topic = topic
-  return result
+if event.metric_name and '-' not in event.metric_name:
+  metrics.append((event.metric_name, event.metric_expr))
+  updates = metric.RewriteMetricsInTermsOfOthers(metrics)
+  if updates:
+for event in events:
+  if event.metric_name in updates:
+# print(f'Updated {event.metric_name} from\n"{event.metric_expr}"\n'
+#   f'to\n"{updates[event.metric_name]}"')
+event.metric_expr = updates[event.metric_name]
+
+  return events
 
 def preprocess_arch_std_files(archpath: str) -> None:
   """Read in all architecture standard events."""
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v5 04/15] perf pmu-events: Add separate metric from pmu_event

2023-01-26 Thread Ian Rogers
Create a new pmu_metric to hold the metric related variables from
pmu_event; initially it is just a clone of pmu_event. Add
iterators for pmu_metric and use them in places where metrics are desired
rather than events. Make the event iterator skip metric-only entries,
and the metric iterator skip event-only entries.
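
The skipping behavior of the two iterators can be sketched as follows. This
is a Python illustration with dictionaries standing in for the C structs;
each_event and each_metric are hypothetical names, and the callback and
early-return handling of the C iterators is left out:

```python
from typing import Dict, Iterator, List, Optional

Entry = Dict[str, Optional[str]]


def each_event(table: List[Entry]) -> Iterator[Entry]:
  """Yield entries that describe an event, skipping metric-only ones."""
  for entry in table:
    if entry.get('name'):
      yield entry


def each_metric(table: List[Entry]) -> Iterator[Entry]:
  """Yield entries that describe a metric, skipping event-only ones."""
  for entry in table:
    if entry.get('metric_expr'):
      yield entry
```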

Reviewed-by: John Garry 
---
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/pmu-events/empty-pmu-events.c |  49 ++-
 tools/perf/pmu-events/jevents.py |  62 -
 tools/perf/pmu-events/pmu-events.h   |  26 
 tools/perf/tests/pmu-events.c|  35 +++--
 tools/perf/util/metricgroup.c| 161 +++
 tools/perf/util/metricgroup.h|   2 +-
 7 files changed, 228 insertions(+), 111 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c 
b/tools/perf/arch/powerpc/util/header.c
index e8fe36b10d20..78eef77d8a8d 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -40,11 +40,11 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
return bufp;
 }
 
-int arch_get_runtimeparam(const struct pmu_event *pe)
+int arch_get_runtimeparam(const struct pmu_metric *pm)
 {
int count;
char path[PATH_MAX] = "/devices/hv_24x7/interface/";
 
-   atoi(pe->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, 
"coresperchip");
+   atoi(pm->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, 
"coresperchip");
return sysfs__read_int(path, &count) < 0 ? 1 : count;
 }
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 480e8f0d30c8..4e39d1a8d6d6 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -181,6 +181,11 @@ struct pmu_events_table {
const struct pmu_event *entries;
 };
 
+/* Struct used to make the PMU metric table implementation opaque to callers. 
*/
+struct pmu_metrics_table {
+   const struct pmu_metric *entries;
+};
+
 /*
  * Map a CPU to its table of PMU events. The CPU is identified by the
  * cpuid field, which is an arch-specific identifier for the CPU.
@@ -254,11 +259,29 @@ static const struct pmu_sys_events pmu_sys_event_tables[] 
= {
 int pmu_events_table_for_each_event(const struct pmu_events_table *table, 
pmu_event_iter_fn fn,
void *data)
 {
-   for (const struct pmu_event *pe = &table->entries[0];
-pe->name || pe->metric_group || pe->metric_name;
-pe++) {
-   int ret = fn(pe, table, data);
+   for (const struct pmu_event *pe = &table->entries[0]; pe->name || 
pe->metric_expr; pe++) {
+   int ret;
 
+   if (!pe->name)
+   continue;
+   ret = fn(pe, table, data);
+   if (ret)
+   return ret;
+   }
+   return 0;
+}
+
+int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, 
pmu_metric_iter_fn fn,
+void *data)
+{
+   struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
+
+   for (const struct pmu_metric *pm = &table->entries[0]; pm->name || 
pm->metric_expr; pm++) {
+   int ret;
+
+   if (!pm->metric_expr)
+   continue;
+   ret = fn(pm, etable, data);
if (ret)
return ret;
}
@@ -305,11 +328,22 @@ const struct pmu_events_table 
*find_core_events_table(const char *arch, const ch
 }
 
 int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data)
+{
+   for (const struct pmu_events_map *tables = &pmu_events_map[0]; 
tables->arch; tables++) {
+   int ret = pmu_events_table_for_each_event(&tables->table, fn, 
data);
+
+   if (ret)
+   return ret;
+   }
+   return 0;
+}
+
+int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data)
 {
for (const struct pmu_events_map *tables = &pmu_events_map[0];
 tables->arch;
 tables++) {
-   int ret = pmu_events_table_for_each_event(&tables->table, fn, 
data);
+   int ret = pmu_events_table_for_each_metric(&tables->table, fn, 
data);
 
if (ret)
return ret;
@@ -340,3 +374,8 @@ int pmu_for_each_sys_event(pmu_event_iter_fn fn, void *data)
}
return 0;
 }
+
+int pmu_for_each_sys_metric(pmu_metric_iter_fn fn __maybe_unused, void *data 
__maybe_unused)
+{
+   return 0;
+}
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 15a1671740cc..858787a12302 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -564,7 +564,19 @@ static const struct pmu_sys_events pmu_sys_event_tables[] 
= {
 \t},
 };
 
-static void decompress(int offset, struct pmu_event *pe)
+static void decompress_event(int off

[PATCH v5 05/15] perf pmu-events: Separate the metrics from events for no jevents

2023-01-26 Thread Ian Rogers
Separate the event and metric table when building without jevents. Add
find_core_metrics_table and perf_pmu__find_metrics_table while
renaming existing utilities to be event specific, so that users can
find the right table for their need.

Reviewed-by: John Garry 
---
 tools/perf/pmu-events/empty-pmu-events.c | 88 ++--
 tools/perf/pmu-events/jevents.py |  7 +-
 tools/perf/pmu-events/pmu-events.h   |  4 +-
 tools/perf/tests/expand-cgroup.c |  2 +-
 tools/perf/tests/parse-metric.c  |  2 +-
 tools/perf/util/pmu.c|  4 +-
 6 files changed, 79 insertions(+), 28 deletions(-)

diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 4e39d1a8d6d6..10bd4943ebf8 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-static const struct pmu_event pme_test_soc_cpu[] = {
+static const struct pmu_event pmu_events__test_soc_cpu[] = {
{
.name = "l3_cache_rd",
.event = "event=0x40",
@@ -105,6 +105,14 @@ static const struct pmu_event pme_test_soc_cpu[] = {
.desc = "L2 BTB Correction",
.topic = "branch",
},
+   {
+   .name = 0,
+   .event = 0,
+   .desc = 0,
+   },
+};
+
+static const struct pmu_metric pmu_metrics__test_soc_cpu[] = {
{
.metric_expr= "1 / IPC",
.metric_name= "CPI",
@@ -170,9 +178,8 @@ static const struct pmu_event pme_test_soc_cpu[] = {
.metric_name= "L1D_Cache_Fill_BW",
},
{
-   .name = 0,
-   .event = 0,
-   .desc = 0,
+   .metric_expr = 0,
+   .metric_name = 0,
},
 };
 
@@ -197,7 +204,8 @@ struct pmu_metrics_table {
 struct pmu_events_map {
const char *arch;
const char *cpuid;
-   const struct pmu_events_table table;
+   const struct pmu_events_table event_table;
+   const struct pmu_metrics_table metric_table;
 };
 
 /*
@@ -208,12 +216,14 @@ static const struct pmu_events_map pmu_events_map[] = {
{
.arch = "testarch",
.cpuid = "testcpu",
-   .table = { pme_test_soc_cpu },
+   .event_table = { pmu_events__test_soc_cpu },
+   .metric_table = { pmu_metrics__test_soc_cpu },
},
{
.arch = 0,
.cpuid = 0,
-   .table = { 0 },
+   .event_table = { 0 },
+   .metric_table = { 0 },
},
 };
 
@@ -259,12 +269,9 @@ static const struct pmu_sys_events pmu_sys_event_tables[] 
= {
 int pmu_events_table_for_each_event(const struct pmu_events_table *table, 
pmu_event_iter_fn fn,
void *data)
 {
-   for (const struct pmu_event *pe = &table->entries[0]; pe->name || 
pe->metric_expr; pe++) {
-   int ret;
+   for (const struct pmu_event *pe = &table->entries[0]; pe->name; pe++) {
+   int ret = fn(pe, table, data);
 
-   if (!pe->name)
-   continue;
-   ret = fn(pe, table, data);
if (ret)
return ret;
}
@@ -276,19 +283,44 @@ int pmu_events_table_for_each_metric(const struct 
pmu_events_table *etable, pmu_
 {
struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
 
-   for (const struct pmu_metric *pm = &table->entries[0]; pm->name || 
pm->metric_expr; pm++) {
-   int ret;
+   for (const struct pmu_metric *pm = &table->entries[0]; pm->metric_expr; 
pm++) {
+   int ret = fn(pm, etable, data);
 
-   if (!pm->metric_expr)
-   continue;
-   ret = fn(pm, etable, data);
if (ret)
return ret;
}
return 0;
 }
 
-const struct pmu_events_table *perf_pmu__find_table(struct perf_pmu *pmu)
+const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu 
*pmu)
+{
+   const struct pmu_events_table *table = NULL;
+   char *cpuid = perf_pmu__getcpuid(pmu);
+   int i;
+
+   /* on some platforms which uses cpus map, cpuid can be NULL for
+* PMUs other than CORE PMUs.
+*/
+   if (!cpuid)
+   return NULL;
+
+   i = 0;
+   for (;;) {
+   const struct pmu_events_map *map = &pmu_events_map[i++];
+
+   if (!map->cpuid)
+   break;
+
+   if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
+   table = &map->event_table;
+   break;
+   }
+   }
+   free(cpuid);
+   return table;
+}
+
+const struct pmu_events_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
 {
const struct pmu_events_table *table = NULL

[PATCH v5 06/15] perf pmu-events: Remove now unused event and metric variables

2023-01-26 Thread Ian Rogers
Previous changes separated the uses of pmu_event and pmu_metric;
however, both structures still contained all the variables of event and
metric. This change removes the event variables from metric and the
metric variables from event.

Note, this change removes the setting of evsel's metric_name/expr as
these fields are no longer part of struct pmu_event. The metric
remains but is no longer implicitly requested when the event is. This
impacts a few Intel uncore events; however, as the ScaleUnit is shared
by the event and the metric, the utility of this is questionable. Also, the
MetricNames look broken (contain spaces) in some cases, and when trying
to use the functionality with '-e' the metrics fail while regular
metrics with '-M' work. For example, on SkylakeX '-M' works:

```
$ perf stat -M LLC_MISSES.PCIE_WRITE -a sleep 1

 Performance counter stats for 'system wide':

 0  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 #  57896.0 
Bytes  LLC_MISSES.PCIE_WRITE  (49.84%)
 7,174  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 
   (49.85%)
 0  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 
   (50.16%)
63  UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 
   (50.15%)

   1.004576381 seconds time elapsed
```

whilst the event '-e' version is broken even with --group/-g (fwiw, we should 
also remove -g [1]):

```
$ perf stat -g -e LLC_MISSES.PCIE_WRITE -g -a sleep 1
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric 
expression for LLC_MISSES.PCIE_WRITE

 Performance counter stats for 'system wide':

27,316 Bytes LLC_MISSES.PCIE_WRITE

   1.004505469 seconds time elapsed
```

The code also carries warnings where the user is supposed to select
events for metrics [2] but, given the lack of use of such a feature,
let's clean up the code and just remove it.

[1] https://lore.kernel.org/lkml/20220707195610.303254-1-irog...@google.com/
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/stat-shadow.c?id=01b8957b738f42f96a130079bc951b3cc78c5b8a#n425

Reviewed-by: John Garry 
---
 tools/perf/builtin-list.c  | 20 ++---
 tools/perf/pmu-events/jevents.py   | 20 +
 tools/perf/pmu-events/pmu-events.h | 22 +--
 tools/perf/tests/pmu-event

[PATCH v5 07/15] perf stat: Remove evsel metric_name/expr

2023-01-26 Thread Ian Rogers
Metrics are their own unit. These variables previously held broken
metrics and now only ever hold NULL, so remove them and the code that
used them.

Reviewed-by: John Garry 
---
 tools/perf/builtin-stat.c |   1 -
 tools/perf/util/cgroup.c  |   1 -
 tools/perf/util/evsel.c   |   2 -
 tools/perf/util/evsel.h   |   2 -
 tools/perf/util/python.c  |   7 ---
 tools/perf/util/stat-shadow.c | 112 --
 tools/perf/util/stat.h|   1 -
 7 files changed, 126 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9f3e4b257516..5d18a5a6f662 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2524,7 +2524,6 @@ int cmd_stat(int argc, const char **argv)
&stat_config.metric_events);
zfree(&metrics);
}
-   perf_stat__collect_metric_expr(evsel_list);
perf_stat__init_shadow_stats();
 
if (add_default_attributes())
diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index cd978c240e0d..bfb13306d82c 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -481,7 +481,6 @@ int evlist__expand_cgroup(struct evlist *evlist, const char 
*str,
nr_cgroups++;
 
if (metric_events) {
-   perf_stat__collect_metric_expr(tmp_list);
if (metricgroup__copy_metric_events(tmp_list, cgrp,
metric_events,

&orig_metric_events) < 0)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8550638587e5..a90e998826e0 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -285,8 +285,6 @@ void evsel__init(struct evsel *evsel,
evsel->sample_size = __evsel__sample_size(attr->sample_type);
evsel__calc_id_pos(evsel);
evsel->cmdline_group_boundary = false;
-   evsel->metric_expr   = NULL;
-   evsel->metric_name   = NULL;
evsel->metric_events = NULL;
evsel->per_pkg_mask  = NULL;
evsel->collect_stat  = false;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d572be41b960..24cb807ef6ce 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -105,8 +105,6 @@ struct evsel {
 * metric fields are similar, but needs more care as they can have
 * references to other metric (evsel).
 */
-   const char *metric_expr;
-   const char *metric_name;
struct evsel**metric_events;
struct evsel*metric_leader;
 
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 9e5d881b0987..42e8b813d010 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -76,13 +76,6 @@ const char *perf_env__arch(struct perf_env *env 
__maybe_unused)
return NULL;
 }
 
-/*
- * Add this one here not to drag util/stat-shadow.c
- */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-}
-
 /*
  * These ones are needed not to drag the PMU bandwagon, jevents generated
  * pmu_sys_event_tables, etc and evsel__find_pmu() is used so far just for
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index cadb2df23c87..35ea4813f468 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -346,114 +346,6 @@ static const char *get_ratio_color(enum grc_type type, 
double ratio)
return color;
 }
 
-static struct evsel *perf_stat__find_event(struct evlist *evsel_list,
-   const char *name)
-{
-   struct evsel *c2;
-
-   evlist__for_each_entry (evsel_list, c2) {
-   if (!strcasecmp(c2->name, name) && !c2->collect_stat)
-   return c2;
-   }
-   return NULL;
-}
-
-/* Mark MetricExpr target events and link events using them to them. */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-   struct evsel *counter, *leader, **metric_events, *oc;
-   bool found;
-   struct expr_parse_ctx *ctx;
-   struct hashmap_entry *cur;
-   size_t bkt;
-   int i;
-
-   ctx = expr__ctx_new();
-   if (!ctx) {
-   pr_debug("expr__ctx_new failed");
-   return;
-   }
-   evlist__for_each_entry(evsel_list, counter) {
-   bool invalid = false;
-
-   leader = evsel__leader(counter);
-   if (!counter->metric_expr)
-   continue;
-
-   expr__ctx_clear(ctx);
-   metric_events = counter->metric_events;
-   if (!metric_events) {
-   if (expr__find_ids(counter->metric_expr,
-  counter->name,
-  ctx) < 0)
-   continue;
-
- 

[PATCH v5 08/15] perf jevents: Combine table prefix and suffix writing

2023-01-26 Thread Ian Rogers
Combine the table prefix and suffix writers into a single function.
This simplifies writing metrics separately in a later change.
---
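For readers skimming the diff, a minimal sketch of the buffering pattern
this patch moves to: remember the table name, collect entries, and emit
prefix, entries and suffix from one function. The helper names
start_table() and add_event() are hypothetical and the entries are plain
strings rather than JsonEvent objects; this is not the jevents.py code
itself.

```python
from typing import List, Optional, TextIO
import io

# Hypothetical stand-ins for the jevents.py module globals.
_pending_events: List[str] = []
_pending_events_tblname: Optional[str] = None


def start_table(tblname: str) -> None:
  """Record the table name; nothing is written yet."""
  global _pending_events_tblname
  _pending_events_tblname = tblname


def add_event(entry: str) -> None:
  """Buffer one already-formatted table entry."""
  _pending_events.append(entry)


def print_pending_events(out: TextIO) -> None:
  """Write prefix, sorted entries and suffix together, then reset."""
  global _pending_events
  if not _pending_events:
    return  # Nothing buffered, so no table is emitted.
  out.write(
      f'static const struct compact_pmu_event {_pending_events_tblname}[] = {{\n')
  for entry in sorted(_pending_events):
    out.write(f'{entry},\n')
  _pending_events = []
  out.write('};\n\n')


if __name__ == '__main__':
  buf = io.StringIO()
  start_table('pme_test')
  add_event('{ 2 }')
  add_event('{ 1 }')
  print_pending_events(buf)
  print(buf.getvalue(), end='')
```

With a single writer there is no "prefix without suffix" state to guard
against, which is why the _close_table checks can go away.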
 tools/perf/pmu-events/jevents.py | 36 +---
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 4cdbf34b7298..5f8d490c7269 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -19,10 +19,10 @@ _sys_event_tables = []
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
-# Track whether an events table is currently being defined and needs closing.
-_close_table = False
 # Events to write out when the table is closed
 _pending_events = []
+# Name of table to be written out
+_pending_events_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
@@ -378,24 +378,13 @@ def preprocess_arch_std_files(archpath: str) -> None:
   _arch_std_events[event.metric_name.lower()] = event
 
 
-def print_events_table_prefix(tblname: str) -> None:
-  """Called when a new events table is started."""
-  global _close_table
-  if _close_table:
-raise IOError('Printing table prefix but last table has no suffix')
-  _args.output_file.write(f'static const struct compact_pmu_event {tblname}[] 
= {{\n')
-  _close_table = True
-
-
 def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
   """Add contents of file to _pending_events table."""
-  if not _close_table:
-raise IOError('Table entries missing prefix')
   for e in read_json_events(item.path, topic):
 _pending_events.append(e)
 
 
-def print_events_table_suffix() -> None:
+def print_pending_events() -> None:
   """Optionally close events table."""
 
   def event_cmp_key(j: JsonEvent) -> Tuple[bool, str, str, str, str]:
@@ -407,17 +396,19 @@ def print_events_table_suffix() -> None:
 return (j.desc is not None, fix_none(j.topic), fix_none(j.name), 
fix_none(j.pmu),
 fix_none(j.metric_name))
 
-  global _close_table
-  if not _close_table:
+  global _pending_events
+  if not _pending_events:
 return
 
-  global _pending_events
+  global _pending_events_tblname
+  _args.output_file.write(
+  f'static const struct compact_pmu_event {_pending_events_tblname}[] = 
{{\n')
+
   for event in sorted(_pending_events, key=event_cmp_key):
 _args.output_file.write(event.to_c_string())
-_pending_events = []
+  _pending_events = []
 
   _args.output_file.write('};\n\n')
-  _close_table = False
 
 def get_topic(topic: str) -> str:
   if topic.endswith('metrics.json'):
@@ -455,12 +446,13 @@ def process_one_file(parents: Sequence[str], item: 
os.DirEntry) -> None:
 
   # model directory, reset topic
   if item.is_dir() and is_leaf_dir(item.path):
-print_events_table_suffix()
+print_pending_events()
 
 tblname = file_name_to_table_name(parents, item.name)
 if item.name == 'sys':
   _sys_event_tables.append(tblname)
-print_events_table_prefix(tblname)
+global _pending_events_tblname
+_pending_events_tblname = tblname
 return
 
   # base dir or too deep
@@ -809,7 +801,7 @@ struct compact_pmu_event {
   for arch in archs:
 arch_path = f'{_args.starting_dir}/{arch}'
 ftw(arch_path, [], process_one_file)
-print_events_table_suffix()
+print_pending_events()
 
   print_mapping_table(archs)
   print_system_mapping_table()
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v5 09/15] perf pmu-events: Introduce pmu_metrics_table

2023-01-26 Thread Ian Rogers
Add a metrics table that is just a cast from pmu_events_table. This
changes the APIs so that event and metric usage of the underlying
table is distinct. For the no jevents case the tables are already
separate; later changes will separate the tables for the jevents case.
---
 tools/perf/arch/arm64/util/pmu.c | 11 -
 tools/perf/pmu-events/empty-pmu-events.c | 21 -
 tools/perf/pmu-events/jevents.py | 21 ++---
 tools/perf/pmu-events/pmu-events.h   | 10 +++--
 tools/perf/tests/expand-cgroup.c |  2 +-
 tools/perf/tests/parse-metric.c  |  2 +-
 tools/perf/tests/pmu-events.c|  5 ++-
 tools/perf/util/metricgroup.c| 54 
 tools/perf/util/metricgroup.h|  2 +-
 tools/perf/util/pmu.c|  5 +++
 tools/perf/util/pmu.h|  1 +
 11 files changed, 78 insertions(+), 56 deletions(-)

diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
index 801bf52e2ea6..2779840d8896 100644
--- a/tools/perf/arch/arm64/util/pmu.c
+++ b/tools/perf/arch/arm64/util/pmu.c
@@ -22,7 +22,14 @@ static struct perf_pmu *pmu__find_core_pmu(void)
return NULL;
 
return pmu;
-   }
+}
+
+const struct pmu_metrics_table *pmu_metrics_table__find(void)
+{
+   struct perf_pmu *pmu = pmu__find_core_pmu();
+
+   if (pmu)
+   return perf_pmu__find_metrics_table(pmu);
 
return NULL;
 }
@@ -32,7 +39,7 @@ const struct pmu_events_table *pmu_events_table__find(void)
struct perf_pmu *pmu = pmu__find_core_pmu();
 
if (pmu)
-   return perf_pmu__find_table(pmu);
+   return perf_pmu__find_events_table(pmu);
 
return NULL;
 }
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 10bd4943ebf8..a938b74cf487 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -278,13 +278,11 @@ int pmu_events_table_for_each_event(const struct 
pmu_events_table *table, pmu_ev
return 0;
 }
 
-int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, 
pmu_metric_iter_fn fn,
-void *data)
+int pmu_metrics_table_for_each_metric(const struct pmu_metrics_table *table, 
pmu_metric_iter_fn fn,
+ void *data)
 {
-   struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
-
for (const struct pmu_metric *pm = &table->entries[0]; pm->metric_expr; 
pm++) {
-   int ret = fn(pm, etable, data);
+   int ret = fn(pm, table, data);
 
if (ret)
return ret;
@@ -320,9 +318,9 @@ const struct pmu_events_table 
*perf_pmu__find_events_table(struct perf_pmu *pmu)
return table;
 }
 
-const struct pmu_events_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
+const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
 {
-   const struct pmu_events_table *table = NULL;
+   const struct pmu_metrics_table *table = NULL;
char *cpuid = perf_pmu__getcpuid(pmu);
int i;
 
@@ -340,7 +338,7 @@ const struct pmu_events_table 
*perf_pmu__find_metrics_table(struct perf_pmu *pmu
break;
 
if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
-   table = (const struct pmu_events_table 
*)&map->metric_table;
+   table = &map->metric_table;
break;
}
}
@@ -359,13 +357,13 @@ const struct pmu_events_table 
*find_core_events_table(const char *arch, const ch
return NULL;
 }
 
-const struct pmu_events_table *find_core_metrics_table(const char *arch, const 
char *cpuid)
+const struct pmu_metrics_table *find_core_metrics_table(const char *arch, 
const char *cpuid)
 {
for (const struct pmu_events_map *tables = &pmu_events_map[0];
 tables->arch;
 tables++) {
if (!strcmp(tables->arch, arch) && 
!strcmp_cpuid_str(tables->cpuid, cpuid))
-   return (const struct pmu_events_table 
*)&tables->metric_table;
+   return &tables->metric_table;
}
return NULL;
 }
@@ -386,8 +384,7 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void 
*data)
for (const struct pmu_events_map *tables = &pmu_events_map[0];
 tables->arch;
 tables++) {
-   int ret = pmu_events_table_for_each_metric(
-   (const struct pmu_events_table *)&tables->metric_table, 
fn, data);
+   int ret = 
pmu_metrics_table_for_each_metric(&tables->metric_table, fn, data);
 
if (ret)
return ret;
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 5f8d490c7269..d83cc94af51f 100755
--- a/tools/pe

[PATCH v5 10/15] perf jevents: Generate metrics and events as separate tables

2023-01-26 Thread Ian Rogers
Turn a perf json event into an event, metric or both. This reduces the
number of entries that need to be scanned to find an event or metric.
As events no longer need the relatively seldom used metric fields,
4 bytes are saved per event. This reduces the big C string's size by
335kb (14.8%) on x86.

Note, for the test PMU architecture pme_test_soc_cpu is renamed
pmu_events__test_soc_cpu for consistency with the event vs metric
naming convention.
---
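A minimal sketch of the split described above, assuming entries are
plain dicts rather than JsonEvent objects. split_entries() and
table_name() are hypothetical names, and the 'pmu_events_' prefix is an
assumption chosen so the result matches the renamed test table in the
commit message.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Entry = Dict[str, Optional[str]]


def split_entries(entries: Sequence[Entry]) -> Tuple[List[Entry], List[Entry]]:
  """Route each entry to the events list, the metrics list, or both."""
  events: List[Entry] = []
  metrics: List[Entry] = []
  for e in entries:
    if e.get('name'):
      events.append(e)
    if e.get('metric_name'):
      metrics.append(e)
  return events, metrics


def table_name(prefix: str, parents: Sequence[str], dirname: str) -> str:
  """Build a C identifier from a prefix and the directory names."""
  name = prefix
  for p in parents:
    name += '_' + p
  name += '_' + dirname
  return name.replace('-', '_')


if __name__ == '__main__':
  evs, mts = split_entries([
      {'name': 'ev'},
      {'metric_name': 'm'},
      {'name': 'both', 'metric_name': 'mb'},
  ])
  print(len(evs), len(mts))
  print(table_name('pmu_events_', ['test', 'soc'], 'cpu'))
```

An entry with both a name and a metric name lands in both lists, so a
lookup of either kind only has to walk its own table.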
 tools/perf/pmu-events/jevents.py | 244 +++
 tools/perf/tests/pmu-events.c|   3 +-
 2 files changed, 189 insertions(+), 58 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index d83cc94af51f..627ee817f57f 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -13,28 +13,40 @@ import collections
 
 # Global command line arguments.
 _args = None
+# List of regular event tables.
+_event_tables = []
 # List of event tables generated from "/sys" directories.
 _sys_event_tables = []
+# List of regular metric tables.
+_metric_tables = []
+# List of metric tables generated from "/sys" directories.
+_sys_metric_tables = []
+# Mapping between sys event table names and sys metric table names.
+_sys_event_table_to_metric_table_mapping = {}
 # Map from an event name to an architecture standard
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
 # Events to write out when the table is closed
 _pending_events = []
-# Name of table to be written out
+# Name of events table to be written out
 _pending_events_tblname = None
+# Metrics to write out when the table is closed
+_pending_metrics = []
+# Name of metrics table to be written out
+_pending_metrics_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
 _json_event_attributes = [
 # cmp_sevent related attributes.
-'name', 'pmu', 'topic', 'desc', 'metric_name', 'metric_group',
+'name', 'pmu', 'topic', 'desc',
 # Seems useful, put it early.
 'event',
 # Short things in alphabetical order.
 'aggr_mode', 'compat', 'deprecated', 'perpkg', 'unit',
 # Longer things (the last won't be iterated over during decompress).
-'metric_constraint', 'metric_expr', 'long_desc'
+'long_desc'
 ]
 
 # Attributes that are in pmu_metric rather than pmu_event.
@@ -52,14 +64,16 @@ def removesuffix(s: str, suffix: str) -> str:
   return s[0:-len(suffix)] if s.endswith(suffix) else s
 
 
-def file_name_to_table_name(parents: Sequence[str], dirname: str) -> str:
+def file_name_to_table_name(prefix: str, parents: Sequence[str],
+dirname: str) -> str:
   """Generate a C table name from directory names."""
-  tblname = 'pme'
+  tblname = prefix
   for p in parents:
 tblname += '_' + p
   tblname += '_' + dirname
   return tblname.replace('-', '_')
 
+
 def c_len(s: str) -> int:
   """Return the length of s a C string
 
@@ -277,7 +291,7 @@ class JsonEvent:
 self.metric_constraint = jd.get('MetricConstraint')
 self.metric_expr = None
 if 'MetricExpr' in jd:
-   self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
+  self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
 
 arch_std = jd.get('ArchStdEvent')
 if precise and self.desc and '(Precise Event)' not in self.desc:
@@ -326,23 +340,24 @@ class JsonEvent:
 s += f'\t{attr} = {value},\n'
 return s + '}'
 
-  def build_c_string(self) -> str:
+  def build_c_string(self, metric: bool) -> str:
 s = ''
-for attr in _json_event_attributes:
+for attr in _json_metric_attributes if metric else _json_event_attributes:
   x = getattr(self, attr)
-  if x and attr == 'metric_expr':
+  if metric and x and attr == 'metric_expr':
 # Convert parsed metric expressions into a string. Slashes
 # must be doubled in the file.
 x = x.ToPerfJson().replace('\\', '')
   s += f'{x}\\000' if x else '\\000'
 return s
 
-  def to_c_string(self) -> str:
+  def to_c_string(self, metric: bool) -> str:
 """Representation of the event as a C struct initializer."""
 
-s = self.build_c_string()
+s = self.build_c_string(metric)
 return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
+
 @lru_cache(maxsize=None)
 def read_json_events(path: str, topic: str) -> Sequence[JsonEvent]:
   """Read json events from the specified file."""
@@ -381,7 +396,10 @@ def preprocess_arch_std_files(archpath: str) -> None:
 def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
   """Add contents of file to _pending_events table."""
   for e in read_json_events(item.path, topic):
-_pending_events.append(e)
+if e.name:
+  _pending_events.append(e)
+if e.metric_name:
+  _pending_metrics.append(e)
 
 
 def print_pending_events() -> None:
@@ -401,15 +419,54 @@

[PATCH v5 11/15] perf jevents: Add model list option

2023-01-26 Thread Ian Rogers
This allows the set of generated jevents events and metrics to be
limited to a subset of the model names. This is appropriate when trying
to minimize the binary size and only a known set of models is possible.
---
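The directory filter added to ftw() can be read as a pure predicate. A
minimal sketch follows, assuming the same semantics as the diff below;
keep_dir() is a hypothetical name, not a function in jevents.py.

```python
from typing import Sequence


def keep_dir(model: str, parents: Sequence[str], name: str) -> bool:
  """Return True if the directory should be visited for this model list."""
  if model == 'all':
    return True
  models = model.split(',')
  # Model directories live at the depth implied by the '/' count of the
  # first model, e.g. depth 1 for "arm/cortex-a34".
  if len(parents) != models[0].count('/'):
    return True
  item_path = '/'.join(list(parents) + [name])
  # Test directories are always kept so that perf test still works.
  return 'test' in item_path or item_path in models


if __name__ == '__main__':
  print(keep_dir('skylake', [], 'skylake'))
  print(keep_dir('skylake', [], 'haswell'))
  print(keep_dir('arm/cortex-a34', ['arm'], 'cortex-a34'))
```

Note the depth is taken from the first model only, so every entry in a
comma-separated list is expected to have the same number of '/'
components.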
 tools/perf/pmu-events/Build  |  3 ++-
 tools/perf/pmu-events/jevents.py | 14 ++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index 15b9e8fdbffa..a14de24ecb69 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -10,6 +10,7 @@ JEVENTS_PY=  pmu-events/jevents.py
 ifeq ($(JEVENTS_ARCH),)
 JEVENTS_ARCH=$(SRCARCH)
 endif
+JEVENTS_MODEL ?= all
 
 #
 # Locate/process JSON files in pmu-events/arch/
@@ -23,5 +24,5 @@ $(OUTPUT)pmu-events/pmu-events.c: 
pmu-events/empty-pmu-events.c
 else
 $(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JSON_TEST) $(JEVENTS_PY) 
pmu-events/metric.py
$(call rule_mkdir)
-   $(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) 
pmu-events/arch $@
+   $(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) 
$(JEVENTS_MODEL) pmu-events/arch $@
 endif
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 627ee817f57f..2bcd07ce609f 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -599,6 +599,8 @@ const struct pmu_events_map pmu_events_map[] = {
 else:
   metric_tblname = 'NULL'
   metric_size = '0'
+if event_size == '0' and metric_size == '0':
+  continue
 cpuid = row[0].replace('\\', '')
 _args.output_file.write(f"""{{
 \t.arch = "{arch}",
@@ -888,12 +890,24 @@ def main() -> None:
   action: Callable[[Sequence[str], os.DirEntry], None]) -> None:
 """Replicate the directory/file walking behavior of C's file tree walk."""
 for item in os.scandir(path):
+  if _args.model != 'all' and item.is_dir():
+# Check if the model matches one in _args.model.
+if len(parents) == _args.model.split(',')[0].count('/'):
+  # We're testing the correct directory.
+  item_path = '/'.join(parents) + ('/' if len(parents) > 0 else '') + 
item.name
+  if 'test' not in item_path and item_path not in 
_args.model.split(','):
+continue
   action(parents, item)
   if item.is_dir():
 ftw(item.path, parents + [item.name], action)
 
   ap = argparse.ArgumentParser()
   ap.add_argument('arch', help='Architecture name like x86')
+  ap.add_argument('model', help='''Select a model such as skylake to
+reduce the code size.  Normally set to "all". For architectures like
+ARM64 with an implementor/model, the model must include the implementor
+such as "arm/cortex-a34".''',
+  default='all')
   ap.add_argument(
   'starting_dir',
   type=dir_path,
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v5 12/15] perf pmu-events: Fix testing with JEVENTS_ARCH=all

2023-01-26 Thread Ian Rogers
The #slots literal will return NAN when not on ARM64, which causes a
perf test failure on other architectures for a JEVENTS_ARCH=all build:
..
 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
..
Add an is_test boolean so that the failure can be avoided when running
as a test.

Fixes: acef233b7ca7 ("perf pmu: Add #slots literal support for arm64")
---
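The behaviour change in literal() amounts to the following, shown as a
minimal Python sketch rather than the flex code. resolve_literal() is a
hypothetical name and raising ValueError stands in for returning
EXPR_ERROR.

```python
import math


def resolve_literal(value: float, is_test: bool) -> float:
  """NaN is a parse error in normal use, but becomes 1 when testing."""
  if math.isnan(value):
    if not is_test:
      raise ValueError('literal is not available on this architecture')
    return 1.0
  return value


if __name__ == '__main__':
  print(resolve_literal(float('nan'), is_test=True))
  print(resolve_literal(4.0, is_test=False))
```

The substituted value only needs to be a number so that the fake-PMU
metric parsing can proceed; it is not meant to be a realistic slot
count.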
 tools/perf/tests/pmu-events.c | 1 +
 tools/perf/util/expr.h| 1 +
 tools/perf/util/expr.l| 8 +---
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/tests/pmu-events.c b/tools/perf/tests/pmu-events.c
index 962c3c0d53ba..accf44b3d968 100644
--- a/tools/perf/tests/pmu-events.c
+++ b/tools/perf/tests/pmu-events.c
@@ -950,6 +950,7 @@ static int metric_parse_fake(const char *metric_name, const 
char *str)
pr_debug("expr__ctx_new failed");
return TEST_FAIL;
}
+   ctx->sctx.is_test = true;
if (expr__find_ids(str, NULL, ctx) < 0) {
pr_err("expr__find_ids failed\n");
return -1;
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index 029271540fb0..eaa44b24c555 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -9,6 +9,7 @@ struct expr_scanner_ctx {
char *user_requested_cpu_list;
int runtime;
bool system_wide;
+   bool is_test;
 };
 
 struct expr_parse_ctx {
diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
index 0168a9637330..72ff4f3d6d4b 100644
--- a/tools/perf/util/expr.l
+++ b/tools/perf/util/expr.l
@@ -84,9 +84,11 @@ static int literal(yyscan_t scanner, const struct 
expr_scanner_ctx *sctx)
YYSTYPE *yylval = expr_get_lval(scanner);
 
yylval->num = expr__get_literal(expr_get_text(scanner), sctx);
-   if (isnan(yylval->num))
-   return EXPR_ERROR;
-
+   if (isnan(yylval->num)) {
+   if (!sctx->is_test)
+   return EXPR_ERROR;
+   yylval->num = 1;
+   }
return LITERAL;
 }
 %}
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v5 13/15] perf jevents: Correct bad character encoding

2023-01-26 Thread Ian Rogers
A character encoding issue left a quoted-printable "=3D" sequence in
place of "=", which breaks the metrics test.

Fixes: 40769665b63d ("perf jevents: Parse metrics during conversion")
---
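For context, "=3D" is how quoted-printable encodes "=". A minimal sketch
with Python's standard quopri module shows the damaged line decoding to
the intended one; the byte string is copied from the removed line in the
diff below.

```python
import quopri

# The damaged assignment as it appeared in metric_test.py.
damaged = b"before =3D r'a if b else c if d else e'"

# Decoding the quoted-printable escape restores the plain '='.
repaired = quopri.decodestring(damaged)

if __name__ == '__main__':
  print(repaired.decode('ascii'))
```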
 tools/perf/pmu-events/metric_test.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric_test.py 
b/tools/perf/pmu-events/metric_test.py
index ced5998bd827..e4c792428277 100644
--- a/tools/perf/pmu-events/metric_test.py
+++ b/tools/perf/pmu-events/metric_test.py
@@ -89,8 +89,8 @@ class TestMetricExpressions(unittest.TestCase):
 after = r'min((a + b if c > 1 else c + d), e + f)'
 self.assertEqual(ParsePerfJson(before).ToPerfJson(), after)
 
-before =3D r'a if b else c if d else e'
-after =3D r'(a if b else (c if d else e))'
+before = r'a if b else c if d else e'
+after = r'(a if b else (c if d else e))'
 self.assertEqual(ParsePerfJson(before).ToPerfJson(), after)
 
   def test_ToPython(self):
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v5 14/15] tools build: Add test echo-cmd

2023-01-26 Thread Ian Rogers
Add quiet_cmd_test so that:
$(Q)$(call echo-cmd,test)

will print:
TEST   

This is useful for executing compile-time tests similar to what
happens for fortify tests in the kernel's lib directory.
---
 tools/build/Makefile.build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/build/Makefile.build b/tools/build/Makefile.build
index 715092fc6a23..89430338a3d9 100644
--- a/tools/build/Makefile.build
+++ b/tools/build/Makefile.build
@@ -53,6 +53,7 @@ build-file := $(dir)/Build
 
 quiet_cmd_flex  = FLEX$@
 quiet_cmd_bison = BISON   $@
+quiet_cmd_test  = TEST$@
 
 # Create directory unless it exists
 quiet_cmd_mkdir = MKDIR   $(dir $@)
-- 
2.39.1.456.gfc5497dd1b-goog



[PATCH v5 15/15] perf jevents: Run metric_test.py at compile-time

2023-01-26 Thread Ian Rogers
Add a target that generates a log file from running metric_test.py and
make it a dependency of generating pmu-events.c. If the test fails, the
log output is displayed, for example (the test was modified to make it
fail):

```
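  TEST    /tmp/perf/pmu-events/metric_test.log
F..
======================================================================
FAIL: test_Brackets (__main__.TestMetricExpressions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tools/perf/pmu-events/metric_test.py", line 33, in test_Brackets
    self.assertEqual((a * b + c).ToPerfJson(), 'a * b + d')
AssertionError: 'a * b + c' != 'a * b + d'
- a * b + c
?         ^
+ a * b + d
?         ^

----------------------------------------------------------------------
Ran 7 tests in 0.004s

FAILED (failures=1)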
make[3]: *** [pmu-events/Build:32: /tmp/perf/pmu-events/metric_test.log] Error 1
```

However, normal execution will just show the TEST line.

This is roughly modeled on fortify testing in the kernel lib directory.

Modify metric_test.py so that it is executable. This is necessary when
PYTHON isn't specified in the build, the normal case.

Use variables to make the paths to files clearer and more consistent.
---
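The recipe "$(PYTHON) $< 2> $@ || (cat $@ && false)" captures stderr in
the log and only shows it on failure. A minimal Python sketch of the
same control flow follows; run_logged() is a hypothetical helper for
illustration and is not part of the build.

```python
import subprocess
import sys
from pathlib import Path
from typing import Sequence


def run_logged(cmd: Sequence[str], log_path: Path) -> bool:
  """Run cmd with stderr redirected to log_path; dump the log on failure."""
  with open(log_path, 'w') as log:
    result = subprocess.run(list(cmd), stderr=log)
  if result.returncode != 0:
    # Equivalent of "cat $@ && false": show the log, then report failure.
    sys.stdout.write(log_path.read_text())
    return False
  return True


if __name__ == '__main__':
  ok = run_logged([sys.executable, '-c', 'import sys; sys.exit(0)'],
                  Path('sketch_metric_test.log'))
  print(ok)
```

unittest writes its report to stderr, which is why redirecting only
stderr is enough to keep a passing run quiet.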
 tools/perf/pmu-events/Build  | 13 +++--
 tools/perf/pmu-events/metric_test.py |  1 +
 2 files changed, 12 insertions(+), 2 deletions(-)
 mode change 100644 => 100755 tools/perf/pmu-events/metric_test.py

diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index a14de24ecb69..150765f2baee 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -6,6 +6,11 @@ JDIR_TEST  =  pmu-events/arch/test
 JSON_TEST  =  $(shell [ -d $(JDIR_TEST) ] &&   \
find $(JDIR_TEST) -name '*.json')
 JEVENTS_PY =  pmu-events/jevents.py
+METRIC_PY  =  pmu-events/metric.py
+METRIC_TEST_PY =  pmu-events/metric_test.py
+EMPTY_PMU_EVENTS_C = pmu-events/empty-pmu-events.c
+PMU_EVENTS_C   =  $(OUTPUT)pmu-events/pmu-events.c
+METRIC_TEST_LOG=  $(OUTPUT)pmu-events/metric_test.log
 
 ifeq ($(JEVENTS_ARCH),)
 JEVENTS_ARCH=$(SRCARCH)
@@ -18,11 +23,15 @@ JEVENTS_MODEL ?= all
 #
 
 ifeq ($(NO_JEVENTS),1)
-$(OUTPUT)pmu-events/pmu-events.c: pmu-events/empty-pmu-events.c
+$(PMU_EVENTS_C): $(EMPTY_PMU_EVENTS_C)
$(call rule_mkdir)
$(Q)$(call echo-cmd,gen)cp $< $@
 else
-$(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JSON_TEST) $(JEVENTS_PY) 
pmu-events/metric.py
+$(METRIC_TEST_LOG): $(METRIC_TEST_PY) $(METRIC_PY)
+   $(call rule_mkdir)
+   $(Q)$(call echo-cmd,test)$(PYTHON) $< 2> $@ || (cat $@ && false)
+
+$(PMU_EVENTS_C): $(JSON) $(JSON_TEST) $(JEVENTS_PY) $(METRIC_PY) 
$(METRIC_TEST_LOG)
$(call rule_mkdir)
$(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) 
$(JEVENTS_MODEL) pmu-events/arch $@
 endif
diff --git a/tools/perf/pmu-events/metric_test.py 
b/tools/perf/pmu-events/metric_test.py
old mode 100644
new mode 100755
index e4c792428277..40a3c7d8b2bc
--- a/tools/perf/pmu-events/metric_test.py
+++ b/tools/perf/pmu-events/metric_test.py
@@ -1,3 +1,4 @@
+#!/usr/bin/env python3
 # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
 import unittest
 from metric import Constant
-- 
2.39.1.456.gfc5497dd1b-goog



Re: [PATCH v5 00/15] jevents/pmu-events improvements

2023-01-27 Thread Ian Rogers
On Fri, Jan 27, 2023, 5:20 AM John Garry  wrote:

> On 26/01/2023 23:36, Ian Rogers wrote:
>
> Hi Ian,
>
> At a glance, none of this series has your Signed-off-by tag..
>
> Thanks,
> John
>


Thanks John, will fix. Is there anything else?

Ian

> > Add an optimization to jevents using the metric code, rewrite metrics
> > in terms of each other in order to minimize size and improve
> > readability. For example, on Power8
> > other_stall_cpi is rewritten from:
> > "PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU /
> > PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL -
> > PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU /
> > PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL -
> > PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
> > to:
> > "stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi -
> > lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
> > Which more closely matches the definition on Power9.
> >
> > A limitation of the substitutions are that they depend on strict
> > equality and the shape of the tree. This means that for "a + b + c"
> > then a substitution of "a + b" will succeed while "b + c" will fail
> > (the LHS for "+ c" is "a + b" not just "b").
> >
> > Separate out the events and metrics in the pmu-events tables saving
> > 14.8% in the table size while making it that metrics no longer need to
> > iterate over all events and vice versa. These changes remove evsel's
> > direct metric support as the pmu_event no longer has a metric to
> > populate it. This is a minor issue as the code wasn't working
> > properly, metrics for this are rare and can still be properly ran
> > using '-M'.
> >
> > Add an ability to just build certain models into the jevents generated
> > pmu-metrics.c code. This functionality is appropriate for operating
> > systems like ChromeOS, that aim to minimize binary size and know all
> > the target CPU models.
>
>
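
The tree-shape limitation described in the cover letter can be sketched as follows. This is only an illustration: the Event and Add classes and the helper names are hypothetical and are not the API of pmu-events/metric.py. It assumes '+' parses left-associatively, as the cover letter's "(the LHS for "+ c" is "a + b" not just "b")" implies.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Event:
    """Leaf node: a named event or metric reference (hypothetical class)."""
    name: str


@dataclass(frozen=True)
class Add:
    """Binary '+' node (hypothetical class)."""
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Event, Add]


def parse_sum(text: str) -> Expr:
    """Parse 'a + b + c' left-associatively into ((a + b) + c)."""
    names = [part.strip() for part in text.split("+")]
    tree: Expr = Event(names[0])
    for name in names[1:]:
        tree = Add(tree, Event(name))
    return tree


def substitute(tree: Expr, pattern: Expr, replacement: Expr) -> Expr:
    """Replace every subtree that is strictly equal to `pattern`."""
    if tree == pattern:
        return replacement
    if isinstance(tree, Add):
        return Add(substitute(tree.lhs, pattern, replacement),
                   substitute(tree.rhs, pattern, replacement))
    return tree


def to_text(tree: Expr) -> str:
    """Render a left-associative sum back to text."""
    if isinstance(tree, Add):
        return f"{to_text(tree.lhs)} + {to_text(tree.rhs)}"
    return tree.name


expr = parse_sum("a + b + c")
ok = substitute(expr, parse_sum("a + b"), Event("m"))
miss = substitute(expr, parse_sum("b + c"), Event("m"))
print(to_text(ok))    # m + c
print(to_text(miss))  # a + b + c
```

"a + b" is a real subtree of ((a + b) + c), so it is replaced; "b + c" never appears as a node in that tree, so the expression comes back unchanged.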


Re: [PATCH v5 10/15] perf jevents: Generate metrics and events as separate tables

2023-01-30 Thread Ian Rogers
On Mon, Jan 30, 2023 at 8:07 AM John Garry  wrote:
>
> On 26/01/2023 23:36, Ian Rogers wrote:
> > @@ -660,7 +763,29 @@ const struct pmu_events_table 
> > *perf_pmu__find_events_table(struct perf_pmu *pmu)
> >
> >   const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct 
> > perf_pmu *pmu)
> >   {
> > -return (struct pmu_metrics_table 
> > *)perf_pmu__find_events_table(pmu);
> > +const struct pmu_metrics_table *table = NULL;
> > +char *cpuid = perf_pmu__getcpuid(pmu);
> > +int i;
> > +
> > +/* on some platforms which uses cpus map, cpuid can be NULL for
> > + * PMUs other than CORE PMUs.
> > + */
> > +if (!cpuid)
> > +return NULL;
> > +
> > +i = 0;
> > +for (;;) {
> > +const struct pmu_events_map *map = &pmu_events_map[i++];
> > +if (!map->arch)
> > +break;
> > +
> > +if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
> > +table = &map->metric_table;
> > +break;
> > +}
> > +}
> > +free(cpuid);
> > +return table;
> >   }
>
> This is almost identical to generated perf_pmu__find_events_table(),
> except we return a pmu_metrics_table * (instead of a pmu_events_table *)
> and also return the metric table member (instead of event table). But
> the definitions are:
>
> /* Struct used to make the PMU event table implementation opaque to
> callers. */
> struct pmu_events_table {
>  const struct compact_pmu_event *entries;
>  size_t length;
> };
>
> /* Struct used to make the PMU metric table implementation opaque to
> callers. */
> struct pmu_metrics_table {
>  const struct compact_pmu_event *entries;
>  size_t length;
> };
>
> Those structs are defined to be the same thing, so I am failing to see
> the point in a) separate structure types b) why so much duplication
>
> As for b), I know that they are generated and the python code may be
> simpler this way (is it?), but still...

Agreed. The point is to separate the two tables for typing at the
API layer; internally the representation is the same. When we decode
one we get a pmu_event and from the other a pmu_metric, so we don't
want to allow the tables to be switched - hence two types.

Thanks,
Ian

> Thanks,
> John


Re: [PATCH v5 00/15] jevents/pmu-events improvements

2023-01-30 Thread Ian Rogers
On Mon, Jan 30, 2023 at 7:22 AM John Garry  wrote:
>
> On 27/01/2023 13:48, Ian Rogers wrote:
> > On Fri, Jan 27, 2023, 5:20 AM John Garry wrote:
> >
> > On 26/01/2023 23:36, Ian Rogers wrote:
> >
> > Hi Ian,
> >
> > At a glance, none of this series has your Signed-off-by tag..
> >
> > Thanks,
> > John
> >
> >
> >
> > Thanks John, will fix. Is there anything else?
>
> Do you think that pmu-events/__pycache__/metric.cpython-36.pyc should be
> deleted with a make clean? I would expect stuff like this to be deleted
> (with a clean), but I am not sure if we have a policy on this (pyc files)

Should they be covered by the existing clean target?
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/Makefile.perf?h=perf/core#n1102

Thanks,
Ian

> Thanks,
> John


[PATCH v1] perf pmu: Fix aarch64 build

2023-02-02 Thread Ian Rogers
ARM64 overrides a weak function but a previous change had broken the
build.

Fixes: 8cefeb8bd336 ("perf pmu-events: Introduce pmu_metrics_table")
Signed-off-by: Ian Rogers 
---
 tools/perf/arch/arm64/util/pmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
index 2779840d8896..fa143acb4c8d 100644
--- a/tools/perf/arch/arm64/util/pmu.c
+++ b/tools/perf/arch/arm64/util/pmu.c
@@ -22,6 +22,8 @@ static struct perf_pmu *pmu__find_core_pmu(void)
return NULL;
 
return pmu;
+   }
+   return NULL;
 }
 
 const struct pmu_metrics_table *pmu_metrics_table__find(void)
-- 
2.39.1.519.gcb327c4b5f-goog



Re: [PATCH v1] perf pmu: Fix aarch64 build

2023-02-02 Thread Ian Rogers
On Thu, Feb 2, 2023 at 5:40 PM Ian Rogers  wrote:
>
> ARM64 overrides a weak function but a previous change had broken the
> build.
>
> Fixes: 8cefeb8bd336 ("perf pmu-events: Introduce pmu_metrics_table")

As 8cefeb8bd336 ("perf pmu-events: Introduce pmu_metrics_table") is
only on tmp.perf/core then it may be best to just squash this fix into
that.

Thanks,
Ian

> Signed-off-by: Ian Rogers 
> ---
>  tools/perf/arch/arm64/util/pmu.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/tools/perf/arch/arm64/util/pmu.c 
> b/tools/perf/arch/arm64/util/pmu.c
> index 2779840d8896..fa143acb4c8d 100644
> --- a/tools/perf/arch/arm64/util/pmu.c
> +++ b/tools/perf/arch/arm64/util/pmu.c
> @@ -22,6 +22,8 @@ static struct perf_pmu *pmu__find_core_pmu(void)
> return NULL;
>
> return pmu;
> +   }
> +   return NULL;
>  }
>
>  const struct pmu_metrics_table *pmu_metrics_table__find(void)
> --
> 2.39.1.519.gcb327c4b5f-goog
>


Re: [PATCH v1] perf pmu: Fix aarch64 build

2023-02-04 Thread Ian Rogers
On Fri, Feb 3, 2023 at 8:56 AM Arnaldo Carvalho de Melo  wrote:
>
> Em Fri, Feb 03, 2023 at 01:02:02PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Fri, Feb 03, 2023 at 12:43:48PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > I tried bisecting, but at this cset:
> > >
> > > acme@roc-rk3399-pc:~/git/perf$ git log --oneline -1
> > > d22e569cd33d (HEAD) perf pmu-events: Separate the metrics from events for 
> > > no jevents
> > > acme@roc-rk3399-pc:~/git/perf$
> > >
> > > I'm getting this:
> > >
> > >   CC  /tmp/build/perf/pmu-events/pmu-events.o
> > > pmu-events/pmu-events.c:3637:32: error: no previous prototype for 
> > > ‘perf_pmu__find_table’ [-Werror=missing-prototypes]
> > >  3637 | const struct pmu_events_table *perf_pmu__find_table(struct 
> > > perf_pmu *pmu)
> > >   |^~~~
> > >   CC  /tmp/build/perf/builtin-ftrace.o
> > >   CC  /tmp/build/perf/builtin-help.o
> > >   CC  /tmp/build/perf/builtin-buildid-list.o
> > > cc1: all warnings being treated as errors
> > > make[3]: *** [/home/acme/git/perf/tools/build/Makefile.build:97: 
> > > /tmp/build/perf/pmu-events/pmu-events.o] Error 1
> > > make[2]: *** [Makefile.perf:676: 
> > > /tmp/build/perf/pmu-events/pmu-events-in.o] Error 2
> > > make[2]: *** Waiting for unfinished jobs
> > >   CC  /tmp/build/perf/builtin-buildid-cache.o
> > >
> > > 
> > >
> > >   CC  /tmp/build/perf/tests/attr.o
> > > arch/arm64/util/pmu.c: In function ‘pmu_events_table__find’:
> > > arch/arm64/util/pmu.c:35:24: error: implicit declaration of function 
> > > ‘perf_pmu__find_table’; did you mean ‘perf_pmu__find_by_type’? 
> > > [-Werror=implicit-function-declaration]
> > >35 | return perf_pmu__find_table(pmu);
> > >   |^~~~
> > >   |perf_pmu__find_by_type
> > > arch/arm64/util/pmu.c:35:24: error: returning ‘int’ from a function with 
> > > return type ‘const struct pmu_events_table *’ makes pointer from integer 
> > > without a cast [-Werror=int-conversion]
> > >35 | return perf_pmu__find_table(pmu);
> > >   |^
> > > cc1: all warnings being treated as errors
> > > make[6]: *** [/home/acme/git/perf/tools/build/Makefile.build:97: 
> > > /tmp/build/perf/arch/arm64/util/pmu.o] Error 1
> > > make[5]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: util] 
> > > Error 2
> > > make[4]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: arm64] 
> > > Error 2
> > > make[3]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: arch] 
> > > Error 2
> > > make[3]: *** Waiting for unfinished jobs
> > >   CC  /tmp/build/perf/tests/vmlinux-kallsyms.o
> > >
> > > -
> > >
> > > I'm building with:
> >
> > So:
> >
> > acme@roc-rk3399-pc:~/git/perf$ find tools/perf/ -name "*.[ch]" | xargs grep 
> > -w perf_pmu__find_table
> > tools/perf/arch/arm64/util/pmu.c: return 
> > perf_pmu__find_table(pmu);
> > tools/perf/pmu-events/pmu-events.c:const struct pmu_events_table 
> > *perf_pmu__find_table(struct perf_pmu *pmu)
> > acme@roc-rk3399-pc:~/git/perf$
> > acme@roc-rk3399-pc:~/git/perf$ git log --oneline -1
> > d22e569cd33d (HEAD) perf pmu-events: Separate the metrics from events for 
> > no jevents
> > acme@roc-rk3399-pc:~/git/perf$
> >
> > Trying to fix...
>
> tools/perf/pmu-events/pmu-events.c was a leftover from a previous build,
> strange as I build using O=, not to clutter the source dir, so perhaps
> handling that is missing, I'll check.
>
> Fixed aarch64 specific one with:
>
> diff --git a/tools/perf/arch/arm64/util/pmu.c 
> b/tools/perf/arch/arm64/util/pmu.c
> index 801bf52e2ea6..b4eaf00ec5a8 100644
> --- a/tools/perf/arch/arm64/util/pmu.c
> +++ b/tools/perf/arch/arm64/util/pmu.c
> @@ -32,7 +32,7 @@ const struct pmu_events_table *pmu_events_table__find(void)
> struct perf_pmu *pmu = pmu__find_core_pmu();
>
> if (pmu)
> -   return perf_pmu__find_table(pmu);
> +   return perf_pmu__find_events_table(pmu);
>
> return NULL;
>  }
>
>
> ---
>
> Continuing...

Thanks! Sorry for missing this one. Ideally we'd have less code under
arch/ . The previous error messages made me think you may need to
build clean.

Ian


Re: [PATCH v5 15/15] perf jevents: Run metric_test.py at compile-time

2023-02-04 Thread Ian Rogers
On Fri, Feb 3, 2023 at 12:15 PM Arnaldo Carvalho de Melo
 wrote:
>
> Em Thu, Jan 26, 2023 at 03:36:45PM -0800, Ian Rogers escreveu:
> > Add a target that generates a log file for running metric_test.py and
> > make this a dependency on generating pmu-events.c. The log output is
> > displayed if the test fails like (the test was modified to make it
> > fail):
> >
> > ```
> >   TEST/tmp/perf/pmu-events/metric_test.log
> > F..
> > ==
> > FAIL: test_Brackets (__main__.TestMetricExpressions)
> > --
> > Traceback (most recent call last):
> >   File "tools/perf/pmu-events/metric_test.py", line 33, in test_Brackets
> > self.assertEqual((a * b + c).ToPerfJson(), 'a * b + d')
> > AssertionError: 'a * b + c' != 'a * b + d'
> > - a * b + c
> > ? ^
> > + a * b + d
>
> Added this:
>
> diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
> index 05806ecfc33c12a1..f533e76fb48002b7 100644
> --- a/tools/perf/.gitignore
> +++ b/tools/perf/.gitignore
> @@ -38,6 +38,7 @@ arch/*/include/generated/
>  trace/beauty/generated/
>  pmu-events/pmu-events.c
>  pmu-events/jevents
> +pmu-events/metric_test.log
>  feature/
>  libapi/
>  libbpf/
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index b7d9c42062300d04..bac9272682b759e9 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -1103,6 +1103,7 @@ clean:: $(LIBAPI)-clean $(LIBBPF)-clean 
> $(LIBSUBCMD)-clean $(LIBSYMBOL)-clean $(
> $(OUTPUT)util/intel-pt-decoder/inat-tables.c \
> $(OUTPUT)tests/llvm-src-{base,kbuild,prologue,relocation}.c \
> $(OUTPUT)pmu-events/pmu-events.c \
> +   $(OUTPUT)pmu-events/metric_test.log \
> $(OUTPUT)$(fadvise_advice_array) \
> $(OUTPUT)$(fsconfig_arrays) \
> $(OUTPUT)$(fsmount_arrays) \

Acked, thanks!

Ian


Re: [PATCH] tools/perf/tests: Add system wide check for perf bench workload in all metric test

2023-02-14 Thread Ian Rogers
On Tue, Feb 7, 2023 at 7:45 PM kajoljain  wrote:
>
>
>
> On 2/6/23 10:10, Athira Rajeev wrote:
> >
> >
> >> On 02-Feb-2023, at 10:14 PM, Kajol Jain  wrote:
> >>
> >> Testcase stat_all_metrics.sh fails in powerpc:
> >>
> >> 92: perf all metrics test : FAILED!
> >>
> >> Logs with verbose:
> >>
> >> [command]# ./perf test 92 -vv
> >> 92: perf all metrics test   :
> >> --- start ---
> >> test child forked, pid 13262
> >> Testing BRU_STALL_CPI
> >> Testing COMPLETION_STALL_CPI
> >> 
> >> Testing TOTAL_LOCAL_NODE_PUMPS_P23
> >> Metric 'TOTAL_LOCAL_NODE_PUMPS_P23' not printed in:
> >> Error:
> >> Invalid event (hv_24x7/PM_PB_LNS_PUMP23,chip=3/) in per-thread mode, 
> >> enable system wide with '-a'.
> >> Testing TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01
> >> Metric 'TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01' not printed in:
> >> Error:
> >> Invalid event (hv_24x7/PM_PB_RTY_LNS_PUMP01,chip=3/) in per-thread mode, 
> >> enable system wide with '-a'.
> >> 
> >>
> >> Based on above logs, we could see some of the hv-24x7 metric events fails,
> >> and logs suggest to run the metric event with -a option.
> >> This change happened after the commit a4b8cfcabb1d ("perf stat: Delay 
> >> metric
> >> parsing"), which delayed the metric parsing phase and now before metric 
> >> parsing
> >> phase perf tool identifies, whether target is system-wide or not. With this
> >> change, perf_event_open will fails with workload monitoring for uncore 
> >> events
> >> as expected.
> >>
> >> The perf all metric test case fails as some of the hv-24x7 metric events
> >> may need bigger workload to get the data. And the added perf bench
> >> workload in 'perf all metric test case' will not run for hv-24x7 without
> >> -a option.
> >>
> >> Fix this issue by adding system wide check for perf bench workload.
> >>
> >> Result with the patch changes in powerpc:
> >>
> >> 92: perf all metrics test : Ok
> >>
> >> Signed-off-by: Kajol Jain 
> >
> > Looks good to me
> >
> > Reviewed-by: Athira Rajeev 
>
> Hi Arnaldo,
>Let me know if patch looks fine to you.
>
> Thanks,
> Kajol Jain

I ran into a similar issue but worked around it with:

```
--- a/tools/perf/tests/shell/stat_all_metrics.sh
+++ b/tools/perf/tests/shell/stat_all_metrics.sh
@@ -11,7 +11,7 @@ for m in $(perf list --raw-dump metrics); do
continue
  fi
  # Failed so try system wide.
-  result=$(perf stat -M "$m" -a true 2>&1)
+  result=$(perf stat -M "$m" -a sleep 0.01 2>&1)
  if [[ "$result" =~ "${m:0:50}" ]]
  then
continue
```

Running the synthesize benchmark is potentially slow, wdyt of the change above?

Thanks,
Ian
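
For anyone reading along, the retry logic in the workaround can be sketched in Python as below. It is a rough model of the shell test, not a replacement for it. The helper names are made up for illustration, and the first per-thread attempt with `true` is an assumption, since that part of stat_all_metrics.sh is not visible in the hunk above.

```python
import subprocess


def run_perf_stat(metric, workload, system_wide=False):
    """Run `perf stat -M <metric> [-a] <workload>` and return stdout+stderr."""
    cmd = ["perf", "stat", "-M", metric]
    if system_wide:
        cmd.append("-a")
    cmd += workload
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    return proc.stdout


def metric_printed(metric, run=run_perf_stat):
    """Return True if any attempt prints the (truncated) metric name."""
    attempts = [
        (["true"], False),          # assumed first attempt: per-thread
        (["sleep", "0.01"], True),  # system wide, with the workload from the diff
    ]
    for workload, system_wide in attempts:
        # Same check as [[ "$result" =~ "${m:0:50}" ]] in the shell test.
        if metric[:50] in run(metric, workload, system_wide):
            return True
    return False
```

The `run` parameter only exists so the logic can be exercised without perf installed.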


> >
> >> ---
> >> tools/perf/tests/shell/stat_all_metrics.sh | 7 +++
> >> 1 file changed, 7 insertions(+)
> >>
> >> diff --git a/tools/perf/tests/shell/stat_all_metrics.sh 
> >> b/tools/perf/tests/shell/stat_all_metrics.sh
> >> index 6e79349e42be..d49832a316d9 100755
> >> --- a/tools/perf/tests/shell/stat_all_metrics.sh
> >> +++ b/tools/perf/tests/shell/stat_all_metrics.sh
> >> @@ -23,6 +23,13 @@ for m in $(perf list --raw-dump metrics); do
> >>   then
> >> continue
> >>   fi
> >> +  # Failed again, possibly the event is uncore pmu event which will need
> >> +  # system wide monitoring with workload, so retry with -a option
> >> +  result=$(perf stat -M "$m" -a perf bench internals synthesize 2>&1)
> >> +  if [[ "$result" =~ "${m:0:50}" ]]
> >> +  then
> >> +continue
> >> +  fi
> >>   echo "Metric '$m' not printed in:"
> >>   echo "$result"
> >>   if [[ "$err" != "1" ]]
> >> --
> >> 2.39.0
> >>
> >


Re: [PATCH v2] tools/perf/tests: Change true workload to sleep workload in all metric test for system wide check

2023-02-15 Thread Ian Rogers
On Wed, Feb 15, 2023 at 1:38 AM Kajol Jain  wrote:
>
> Testcase stat_all_metrics.sh fails in powerpc:
>
> 98: perf all metrics test : FAILED!
>
> Logs with verbose:
>
> [command]# ./perf test 98 -vv
>  98: perf all metrics test   :
>  --- start ---
> test child forked, pid 13262
> Testing BRU_STALL_CPI
> Testing COMPLETION_STALL_CPI
>  
> Testing TOTAL_LOCAL_NODE_PUMPS_P23
> Metric 'TOTAL_LOCAL_NODE_PUMPS_P23' not printed in:
> Error:
> Invalid event (hv_24x7/PM_PB_LNS_PUMP23,chip=3/) in per-thread mode, enable 
> system wide with '-a'.
> Testing TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01
> Metric 'TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01' not printed in:
> Error:
> Invalid event (hv_24x7/PM_PB_RTY_LNS_PUMP01,chip=3/) in per-thread mode, 
> enable system wide with '-a'.
>  
>
> Based on above logs, we could see some of the hv-24x7 metric events fails,
> and logs suggest to run the metric event with -a option.
> This change happened after the commit a4b8cfcabb1d ("perf stat: Delay metric
> parsing"), which delayed the metric parsing phase and now before metric 
> parsing
> phase perf tool identifies, whether target is system-wide or not. With this
> change, perf_event_open will fails with workload monitoring for uncore events
> as expected.
>
> The perf all metric test case fails as some of the hv-24x7 metric events
> may need bigger workload with system wide monitoring to get the data.
> Fix this issue by changing current system wide check from true workload to
> sleep 0.01 workload.
>
> Result with the patch changes in powerpc:
>
> 98: perf all metrics test : Ok
>
> Reviewed-by: Athira Rajeev 
> Tested-by: Disha Goel 
> Suggested-by: Ian Rogers 
> Signed-off-by: Kajol Jain 

Tested-by: Ian Rogers 

The mention of a4b8cfcabb1d can be moved to a Fixes tag so that this
is backported.

Thanks,
Ian

> ---
> Changelog:
>
> v1->v2:
> - Addressed review comments from Ian, by changing true workload
>   to sleep workload in "perf all metric test", rather than adding
>   a new system wide check with perf bench workload.
> - Added Reviewed-by, Tested-by and Suggested-by tags.
>
>  tools/perf/tests/shell/stat_all_metrics.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/tests/shell/stat_all_metrics.sh 
> b/tools/perf/tests/shell/stat_all_metrics.sh
> index 6e79349e42be..22e9cb294b40 100755
> --- a/tools/perf/tests/shell/stat_all_metrics.sh
> +++ b/tools/perf/tests/shell/stat_all_metrics.sh
> @@ -11,7 +11,7 @@ for m in $(perf list --raw-dump metrics); do
>  continue
>fi
># Failed so try system wide.
> -  result=$(perf stat -M "$m" -a true 2>&1)
> +  result=$(perf stat -M "$m" -a sleep 0.01 2>&1)
>if [[ "$result" =~ "${m:0:50}" ]]
>then
>  continue
> --
> 2.39.1
>


Re: [PATCH] powerpc/perf: Add json metric events to present CPI stall cycles in powerpc

2023-02-16 Thread Ian Rogers
On Wed, Feb 15, 2023 at 10:12 PM Athira Rajeev
 wrote:
>
> Power10 Performance Monitoring Unit (PMU) provides events
> to understand stall cycles of different pipeline stages.
> These events along with completed instructions provides
> useful metrics for application tuning.
>
> Patch implements the json changes to collect counter statistics
> to present the high level CPI stall breakdown metrics. New metric
> group is named as "CPI_STALL_RATIO" and this new metric group
> presents these stall metrics:
> - DISPATCHED_CPI ( Dispatch stall cycles per insn )
> - ISSUE_STALL_CPI ( Issue stall cycles per insn )
> - EXECUTION_STALL_CPI ( Execution stall cycles per insn )
> - COMPLETION_STALL_CPI ( Completion stall cycles per insn )
>
> To avoid multiplexing of events, PM_RUN_INST_CMPL event has been
> modified to use PMC5(performance monitoring counter5) instead
> of PMC4. This change is needed, since completion stall event
> is using PMC4.
>
> Usage example:
>
>  ./perf stat --metric-no-group -M CPI_STALL_RATIO 
>
>  Performance counter stats for 'workload':
>
> 63,056,817,982  PM_CMPL_STALL# 0.28 
> COMPLETION_STALL_CPI
>  1,743,988,038,896  PM_ISSUE_STALL   # 7.73 
> ISSUE_STALL_CPI
>225,597,495,030  PM_RUN_INST_CMPL # 6.18 
> DISPATCHED_CPI
>   #37.48 
> EXECUTION_STALL_CPI
>  1,393,916,546,654  PM_DISP_STALL_CYC
>  8,455,376,836,463  PM_EXEC_STALL
>
> "--metric-no-group" is used for forcing PM_RUN_INST_CMPL to be scheduled
> in all group for more accuracy.
>
> Signed-off-by: Athira Rajeev 

Acked-by: Ian Rogers 

Thanks,
Ian
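
As a sanity check on the usage example, the four ratios can be recomputed from the raw counts. The sketch below copies the counter values from the quoted output and the numerators from the MetricExpr lines in the metrics.json hunks; the dictionary and function names are only illustrative.

```python
# Counter values copied from the usage example in the commit message.
counts = {
    "PM_CMPL_STALL": 63_056_817_982,
    "PM_ISSUE_STALL": 1_743_988_038_896,
    "PM_RUN_INST_CMPL": 225_597_495_030,
    "PM_DISP_STALL_CYC": 1_393_916_546_654,
    "PM_EXEC_STALL": 8_455_376_836_463,
}

# Numerator of each MetricExpr; every metric divides by PM_RUN_INST_CMPL.
numerators = {
    "DISPATCHED_CPI": "PM_DISP_STALL_CYC",
    "ISSUE_STALL_CPI": "PM_ISSUE_STALL",
    "EXECUTION_STALL_CPI": "PM_EXEC_STALL",
    "COMPLETION_STALL_CPI": "PM_CMPL_STALL",
}


def cpi(metric):
    """Stall cycles per completed instruction for one metric."""
    return counts[numerators[metric]] / counts["PM_RUN_INST_CMPL"]


for metric in numerators:
    print(f"{metric:22s} {cpi(metric):6.2f}")
```

To two decimal places this gives 6.18, 7.73, 37.48 and 0.28, matching the values in the quoted perf stat output.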

> ---
>  tools/perf/pmu-events/arch/powerpc/power10/metrics.json | 8 
>  tools/perf/pmu-events/arch/powerpc/power10/others.json  | 2 +-
>  2 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/pmu-events/arch/powerpc/power10/metrics.json 
> b/tools/perf/pmu-events/arch/powerpc/power10/metrics.json
> index b57526fa44f2..6f53583a0c62 100644
> --- a/tools/perf/pmu-events/arch/powerpc/power10/metrics.json
> +++ b/tools/perf/pmu-events/arch/powerpc/power10/metrics.json
> @@ -15,7 +15,7 @@
>  {
>  "BriefDescription": "Average cycles per completed instruction when 
> dispatch was stalled for any reason",
>  "MetricExpr": "PM_DISP_STALL_CYC / PM_RUN_INST_CMPL",
> -"MetricGroup": "CPI",
> +"MetricGroup": "CPI;CPI_STALL_RATIO",
>  "MetricName": "DISPATCHED_CPI"
>  },
>  {
> @@ -147,13 +147,13 @@
>  {
>  "BriefDescription": "Average cycles per completed instruction when 
> the NTC instruction has been dispatched but not issued for any reason",
>  "MetricExpr": "PM_ISSUE_STALL / PM_RUN_INST_CMPL",
> -"MetricGroup": "CPI",
> +"MetricGroup": "CPI;CPI_STALL_RATIO",
>  "MetricName": "ISSUE_STALL_CPI"
>  },
>  {
>  "BriefDescription": "Average cycles per completed instruction when 
> the NTC instruction is waiting to be finished in one of the execution units",
>  "MetricExpr": "PM_EXEC_STALL / PM_RUN_INST_CMPL",
> -"MetricGroup": "CPI",
> +"MetricGroup": "CPI;CPI_STALL_RATIO",
>  "MetricName": "EXECUTION_STALL_CPI"
>  },
>  {
> @@ -309,7 +309,7 @@
>  {
>  "BriefDescription": "Average cycles per completed instruction when 
> the NTC instruction cannot complete because the thread was blocked",
>  "MetricExpr": "PM_CMPL_STALL / PM_RUN_INST_CMPL",
> -"MetricGroup": "CPI",
> +"MetricGroup": "CPI;CPI_STALL_RATIO",
>  "MetricName": "COMPLETION_STALL_CPI"
>  },
>  {
> diff --git a/tools/perf/pmu-events/arch/powerpc/power10/others.json 
> b/tools/perf/pmu-events/arch/powerpc/power10/others.json
> index 7d0de1a2860b..a771e4b6bec5 100644
> --- a/tools/perf/pmu-events/arch/powerpc/power10/others.json
> +++ b/tools/perf/pmu-events/arch/powerpc/power10/others.json
> @@ -265,7 +265,7 @@
>  "BriefDescription": "Load Missed L1, counted at finish time."
>},
>{
> -"EventCode": "0x400FA",
> +"EventCode": "0x500FA",
>  "EventName": "PM_RUN_INST_CMPL",
>  "BriefDescription": "Completed PowerPC instructions gated by the run 
> latch."
>}
> --
> 2.31.1
>


Re: perf tools power9 JSON files build breakage on ubuntu 18.04 cross build

2023-03-23 Thread Ian Rogers
On Thu, Mar 23, 2023 at 6:11 AM Arnaldo Carvalho de Melo
 wrote:
>
> Exception processing pmu-events/arch/powerpc/power9/other.json
> Traceback (most recent call last):
>   File "pmu-events/jevents.py", line 997, in 
> main()
>   File "pmu-events/jevents.py", line 979, in main
> ftw(arch_path, [], preprocess_one_file)
>   File "pmu-events/jevents.py", line 935, in ftw
> ftw(item.path, parents + [item.name], action)
>   File "pmu-events/jevents.py", line 933, in ftw
> action(parents, item)
>   File "pmu-events/jevents.py", line 514, in preprocess_one_file
> for event in read_json_events(item.path, topic):
>   File "pmu-events/jevents.py", line 388, in read_json_events
> events = json.load(open(path), object_hook=JsonEvent)
>   File "/usr/lib/python3.6/json/__init__.py", line 296, in load
> return loads(fp.read(),
>   File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
> return codecs.ascii_decode(input, self.errors)[0]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 55090: 
> ordinal not in range(128)
>   CC  /tmp/build/perf/tests/expr.o
> pmu-events/Build:35: recipe for target 
> '/tmp/build/perf/pmu-events/pmu-events.c' failed
> make[3]: *** [/tmp/build/perf/pmu-events/pmu-events.c] Error 1
> make[3]: *** Deleting file '/tmp/build/perf/pmu-events/pmu-events.c'
> Makefile.perf:679: recipe for target 
> '/tmp/build/perf/pmu-events/pmu-events-in.o' failed
> make[2]: *** [/tmp/build/perf/pmu-events/pmu-events-in.o] Error 2
> make[2]: *** Waiting for unfinished jobs
>
>
> Now jevents is an opt-out feature so I'm noticing these problems.
>
> A similar fix for s390 was accepted today:

The JEVENTS_ARCH=all make option builds the s390 files even on x86.
I'm confused as to why that's been working before these fixes.

Thanks,
Ian

> https://lore.kernel.org/r/20230323122532.2305847-1-tmri...@linux.ibm.com
> https://lore.kernel.org/r/ZBwkl77/I31AQk12@osiris
> --
>
> - Arnaldo
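
The traceback shows json.load(open(path)) decoding with the ascii codec, which is what Python 3.6 picks when the locale does not specify UTF-8. A small sketch for locating the offending bytes, and for reading a file with an explicit encoding, might look like the following. Both helper names are hypothetical, and passing encoding="utf-8" is just one possible approach, not a description of what jevents.py or the fixes linked above do.

```python
import json
from pathlib import Path


def find_non_ascii(path):
    """Return (byte offset, byte value) pairs for every non-ASCII byte."""
    data = Path(path).read_bytes()
    return [(offset, byte) for offset, byte in enumerate(data) if byte >= 0x80]


def load_events(path):
    """Load a JSON file as UTF-8 regardless of the locale's default encoding."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Running find_non_ascii() over pmu-events/arch/powerpc/power9/other.json should point at the same position the UnicodeDecodeError reports, assuming the file is unchanged.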


Re: [PATCH] perf vendor events power9: Remove UTF-8 characters from json files

2023-03-28 Thread Ian Rogers
On Tue, Mar 28, 2023 at 4:30 AM Kajol Jain  wrote:
>
> Commit 3c22ba524304 ("perf vendor events powerpc: Update POWER9 events")
> added and updated power9 pmu json events. However some of the json
> events which are part of other.json and pipeline.json files,
> contains UTF-8 characters in their brief description.
> Having UTF-8 character could brakes the perf build on some distros.

nit: s/brakes/break/

> Fix this issue by removing the UTF-8 characters from other.json and
> pipeline.json files.
>
> Result without the fix patch:
> [command]# file -i pmu-events/arch/powerpc/power9/*
> pmu-events/arch/powerpc/power9/cache.json:  application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/floating-point.json: application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/frontend.json:   application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/marked.json: application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/memory.json: application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/metrics.json:application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/nest_metrics.json:   application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/other.json:  application/json; 
> charset=utf-8
> pmu-events/arch/powerpc/power9/pipeline.json:   application/json; 
> charset=utf-8
> pmu-events/arch/powerpc/power9/pmc.json:application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/translation.json:application/json; 
> charset=us-ascii
>
> Result with the fix patch:
>
> [command]# file -i pmu-events/arch/powerpc/power9/*
> pmu-events/arch/powerpc/power9/cache.json:  application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/floating-point.json: application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/frontend.json:   application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/marked.json: application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/memory.json: application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/metrics.json:application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/nest_metrics.json:   application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/other.json:  application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/pipeline.json:   application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/pmc.json:application/json; 
> charset=us-ascii
> pmu-events/arch/powerpc/power9/translation.json:application/json; 
> charset=us-ascii
>
> Fixes: 3c22ba524304 ("perf vendor events powerpc: Update POWER9 events")
> Reported-by: Arnaldo Carvalho de Melo 
> Link: https://lore.kernel.org/lkml/zbxp77deq7ikt...@kernel.org/
> Signed-off-by: Kajol Jain 

Acked-by: Ian Rogers 

Thanks,
Ian

> ---
>  tools/perf/pmu-events/arch/powerpc/power9/other.json| 4 ++--
>  tools/perf/pmu-events/arch/powerpc/power9/pipeline.json | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/pmu-events/arch/powerpc/power9/other.json 
> b/tools/perf/pmu-events/arch/powerpc/power9/other.json
> index 3f69422c21f9..f10bd554521a 100644
> --- a/tools/perf/pmu-events/arch/powerpc/power9/other.json
> +++ b/tools/perf/pmu-events/arch/powerpc/power9/other.json
> @@ -1417,7 +1417,7 @@
>{
>  "EventCode": "0x45054",
>  "EventName": "PM_FMA_CMPL",
> -"BriefDescription": "two flops operation completed (fmadd, fnmadd, 
> fmsub, fnmsub) Scalar instructions only. "
> +"BriefDescription": "two flops operation completed (fmadd, fnmadd, 
> fmsub, fnmsub) Scalar instructions only."
>},
>{
>  "EventCode": "0x201E8",
> @@ -2017,7 +2017,7 @@
>{
>  "EventCode": "0xC0BC",
>  "EventName": "PM_LSU_FLUSH_OTHER",
> -"BriefDescription": "Other LSU flushes including: Sync (sync ack from L2 
> caused search of LRQ for oldest snooped load, This will either signal a 
> Precise Flush of the oldest snooped loa or a Flush Next PPC); Data Valid 
> Flush Next (several cases of this, one example is store and reload are lined 
> up such that a store-hit-reload scenario exists and the CDF has already 
> launched and has gotten bad/stale data); Bad Data Valid Flush Next (might be 
> a few cases of this, one example is a larxa (D$ hit) return data and dval but 
> can't allocate to LMQ (LMQ full or other reason). Already gave d

Re: [PATCH 1/2] tools/perf: Fix printing os->prefix in CSV metrics output

2022-11-03 Thread Ian Rogers
On Wed, Nov 2, 2022 at 1:36 AM Athira Rajeev
 wrote:
>
>
>
> > On 18-Oct-2022, at 2:26 PM, Athira Rajeev  
> > wrote:
> >
> > Perf stat with CSV output option prints an extra empty
> > string as first field in metrics output line.
> > Sample output below:
> >
> >   # ./perf stat -x, --per-socket -a -C 1 ls
> >   S0,1,1.78,msec,cpu-clock,1785146,100.00,0.973,CPUs utilized
> >   S0,1,26,,context-switches,1781750,100.00,0.015,M/sec
> >   S0,1,1,,cpu-migrations,1780526,100.00,0.561,K/sec
> >   S0,1,1,,page-faults,1779060,100.00,0.561,K/sec
> >   S0,1,875807,,cycles,1769826,100.00,0.491,GHz
> >   S0,1,85281,,stalled-cycles-frontend,1767512,100.00,9.74,frontend 
> > cycles idle
> >   S0,1,576839,,stalled-cycles-backend,1766260,100.00,65.86,backend 
> > cycles idle
> >   S0,1,288430,,instructions,1762246,100.00,0.33,insn per cycle
> > > ,S0,1,,,2.00,stalled cycles per insn
> >
> > The above command line uses field separator as ","
> > via "-x," option and per-socket option displays
> > socket value as first field. But here the last line
> > for "stalled cycles per insn" has "," in the
> > beginning.
> >
> > Sample output using interval mode:
> >   # ./perf stat -I 1000 -x, --per-socket -a -C 1 ls
> >   0.001813453,S0,1,1.87,msec,cpu-clock,1872052,100.00,0.002,CPUs 
> > utilized
> >   0.001813453,S0,1,2,,context-switches,1868028,100.00,1.070,K/sec
> >   --
> >   0.001813453,S0,1,85379,,instructions,1856754,100.00,0.32,insn per 
> > cycle
> > > 0.001813453,,S0,1,,,1.34,stalled cycles per insn
> >
> > Above result also has an extra csv separator after
> > the timestamp. Patch addresses extra field separator
> > in the beginning of the metric output line.
> >
> > The counter stats are displayed by function
> > "perf_stat__print_shadow_stats" in code
> > "util/stat-shadow.c". While printing the stats info
> > for "stalled cycles per insn", function "new_line_csv"
> > is used as new_line callback.
> >
> > The new_line_csv function has check for "os->prefix"
> > and if prefix is not null, it will be printed along
> > with csv separator.
> > Snippet from "new_line_csv":
> >   if (os->prefix)
> >   fprintf(os->fh, "%s%s", os->prefix, config->csv_sep);
> >
> > Here os->prefix gets printed followed by ","
> > which is the csv separator. The os->prefix is
> > used in interval mode option ( -I ), to print
> > time stamp on every new line. But prefix is
> > already set to contain csv separator when used
> > in interval mode for csv option.
> >
> > Reference: Function "static void print_interval"
> > Snippet:
> >   sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, 
> > config->csv_sep);
> >
> > Also if prefix is not assigned (if not used with
> > -I option), it gets set to empty string.
> > Reference: function printout() in util/stat-display.c
> > Snippet:
> >   .prefix = prefix ? prefix : "",
> >
> > Since prefix is already set to contain csv_sep in interval
> > option, patch removes printing config->csv_sep in
> > new_line_csv function to avoid printing extra field.
> >
> > After the patch:
> >
> >   # ./perf stat -x, --per-socket -a -C 1 ls
> >   S0,1,2.04,msec,cpu-clock,2045202,100.00,1.013,CPUs utilized
> >   S0,1,2,,context-switches,2041444,100.00,979.289,/sec
> >   S0,1,0,,cpu-migrations,2040820,100.00,0.000,/sec
> >   S0,1,2,,page-faults,2040288,100.00,979.289,/sec
> >   S0,1,254589,,cycles,2036066,100.00,0.125,GHz
> >   S0,1,82481,,stalled-cycles-frontend,2032420,100.00,32.40,frontend cycles idle
> >   S0,1,113170,,stalled-cycles-backend,2031722,100.00,44.45,backend cycles idle
> >   S0,1,88766,,instructions,2030942,100.00,0.35,insn per cycle
> >   S0,1,,,1.27,stalled cycles per insn
> >
> > Fixes: 92a61f6412d3 ("perf stat: Implement CSV metrics output")
> > Reported-by: Disha Goel 
> > Signed-off-by: Athira Rajeev 
>
> Hi All,
>
> Looking for review comments for this change.

Hi,

Thanks for addressing issues in this code. What is the status of the
CSV output test following these changes?

I think going forward we need to move away from CSV and columns, to
something with structure like JSON. We also need to refactor this
code; trying to add meaning to a newline character is a bad strategy
and creates some unnatural things. To some extent this overlaps with
Namhyung's aggregation cleanup. There are also weirdnesses in
jevents/pmu-events, like the same ScaleUnit applying to a metric and
an event - why are metrics even parts of events?

Given the current code is whack-a-mole, and this is another whack, if
the testing is okay I think we should move forward with this change.

Thanks,
Ian




> Thanks
> Athira
>
> > ---
> > tools/perf/util/stat-display.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> > index 5c47ee9963a7..879874a4bc07 100644
> > --- a/tools/perf/util/stat-dis

Re: [PATCH] perf test: Skip watchpoint tests if no watchpoints available

2022-11-22 Thread Ian Rogers
On Tue, Nov 22, 2022 at 11:19 AM Christophe Leroy
 wrote:
>
>
>
> Le 21/11/2022 à 11:27, Naveen N. Rao a écrit :
> > On IBM Power9, perf watchpoint tests fail since no hardware breakpoints
> > are available. Detect this by checking the error returned by
> > perf_event_open() and skip the tests in that case.
> >
> > Reported-by: Disha Goel 
> > Signed-off-by: Naveen N. Rao 
> > ---
> >   tools/perf/tests/wp.c | 12 +++-
> >   1 file changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/tools/perf/tests/wp.c b/tools/perf/tests/wp.c
> > index 56455da30341b4..cc8719609b19ea 100644
> > --- a/tools/perf/tests/wp.c
> > +++ b/tools/perf/tests/wp.c
> > @@ -59,8 +59,10 @@ static int __event(int wp_type, void *wp_addr, unsigned 
> > long wp_len)
> >   get__perf_event_attr(&attr, wp_type, wp_addr, wp_len);
> >   fd = sys_perf_event_open(&attr, 0, -1, -1,
> >perf_event_open_cloexec_flag());
> > - if (fd < 0)
> > + if (fd < 0) {
> > + fd = -errno;
> >   pr_debug("failed opening event %x\n", attr.bp_type);
> > + }
>
> Do you really need that ?
>
> Can't you directly check errno in the caller ?

errno is very easily clobbered and not clearly set on success - ie,
it'd be better not to do that.

Acked-by: Ian Rogers 

Thanks,
Ian

> >
> >   return fd;
> >   }
> > @@ -77,7 +79,7 @@ static int test__wp_ro(struct test_suite *test 
> > __maybe_unused,
> >
> >   fd = __event(HW_BREAKPOINT_R, (void *)&data1, sizeof(data1));
> >   if (fd < 0)
> > - return -1;
> > + return fd == -ENODEV ? TEST_SKIP : -1;
> >
> >   tmp = data1;
> >   WP_TEST_ASSERT_VAL(fd, "RO watchpoint", 1);
> > @@ -101,7 +103,7 @@ static int test__wp_wo(struct test_suite *test 
> > __maybe_unused,
> >
> >   fd = __event(HW_BREAKPOINT_W, (void *)&data1, sizeof(data1));
> >   if (fd < 0)
> > - return -1;
> > + return fd == -ENODEV ? TEST_SKIP : -1;
> >
> >   tmp = data1;
> >   WP_TEST_ASSERT_VAL(fd, "WO watchpoint", 0);
> > @@ -126,7 +128,7 @@ static int test__wp_rw(struct test_suite *test 
> > __maybe_unused,
> >   fd = __event(HW_BREAKPOINT_R | HW_BREAKPOINT_W, (void *)&data1,
> >sizeof(data1));
> >   if (fd < 0)
> > - return -1;
> > + return fd == -ENODEV ? TEST_SKIP : -1;
> >
> >   tmp = data1;
> >   WP_TEST_ASSERT_VAL(fd, "RW watchpoint", 1);
> > @@ -150,7 +152,7 @@ static int test__wp_modify(struct test_suite *test 
> > __maybe_unused, int subtest _
> >
> >   fd = __event(HW_BREAKPOINT_W, (void *)&data1, sizeof(data1));
> >   if (fd < 0)
> > - return -1;
> > + return fd == -ENODEV ? TEST_SKIP : -1;
> >
> >   data1 = tmp;
> >   WP_TEST_ASSERT_VAL(fd, "Modify watchpoint", 1);
> >
> > base-commit: 63a3bf5e8d9e79ce456c8f73d4395a5a51d841b1


[PATCH v1 0/9] jevents/pmu-events improvements

2022-12-12 Thread Ian Rogers
Add an optimization to jevents using the metric code: rewrite metrics
in terms of each other in order to minimize size and improve
readability. For example, on Power8 other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / 
PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi 
- ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.

A limitation of the substitutions is that they depend on strict
equality and the shape of the tree. This means that for "a + b + c"
a substitution of "a + b" will succeed while "b + c" will fail
(the LHS for "+ c" is "a + b").
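
To illustrate the tree-shape limitation, here is a minimal sketch. It is
not the metric.py code: Ev, Add and substitute are hypothetical stand-ins
for Event, Operator and Substitute, and only '+' is modelled.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ev:
  """Leaf node: an event or metric name (hypothetical stand-in for Event)."""
  name: str


@dataclass(frozen=True)
class Add:
  """Binary '+' node (hypothetical stand-in for Operator)."""
  lhs: 'Node'
  rhs: 'Node'


Node = Union[Ev, Add]


def substitute(tree: Node, name: str, pattern: Node) -> Node:
  """Replace every subtree structurally equal to pattern with Ev(name)."""
  if tree == pattern:
    return Ev(name)
  if isinstance(tree, Add):
    return Add(substitute(tree.lhs, name, pattern),
               substitute(tree.rhs, name, pattern))
  return tree


a, b, c = Ev('a'), Ev('b'), Ev('c')
expr = Add(Add(a, b), c)  # "a + b + c" parses as "(a + b) + c"

ab_result = substitute(expr, 'm', Add(a, b))  # matches the LHS subtree
bc_result = substitute(expr, 'm', Add(b, c))  # no subtree has this shape
```

Here ab_result becomes "m + c", while bc_result is left unchanged because
"b + c" never appears as a node of the left-associated tree.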

Separate out the events and metrics in the pmu-events tables, saving
14.8% in the table size while ensuring that metrics no longer need to
iterate over all events and vice versa. These changes remove evsel's
direct metric support as the pmu_event no longer has a metric to
populate it. This is a minor issue as the code wasn't working
properly; metrics for this are rare and can still be properly run
using '-M'.

Add the ability to build only certain models into the code. This
functionality is appropriate for operating systems like ChromeOS, which
aim to minimize binary size and know all the target CPU models.

Ian Rogers (9):
  perf jevents metric: Correct Function equality
  perf jevents metric: Add ability to rewrite metrics in terms of others
  perf jevents: Rewrite metrics in the same file with each other
  perf pmu-events: Separate metric out of pmu_event
  perf stat: Remove evsel metric_name/expr
  perf jevents: Combine table prefix and suffix writing
  perf pmu-events: Introduce pmu_metrics_table
  perf jevents: Generate metrics and events as separate tables
  perf jevents: Add model list option

 tools/perf/arch/arm64/util/pmu.c |  23 +-
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/builtin-list.c|  20 +-
 tools/perf/builtin-stat.c|   1 -
 tools/perf/pmu-events/Build  |   3 +-
 tools/perf/pmu-events/empty-pmu-events.c | 111 ++-
 tools/perf/pmu-events/jevents.py | 353 ++-
 tools/perf/pmu-events/metric.py  |  75 -
 tools/perf/pmu-events/metric_test.py |  10 +
 tools/perf/pmu-events/pmu-events.h   |  26 +-
 tools/perf/tests/expand-cgroup.c |   4 +-
 tools/perf/tests/parse-metric.c  |   4 +-
 tools/perf/tests/pmu-events.c|  68 ++---
 tools/perf/util/cgroup.c |   1 -
 tools/perf/util/evsel.c  |   2 -
 tools/perf/util/evsel.h  |   2 -
 tools/perf/util/metricgroup.c| 203 +++--
 tools/perf/util/metricgroup.h|   4 +-
 tools/perf/util/parse-events.c   |   2 -
 tools/perf/util/pmu.c|  44 +--
 tools/perf/util/pmu.h|  10 +-
 tools/perf/util/print-events.c   |  32 +-
 tools/perf/util/print-events.h   |   3 +-
 tools/perf/util/python.c |   7 -
 tools/perf/util/stat-shadow.c| 112 ---
 tools/perf/util/stat.h   |   1 -
 26 files changed, 663 insertions(+), 462 deletions(-)

-- 
2.39.0.rc1.256.g54fd8350bd-goog



[PATCH v1 1/9] perf jevents metric: Correct Function equality

2022-12-12 Thread Ian Rogers
rhs may not be defined, say for source_count, so add a guard.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/metric.py | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index cc451a265751..1fa3478b9ab0 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -261,8 +261,10 @@ class Function(Expression):
 
   def Equals(self, other: Expression) -> bool:
 if isinstance(other, Function):
-  return self.fn == other.fn and self.lhs.Equals(
-  other.lhs) and self.rhs.Equals(other.rhs)
+  result = self.fn == other.fn and self.lhs.Equals(other.lhs)
+  if self.rhs:
+result = result and self.rhs.Equals(other.rhs)
+  return result
 return False
 
 
-- 
2.39.0.rc1.256.g54fd8350bd-goog
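
For readers following along, the failure the guard in 1/9 avoids can be
reproduced with a small stand-alone sketch. Leaf and Func below are
hypothetical simplifications, not the classes in metric.py; the assumption
is only that a one-argument function such as source_count leaves rhs as
None.

```python
from typing import Optional


class Leaf:
  """Hypothetical leaf node compared by name."""

  def __init__(self, name: str):
    self.name = name

  def equals(self, other) -> bool:
    return isinstance(other, Leaf) and self.name == other.name


class Func:
  """Hypothetical function node; rhs is None for one-argument functions."""

  def __init__(self, fn: str, lhs: Leaf, rhs: Optional[Leaf] = None):
    self.fn = fn
    self.lhs = lhs
    self.rhs = rhs

  def equals(self, other) -> bool:
    if not isinstance(other, Func):
      return False
    result = self.fn == other.fn and self.lhs.equals(other.lhs)
    # Guard: without this check, self.rhs.equals() would raise
    # AttributeError when rhs is None.
    if self.rhs:
      result = result and self.rhs.equals(other.rhs)
    return result


same_unary = Func('source_count', Leaf('x')).equals(
    Func('source_count', Leaf('x')))
diff_binary = Func('min', Leaf('a'), Leaf('b')).equals(
    Func('min', Leaf('a'), Leaf('c')))
```

As in the patch, the guard relies on functions with the same name having
the same number of arguments.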



[PATCH v1 2/9] perf jevents metric: Add ability to rewrite metrics in terms of others

2022-12-12 Thread Ian Rogers
Add RewriteMetricsInTermsOfOthers that iterates over pairs of names
and expressions trying to replace an expression, within the current
expression, with its name.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/metric.py  | 69 +++-
 tools/perf/pmu-events/metric_test.py | 10 
 2 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 1fa3478b9ab0..8e4deb8e95e2 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -4,7 +4,7 @@ import ast
 import decimal
 import json
 import re
-from typing import Dict, List, Optional, Set, Union
+from typing import Dict, List, Optional, Set, Tuple, Union
 
 
 class Expression:
@@ -26,6 +26,9 @@ class Expression:
 """Returns true when two expressions are the same."""
 raise NotImplementedError()
 
+  def Substitute(self, name: str, expression: 'Expression') -> 'Expression':
+raise NotImplementedError()
+
   def __str__(self) -> str:
 return self.ToPerfJson()
 
@@ -186,6 +189,15 @@ class Operator(Expression):
   other.lhs) and self.rhs.Equals(other.rhs)
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+lhs = self.lhs.Substitute(name, expression)
+rhs = None
+if self.rhs:
+  rhs = self.rhs.Substitute(name, expression)
+return Operator(self.operator, lhs, rhs)
+
 
 class Select(Expression):
   """Represents a select ternary in the parse tree."""
@@ -225,6 +237,14 @@ class Select(Expression):
   other.false_val) and self.true_val.Equals(other.true_val)
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+true_val = self.true_val.Substitute(name, expression)
+cond = self.cond.Substitute(name, expression)
+false_val = self.false_val.Substitute(name, expression)
+return Select(true_val, cond, false_val)
+
 
 class Function(Expression):
   """A function in an expression like min, max, d_ratio."""
@@ -267,6 +287,15 @@ class Function(Expression):
   return result
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+lhs = self.lhs.Substitute(name, expression)
+rhs = None
+if self.rhs:
+  rhs = self.rhs.Substitute(name, expression)
+return Function(self.fn, lhs, rhs)
+
 
 def _FixEscapes(s: str) -> str:
   s = re.sub(r'([^\\]),', r'\1\\,', s)
@@ -293,6 +322,9 @@ class Event(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Event) and self.name == other.name
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 class Constant(Expression):
   """A constant within the expression tree."""
@@ -317,6 +349,9 @@ class Constant(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Constant) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 class Literal(Expression):
   """A runtime literal within the expression tree."""
@@ -336,6 +371,9 @@ class Literal(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Literal) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 def min(lhs: Union[int, float, Expression], rhs: Union[int, float,
Expression]) -> 
Function:
@@ -461,9 +499,11 @@ class MetricGroup:
 
 
 class _RewriteIfExpToSelect(ast.NodeTransformer):
+  """Transformer to convert if-else nodes to Select expressions."""
 
   def visit_IfExp(self, node):
 # pylint: disable=invalid-name
+self.generic_visit(node)
 call = ast.Call(
 func=ast.Name(id='Select', ctx=ast.Load()),
 args=[node.body, node.test, node.orelse],
@@ -501,3 +541,30 @@ def ParsePerfJson(orig: str) -> Expression:
   _RewriteIfExpToSelect().visit(parsed)
   parsed = ast.fix_missing_locations(parsed)
   return _Constify(eval(compile(parsed, orig, 'eval')))
+
+
+def RewriteMetricsInTermsOfOthers(metrics: list[Tuple[str, Expression]]
+  )-> Dict[str, Expression]:
+  """Shorten metrics by rewriting in terms of others.
+
+  Args:
+metrics (list): pairs of metric names and their expressions.
+  Returns:
+Dict: mapping from a metric name to a shortened ex

[PATCH v1 3/9] perf jevents: Rewrite metrics in the same file with each other

2022-12-12 Thread Ian Rogers
Rewrite metrics within the same file in terms of each other. For example, on
Power8 other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / 
PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi 
- ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.

To avoid recomputation, decorate the function with a cache.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 4c398e0eeb2f..229402565425 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -3,6 +3,7 @@
 """Convert directories of JSON events to C code."""
 import argparse
 import csv
+from functools import lru_cache
 import json
 import metric
 import os
@@ -337,18 +338,28 @@ class JsonEvent:
 s = self.build_c_string()
 return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
-
+@lru_cache(maxsize=None)
 def read_json_events(path: str, topic: str) -> Sequence[JsonEvent]:
   """Read json events from the specified file."""
-
   try:
-result = json.load(open(path), object_hook=JsonEvent)
+events = json.load(open(path), object_hook=JsonEvent)
   except BaseException as err:
 print(f"Exception processing {path}")
 raise
-  for event in result:
+  metrics: list[Tuple[str, metric.Expression]] = []
+  for event in events:
 event.topic = topic
-  return result
+if event.metric_name and '-' not in event.metric_name:
+  metrics.append((event.metric_name, event.metric_expr))
+  updates = metric.RewriteMetricsInTermsOfOthers(metrics)
+  if updates:
+for event in events:
+  if event.metric_name in updates:
+# print(f'Updated {event.metric_name} from\n"{event.metric_expr}"\n'
+#   f'to\n"{updates[event.metric_name]}"')
+event.metric_expr = updates[event.metric_name]
+
+  return events
 
 def preprocess_arch_std_files(archpath: str) -> None:
   """Read in all architecture standard events."""
-- 
2.39.0.rc1.256.g54fd8350bd-goog
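
As a rough illustration of the caching in 3/9, the sketch below shows how
functools.lru_cache keyed on (path, topic) avoids parsing the same file
twice. read_events and the temporary file are hypothetical; this is not
the jevents.py function.

```python
from functools import lru_cache
import json
import os
import tempfile


@lru_cache(maxsize=None)
def read_events(path: str, topic: str):
  """Parse a JSON file once per (path, topic) pair (hypothetical helper)."""
  with open(path) as f:
    events = json.load(f)
  for event in events:
    event['topic'] = topic
  return events


with tempfile.TemporaryDirectory() as tmpdir:
  path = os.path.join(tmpdir, 'cache.json')
  with open(path, 'w') as f:
    json.dump([{'EventName': 'l3_cache_rd'}], f)
  first = read_events(path, 'cache')   # miss: file is parsed
  second = read_events(path, 'cache')  # hit: cached list is returned

info = read_events.cache_info()
```

One thing to keep in mind with this pattern is that every caller gets the
same list object back, so a caller that mutates the result changes what
later callers see.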



[PATCH v1 4/9] perf pmu-events: Separate metric out of pmu_event

2022-12-12 Thread Ian Rogers
rnel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/stat-shadow.c?id=01b8957b738f42f96a130079bc951b3cc78c5b8a#n425

Signed-off-by: Ian Rogers 
---
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/builtin-list.c|  20 +--
 tools/perf/pmu-events/empty-pmu-events.c |  73 --
 tools/perf/pmu-events/jevents.py |  82 +++-
 tools/perf/pmu-events/pmu-events.h   |  20 ++-
 tools/perf/tests/pmu-events.c|  62 +++--
 tools/perf/util/metricgroup.c| 161 +++
 tools/perf/util/metricgroup.h|   2 +-
 tools/perf/util/parse-events.c   |   2 -
 tools/perf/util/pmu.c|  35 +
 tools/perf/util/pmu.h|   9 --
 tools/perf/util/print-events.c   |  32 ++---
 tools/perf/util/print-events.h   |   3 +-
 13 files changed, 266 insertions(+), 239 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c 
b/tools/perf/arch/powerpc/util/header.c
index e8fe36b10d20..78eef77d8a8d 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -40,11 +40,11 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
return bufp;
 }
 
-int arch_get_runtimeparam(const struct pmu_event *pe)
+int arch_get_runtimeparam(const struct pmu_metric *pm)
 {
int count;
char path[PATH_MAX] = "/devices/hv_24x7/interface/";
 
-   atoi(pe->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, 
"coresperchip");
+   atoi(pm->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, 
"coresperchip");
return sysfs__read_int(path, &count) < 0 ? 1 : count;
 }
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 137d73edb541..791f513ae5b4 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -99,8 +99,7 @@ static void default_print_event(void *ps, const char 
*pmu_name, const char *topi
const char *scale_unit __maybe_unused,
bool deprecated, const char *event_type_desc,
const char *desc, const char *long_desc,
-   const char *encoding_desc,
-   const char *metric_name, const char 
*metric_expr)
+   const char *encoding_desc)
 {
struct print_state *print_state = ps;
int pos;
@@ -159,10 +158,6 @@ static void default_print_event(void *ps, const char 
*pmu_name, const char *topi
if (print_state->detailed && encoding_desc) {
printf("%*s", 8, "");
wordwrap(encoding_desc, 8, pager_get_columns(), 0);
-   if (metric_name)
-   printf(" MetricName: %s", metric_name);
-   if (metric_expr)
-   printf(" MetricExpr: %s", metric_expr);
putchar('\n');
}
 }
@@ -308,8 +303,7 @@ static void json_print_event(void *ps, const char 
*pmu_name, const char *topic,
 const char *scale_unit,
 bool deprecated, const char *event_type_desc,
 const char *desc, const char *long_desc,
-const char *encoding_desc,
-const char *metric_name, const char *metric_expr)
+const char *encoding_desc)
 {
struct json_print_state *print_state = ps;
bool need_sep = false;
@@ -366,16 +360,6 @@ static void json_print_event(void *ps, const char 
*pmu_name, const char *topic,
  encoding_desc);
need_sep = true;
}
-   if (metric_name) {
-   fix_escape_printf(&buf, "%s\t\"MetricName\": \"%S\"", need_sep 
? ",\n" : "",
- metric_name);
-   need_sep = true;
-   }
-   if (metric_expr) {
-   fix_escape_printf(&buf, "%s\t\"MetricExpr\": \"%S\"", need_sep 
? ",\n" : "",
- metric_expr);
-   need_sep = true;
-   }
printf("%s}", need_sep ? "\n" : "");
strbuf_release(&buf);
 }
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 480e8f0d30c8..5572a4d1eddb 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-static const struct pmu_event pme_test_soc_cpu[] = {
+static const struct pmu_event pmu_events__test_soc_cpu[] = {
{
.name = "l3_cache_rd",
.event = "event=0x40",
@@ -105,

[PATCH v1 5/9] perf stat: Remove evsel metric_name/expr

2022-12-12 Thread Ian Rogers
Metrics are their own unit. These variables previously held broken
metrics and now just hold the value NULL. Remove code that used
these variables.

Signed-off-by: Ian Rogers 
---
 tools/perf/builtin-stat.c |   1 -
 tools/perf/util/cgroup.c  |   1 -
 tools/perf/util/evsel.c   |   2 -
 tools/perf/util/evsel.h   |   2 -
 tools/perf/util/python.c  |   7 ---
 tools/perf/util/stat-shadow.c | 112 --
 tools/perf/util/stat.h|   1 -
 7 files changed, 126 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index d040fbcdcc5a..ac83b86cb247 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2529,7 +2529,6 @@ int cmd_stat(int argc, const char **argv)
&stat_config.metric_events);
zfree(&metrics);
}
-   perf_stat__collect_metric_expr(evsel_list);
perf_stat__init_shadow_stats();
 
if (add_default_attributes())
diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index e99b41f9be45..dc2db0ff7ab4 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -468,7 +468,6 @@ int evlist__expand_cgroup(struct evlist *evlist, const char 
*str,
nr_cgroups++;
 
if (metric_events) {
-   perf_stat__collect_metric_expr(tmp_list);
if (metricgroup__copy_metric_events(tmp_list, cgrp,
metric_events,

&orig_metric_events) < 0)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 77b2cf5a214e..49460a224b0d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -285,8 +285,6 @@ void evsel__init(struct evsel *evsel,
evsel->sample_size = __evsel__sample_size(attr->sample_type);
evsel__calc_id_pos(evsel);
evsel->cmdline_group_boundary = false;
-   evsel->metric_expr   = NULL;
-   evsel->metric_name   = NULL;
evsel->metric_events = NULL;
evsel->per_pkg_mask  = NULL;
evsel->collect_stat  = false;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d572be41b960..24cb807ef6ce 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -105,8 +105,6 @@ struct evsel {
 * metric fields are similar, but needs more care as they can have
 * references to other metric (evsel).
 */
-   const char *metric_expr;
-   const char *metric_name;
struct evsel**metric_events;
struct evsel*metric_leader;
 
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 7320f7f777fe..417db74fa04e 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -75,13 +75,6 @@ const char *perf_env__arch(struct perf_env *env 
__maybe_unused)
return NULL;
 }
 
-/*
- * Add this one here not to drag util/stat-shadow.c
- */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-}
-
 /*
  * This one is needed not to drag the PMU bandwagon, jevents generated
  * pmu_sys_event_tables, etc and evsel__find_pmu() is used so far just for
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 9bde9224a97c..35ea4813f468 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -346,114 +346,6 @@ static const char *get_ratio_color(enum grc_type type, 
double ratio)
return color;
 }
 
-static struct evsel *perf_stat__find_event(struct evlist *evsel_list,
-   const char *name)
-{
-   struct evsel *c2;
-
-   evlist__for_each_entry (evsel_list, c2) {
-   if (!strcasecmp(c2->name, name) && !c2->collect_stat)
-   return c2;
-   }
-   return NULL;
-}
-
-/* Mark MetricExpr target events and link events using them to them. */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-   struct evsel *counter, *leader, **metric_events, *oc;
-   bool found;
-   struct expr_parse_ctx *ctx;
-   struct hashmap_entry *cur;
-   size_t bkt;
-   int i;
-
-   ctx = expr__ctx_new();
-   if (!ctx) {
-   pr_debug("expr__ctx_new failed");
-   return;
-   }
-   evlist__for_each_entry(evsel_list, counter) {
-   bool invalid = false;
-
-   leader = evsel__leader(counter);
-   if (!counter->metric_expr)
-   continue;
-
-   expr__ctx_clear(ctx);
-   metric_events = counter->metric_events;
-   if (!metric_events) {
-   if (expr__find_ids(counter->metric_expr,
-  counter->name,
- 

[PATCH v1 6/9] perf jevents: Combine table prefix and suffix writing

2022-12-12 Thread Ian Rogers
Combine into a single function to simplify, in a later change, writing
metrics separately.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 36 +---
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index ee3d4cdf01be..7b9714b25d0a 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -19,10 +19,10 @@ _sys_event_tables = []
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
-# Track whether an events table is currently being defined and needs closing.
-_close_table = False
 # Events to write out when the table is closed
 _pending_events = []
+# Name of table to be written out
+_pending_events_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
@@ -376,24 +376,13 @@ def preprocess_arch_std_files(archpath: str) -> None:
   _arch_std_events[event.name.lower()] = event
 
 
-def print_events_table_prefix(tblname: str) -> None:
-  """Called when a new events table is started."""
-  global _close_table
-  if _close_table:
-raise IOError('Printing table prefix but last table has no suffix')
-  _args.output_file.write(f'static const struct compact_pmu_event {tblname}[] 
= {{\n')
-  _close_table = True
-
-
 def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
   """Add contents of file to _pending_events table."""
-  if not _close_table:
-raise IOError('Table entries missing prefix')
   for e in read_json_events(item.path, topic):
 _pending_events.append(e)
 
 
-def print_events_table_suffix() -> None:
+def print_pending_events() -> None:
   """Optionally close events table."""
 
   def event_cmp_key(j: JsonEvent) -> Tuple[bool, str, str, str, str]:
@@ -405,17 +394,19 @@ def print_events_table_suffix() -> None:
 return (j.desc is not None, fix_none(j.topic), fix_none(j.name), 
fix_none(j.pmu),
 fix_none(j.metric_name))
 
-  global _close_table
-  if not _close_table:
+  global _pending_events
+  if not _pending_events:
 return
 
-  global _pending_events
+  global _pending_events_tblname
+  _args.output_file.write(
+  f'static const struct compact_pmu_event {_pending_events_tblname}[] = 
{{\n')
+
   for event in sorted(_pending_events, key=event_cmp_key):
 _args.output_file.write(event.to_c_string())
-_pending_events = []
+  _pending_events = []
 
   _args.output_file.write('};\n\n')
-  _close_table = False
 
 def get_topic(topic: str) -> str:
   if topic.endswith('metrics.json'):
@@ -453,12 +444,13 @@ def process_one_file(parents: Sequence[str], item: 
os.DirEntry) -> None:
 
   # model directory, reset topic
   if item.is_dir() and is_leaf_dir(item.path):
-print_events_table_suffix()
+print_pending_events()
 
 tblname = file_name_to_table_name(parents, item.name)
 if item.name == 'sys':
   _sys_event_tables.append(tblname)
-print_events_table_prefix(tblname)
+global _pending_events_tblname
+_pending_events_tblname = tblname
 return
 
   # base dir or too deep
@@ -802,7 +794,7 @@ struct compact_pmu_event {
   for arch in archs:
 arch_path = f'{_args.starting_dir}/{arch}'
 ftw(arch_path, [], process_one_file)
-print_events_table_suffix()
+print_pending_events()
 
   print_mapping_table(archs)
   print_system_mapping_table()
-- 
2.39.0.rc1.256.g54fd8350bd-goog
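
The shape of the change in 6/9 can be sketched as a buffer that remembers
the pending table name and writes prefix, entries and suffix in one step.
TableWriter below is a hypothetical simplification that works on plain
strings; it is not the jevents.py code, which uses module-level globals
and JsonEvent objects.

```python
import io
from typing import List, Optional


class TableWriter:
  """Buffers entries and writes a whole C table at once (hypothetical)."""

  def __init__(self, out):
    self.out = out
    self.pending: List[str] = []
    self.tblname: Optional[str] = None

  def start(self, tblname: str) -> None:
    """Write out any previous table, then remember the new table's name."""
    self.flush()
    self.tblname = tblname

  def add(self, entry: str) -> None:
    self.pending.append(entry)

  def flush(self) -> None:
    """Write prefix, sorted entries and suffix; do nothing if empty."""
    if not self.pending:
      return
    self.out.write(
        f'static const struct compact_pmu_event {self.tblname}[] = {{\n')
    for entry in sorted(self.pending):
      self.out.write(f'{{ {entry} }},\n')
    self.pending = []
    self.out.write('};\n\n')


out = io.StringIO()
writer = TableWriter(out)
writer.start('pme_a')
writer.add('2')
writer.add('1')
writer.start('pme_empty')  # receives no entries
writer.start('pme_b')
writer.add('3')
writer.flush()
text = out.getvalue()
```

A side effect of deciding on the entries rather than on a separate
"table open" flag is that a table that never receives entries is not
written at all, which is the case for pme_empty above.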



[PATCH v1 7/9] perf pmu-events: Introduce pmu_metrics_table

2022-12-12 Thread Ian Rogers
Add a metrics table that is just a cast from pmu_events_table. This
changes the APIs so that event and metric usage of the underlying
table is different. Later changes will separate the tables.

This introduction fixes a NO_JEVENTS=1 regression on:
 68: Parse and process metrics   : Ok
 70: Event expansion for cgroups : Ok
caused by the necessary test metrics not being found.

Signed-off-by: Ian Rogers 
---
 tools/perf/arch/arm64/util/pmu.c | 23 ++-
 tools/perf/pmu-events/empty-pmu-events.c | 52 
 tools/perf/pmu-events/jevents.py | 24 ---
 tools/perf/pmu-events/pmu-events.h   | 10 +++--
 tools/perf/tests/expand-cgroup.c |  4 +-
 tools/perf/tests/parse-metric.c  |  4 +-
 tools/perf/tests/pmu-events.c|  5 ++-
 tools/perf/util/metricgroup.c| 50 +++
 tools/perf/util/metricgroup.h|  2 +-
 tools/perf/util/pmu.c|  9 +++-
 tools/perf/util/pmu.h|  1 +
 11 files changed, 133 insertions(+), 51 deletions(-)

diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
index 477e513972a4..f8ae479a06db 100644
--- a/tools/perf/arch/arm64/util/pmu.c
+++ b/tools/perf/arch/arm64/util/pmu.c
@@ -19,7 +19,28 @@ const struct pmu_events_table *pmu_events_table__find(void)
if (pmu->cpus->nr != cpu__max_cpu().cpu)
return NULL;
 
-   return perf_pmu__find_table(pmu);
+   return perf_pmu__find_events_table(pmu);
+   }
+
+   return NULL;
+}
+
+const struct pmu_metrics_table *pmu_metrics_table__find(void)
+{
+   struct perf_pmu *pmu = NULL;
+
+   while ((pmu = perf_pmu__scan(pmu))) {
+   if (!is_pmu_core(pmu->name))
+   continue;
+
+   /*
+* The cpumap should cover all CPUs. Otherwise, some CPUs may
+* not support some events or have different event IDs.
+*/
+   if (pmu->cpus->nr != cpu__max_cpu().cpu)
+   return NULL;
+
+   return perf_pmu__find_metrics_table(pmu);
}
 
return NULL;
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 5572a4d1eddb..d50f60a571dd 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -278,14 +278,12 @@ int pmu_events_table_for_each_event(const struct 
pmu_events_table *table, pmu_ev
return 0;
 }
 
-int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, 
pmu_metric_iter_fn fn,
-void *data)
+int pmu_metrics_table_for_each_metric(const struct pmu_metrics_table *table, 
pmu_metric_iter_fn fn,
+ void *data)
 {
-   struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
-
for (const struct pmu_metric *pm = &table->entries[0]; pm->metric_group 
|| pm->metric_name;
 pm++) {
-   int ret = fn(pm, etable, data);
+   int ret = fn(pm, table, data);
 
if (ret)
return ret;
@@ -293,7 +291,7 @@ int pmu_events_table_for_each_metric(const struct 
pmu_events_table *etable, pmu_
return 0;
 }
 
-const struct pmu_events_table *perf_pmu__find_table(struct perf_pmu *pmu)
+const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu 
*pmu)
 {
const struct pmu_events_table *table = NULL;
char *cpuid = perf_pmu__getcpuid(pmu);
@@ -321,6 +319,34 @@ const struct pmu_events_table *perf_pmu__find_table(struct 
perf_pmu *pmu)
return table;
 }
 
+const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
+{
+   const struct pmu_metrics_table *table = NULL;
+   char *cpuid = perf_pmu__getcpuid(pmu);
+   int i;
+
+   /* on some platforms which uses cpus map, cpuid can be NULL for
+* PMUs other than CORE PMUs.
+*/
+   if (!cpuid)
+   return NULL;
+
+   i = 0;
+   for (;;) {
+   const struct pmu_events_map *map = &pmu_events_map[i++];
+
+   if (!map->cpuid)
+   break;
+
+   if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
+   table = &map->metric_table;
+   break;
+   }
+   }
+   free(cpuid);
+   return table;
+}
+
 const struct pmu_events_table *find_core_events_table(const char *arch, const 
char *cpuid)
 {
for (const struct pmu_events_map *tables = &pmu_events_map[0];
@@ -332,6 +358,17 @@ const struct pmu_events_table 
*find_core_events_table(const char *arch, const ch
return NULL;
 }
 
+const struct pmu_metrics_table *find_core_metrics_table(con

[PATCH v1 8/9] perf jevents: Generate metrics and events as separate tables

2022-12-12 Thread Ian Rogers
Turn a perf json event into an event, metric or both. This reduces the
number of entries that need to be scanned to find an event or metric.
As events no longer need the relatively seldom used metric fields,
4 bytes are saved per event. This reduces the big C string's size by
335kb (14.8%) on x86.
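
A minimal sketch of the "event, metric or both" split, assuming an entry
counts as an event when it has an EventName and as a metric when it has a
MetricName. split_entries and the sample entries are hypothetical and
only illustrate the idea, not the code in this patch.

```python
from typing import Dict, List, Tuple


def split_entries(
    entries: List[Dict[str, str]]
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
  """Split JSON entries into event records and metric records."""
  events = []
  metrics = []
  for jd in entries:
    if 'EventName' in jd:
      # Event records carry no metric fields.
      events.append({'name': jd['EventName'],
                     'desc': jd.get('BriefDescription')})
    if 'MetricName' in jd:
      metrics.append({'metric_name': jd['MetricName'],
                      'metric_expr': jd.get('MetricExpr')})
  return events, metrics


entries = [
    {'EventName': 'l3_cache_rd', 'BriefDescription': 'L3 cache reads'},
    {'MetricName': 'ipc', 'MetricExpr': 'instructions / cycles'},
    {'EventName': 'ev_with_metric', 'MetricName': 'm1',
     'MetricExpr': 'ev_with_metric * 2'},
]
events, metrics = split_entries(entries)
```

With separate lists, a lookup for a metric only walks metric records and
a lookup for an event only walks event records.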

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 244 +++
 tools/perf/tests/pmu-events.c|   3 +-
 2 files changed, 189 insertions(+), 58 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index be2cf8a8779c..c98443319145 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -13,28 +13,40 @@ import collections
 
 # Global command line arguments.
 _args = None
+# List of regular event tables.
+_event_tables = []
 # List of event tables generated from "/sys" directories.
 _sys_event_tables = []
+# List of regular metric tables.
+_metric_tables = []
+# List of metric tables generated from "/sys" directories.
+_sys_metric_tables = []
+# Mapping between sys event table names and sys metric table names.
+_sys_event_table_to_metric_table_mapping = {}
 # Map from an event name to an architecture standard
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
 # Events to write out when the table is closed
 _pending_events = []
-# Name of table to be written out
+# Name of events table to be written out
 _pending_events_tblname = None
+# Metrics to write out when the table is closed
+_pending_metrics = []
+# Name of metrics table to be written out
+_pending_metrics_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
 _json_event_attributes = [
 # cmp_sevent related attributes.
-'name', 'pmu', 'topic', 'desc', 'metric_name', 'metric_group',
+'name', 'pmu', 'topic', 'desc',
 # Seems useful, put it early.
 'event',
 # Short things in alphabetical order.
 'aggr_mode', 'compat', 'deprecated', 'perpkg', 'unit',
 # Longer things (the last won't be iterated over during decompress).
-'metric_constraint', 'metric_expr', 'long_desc'
+'long_desc'
 ]
 
 # Attributes that are in pmu_metric rather than pmu_event.
@@ -52,14 +64,16 @@ def removesuffix(s: str, suffix: str) -> str:
   return s[0:-len(suffix)] if s.endswith(suffix) else s
 
 
-def file_name_to_table_name(parents: Sequence[str], dirname: str) -> str:
+def file_name_to_table_name(prefix: str, parents: Sequence[str],
+dirname: str) -> str:
   """Generate a C table name from directory names."""
-  tblname = 'pme'
+  tblname = prefix
   for p in parents:
 tblname += '_' + p
   tblname += '_' + dirname
   return tblname.replace('-', '_')
 
+
 def c_len(s: str) -> int:
   """Return the length of s a C string
 
@@ -277,7 +291,7 @@ class JsonEvent:
 self.metric_constraint = jd.get('MetricConstraint')
 self.metric_expr = None
 if 'MetricExpr' in jd:
-   self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
+  self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
 
 arch_std = jd.get('ArchStdEvent')
 if precise and self.desc and '(Precise Event)' not in self.desc:
@@ -326,23 +340,24 @@ class JsonEvent:
 s += f'\t{attr} = {value},\n'
 return s + '}'
 
-  def build_c_string(self) -> str:
+  def build_c_string(self, metric: bool) -> str:
 s = ''
-for attr in _json_event_attributes:
+for attr in _json_metric_attributes if metric else _json_event_attributes:
   x = getattr(self, attr)
-  if x and attr == 'metric_expr':
+  if metric and x and attr == 'metric_expr':
 # Convert parsed metric expressions into a string. Slashes
 # must be doubled in the file.
 x = x.ToPerfJson().replace('\\', '\\\\')
   s += f'{x}\\000' if x else '\\000'
 return s
 
-  def to_c_string(self) -> str:
+  def to_c_string(self, metric: bool) -> str:
 """Representation of the event as a C struct initializer."""
 
-s = self.build_c_string()
+s = self.build_c_string(metric)
 return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
+
 @lru_cache(maxsize=None)
 def read_json_events(path: str, topic: str) -> Sequence[JsonEvent]:
   """Read json events

[PATCH v1 9/9] perf jevents: Add model list option

2022-12-12 Thread Ian Rogers
This allows the set of generated jevents events and metrics to be
limited to a subset of the model names. This is appropriate when trying
to minimize the binary size and only a known set of models is possible.
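
The filter can be read as a predicate on each directory entry. The
sketch below restates the condition from the jevents.py hunk in this
patch as a standalone function; the name keep_model_dir and its
parameters are hypothetical and exist only for illustration.

```python
def keep_model_dir(name: str, is_dir: bool, depth: int, model: str) -> bool:
    """Return True when a directory entry should be processed.

    Hypothetical restatement of the filter: only top-level (depth 0)
    directories are ever skipped, 'all' disables filtering, and any
    directory with 'test' in its name is always kept.
    """
    if depth != 0 or not is_dir or model == 'all':
        return True
    return 'test' in name or name in model.split(',')


# A comma separated list selects several models at once.
selected = [
    d for d in ['skylake', 'haswell', 'icelake', 'test']
    if keep_model_dir(d, is_dir=True, depth=0, model='skylake,icelake')
]
```

Keeping the test directory regardless of the model list means the test
tables are still generated in a size-reduced build.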

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/Build  | 3 ++-
 tools/perf/pmu-events/jevents.py | 4 
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index 15b9e8fdbffa..a14de24ecb69 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -10,6 +10,7 @@ JEVENTS_PY=  pmu-events/jevents.py
 ifeq ($(JEVENTS_ARCH),)
 JEVENTS_ARCH=$(SRCARCH)
 endif
+JEVENTS_MODEL ?= all
 
 #
 # Locate/process JSON files in pmu-events/arch/
@@ -23,5 +24,5 @@ $(OUTPUT)pmu-events/pmu-events.c: 
pmu-events/empty-pmu-events.c
 else
 $(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JSON_TEST) $(JEVENTS_PY) 
pmu-events/metric.py
$(call rule_mkdir)
-   $(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) 
pmu-events/arch $@
+   $(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) 
$(JEVENTS_MODEL) pmu-events/arch $@
 endif
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index c98443319145..e9eba51e8557 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -886,12 +886,16 @@ def main() -> None:
   action: Callable[[Sequence[str], os.DirEntry], None]) -> None:
 """Replicate the directory/file walking behavior of C's file tree walk."""
 for item in os.scandir(path):
+  if (len(parents) == 0 and item.is_dir() and _args.model != 'all' and
+  'test' not in item.name and item.name not in _args.model.split(',')):
+continue
   action(parents, item)
   if item.is_dir():
 ftw(item.path, parents + [item.name], action)
 
   ap = argparse.ArgumentParser()
   ap.add_argument('arch', help='Architecture name like x86')
+  ap.add_argument('model', help='Model such as skylake, normally "all"', 
default='all')
   ap.add_argument(
   'starting_dir',
   type=dir_path,
-- 
2.39.0.rc1.256.g54fd8350bd-goog



[PATCH v2 0/9] jevents/pmu-events improvements

2022-12-21 Thread Ian Rogers
Add an optimization to jevents using the metric code: rewrite metrics
in terms of each other in order to minimize size and improve
readability. For example, on Power8 other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / 
PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi 
- ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.

A limitation of the substitutions is that they depend on strict
equality and the shape of the tree. This means that for "a + b + c"
a substitution of "a + b" will succeed while "b + c" will fail
(the LHS for "+ c" is "a + b" not just "b").
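
The tree-shape dependence can be seen in a small sketch. This is not
metric.py: nested tuples of the form (operator, lhs, rhs) stand in for
the Expression classes, strings stand in for events, and the function
name substitute is hypothetical.

```python
from typing import Union

Tree = Union[str, tuple]


def substitute(tree: Tree, name: str, target: Tree) -> Tree:
    """Replace any operator subtree that equals target with name."""
    if isinstance(tree, tuple):
        if tree == target:
            return name
        op, lhs, rhs = tree
        return (op, substitute(lhs, name, target),
                substitute(rhs, name, target))
    # Leaves (events) are returned unchanged.
    return tree


# "a + b + c" parses left to right as ((a + b) + c).
abc = ('+', ('+', 'a', 'b'), 'c')

# ('+', 'a', 'b') is a subtree of abc, so the substitution succeeds.
with_ab = substitute(abc, 'ab', ('+', 'a', 'b'))

# ('+', 'b', 'c') is not a subtree of abc, so nothing is replaced.
with_bc = substitute(abc, 'bc', ('+', 'b', 'c'))
```

Here with_ab becomes ('+', 'ab', 'c') while with_bc is left equal to
abc, because no node in the tree has "b" as its left operand and "c" as
its right operand.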

Separate out the events and metrics in the pmu-events tables, saving
14.8% in the table size and ensuring that metric lookups no longer
iterate over all events and vice versa. These changes remove evsel's
direct metric support, as the pmu_event no longer has a metric to
populate it. This is a minor issue: the code wasn't working properly,
metrics of this kind are rare, and they can still be run properly
using '-M'.

Add the ability to build only certain models into the jevents generated
pmu-events.c code. This functionality is appropriate for operating
systems like ChromeOS that aim to minimize binary size and know all
the target CPU models.

v2. Rebase. Make the name check in the code that skips rewriting a
metric in terms of itself case insensitive.

Ian Rogers (9):
  perf jevents metric: Correct Function equality
  perf jevents metric: Add ability to rewrite metrics in terms of others
  perf jevents: Rewrite metrics in the same file with each other
  perf pmu-events: Separate metric out of pmu_event
  perf stat: Remove evsel metric_name/expr
  perf jevents: Combine table prefix and suffix writing
  perf pmu-events: Introduce pmu_metrics_table
  perf jevents: Generate metrics and events as separate tables
  perf jevents: Add model list option

 tools/perf/arch/arm64/util/pmu.c |  23 +-
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/builtin-list.c|  20 +-
 tools/perf/builtin-stat.c|   1 -
 tools/perf/pmu-events/Build  |   3 +-
 tools/perf/pmu-events/empty-pmu-events.c | 111 ++-
 tools/perf/pmu-events/jevents.py | 353 ++-
 tools/perf/pmu-events/metric.py  |  79 -
 tools/perf/pmu-events/metric_test.py |  10 +
 tools/perf/pmu-events/pmu-events.h   |  26 +-
 tools/perf/tests/expand-cgroup.c |   4 +-
 tools/perf/tests/parse-metric.c  |   4 +-
 tools/perf/tests/pmu-events.c|  68 ++---
 tools/perf/util/cgroup.c |   1 -
 tools/perf/util/evsel.c  |   2 -
 tools/perf/util/evsel.h  |   2 -
 tools/perf/util/metricgroup.c| 203 +++--
 tools/perf/util/metricgroup.h|   4 +-
 tools/perf/util/parse-events.c   |   2 -
 tools/perf/util/pmu.c|  44 +--
 tools/perf/util/pmu.h|  10 +-
 tools/perf/util/print-events.c   |  32 +-
 tools/perf/util/print-events.h   |   3 +-
 tools/perf/util/python.c |   7 -
 tools/perf/util/stat-shadow.c| 112 ---
 tools/perf/util/stat.h   |   1 -
 26 files changed, 666 insertions(+), 463 deletions(-)

-- 
2.39.0.314.g84b9a713c41-goog



[PATCH v2 1/9] perf jevents metric: Correct Function equality

2022-12-21 Thread Ian Rogers
rhs may not be defined, say for source_count, so add a guard.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/metric.py | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 4797ed4fd817..2f2fd220e843 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -261,8 +261,10 @@ class Function(Expression):
 
   def Equals(self, other: Expression) -> bool:
 if isinstance(other, Function):
-  return self.fn == other.fn and self.lhs.Equals(
-  other.lhs) and self.rhs.Equals(other.rhs)
+  result = self.fn == other.fn and self.lhs.Equals(other.lhs)
+  if self.rhs:
+result = result and self.rhs.Equals(other.rhs)
+  return result
 return False
 
 
-- 
2.39.0.314.g84b9a713c41-goog



[PATCH v2 2/9] perf jevents metric: Add ability to rewrite metrics in terms of others

2022-12-21 Thread Ian Rogers
Add RewriteMetricsInTermsOfOthers, which iterates over pairs of names
and expressions and tries to replace each expression, where it occurs
within the current expression, with its name.
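
A minimal sketch of the pairwise idea follows. It is an assumption about
the general approach rather than the function added by this patch:
nested tuples (operator, lhs, rhs) stand in for Expression trees, and
the names rewrite_in_terms_of_others and _substitute are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Tree = Union[str, tuple]


def _substitute(tree: Tree, name: str, target: Tree) -> Tree:
    """Replace operator subtrees equal to target with name."""
    if isinstance(tree, tuple):
        if tree == target:
            return name
        op, lhs, rhs = tree
        return (op, _substitute(lhs, name, target),
                _substitute(rhs, name, target))
    return tree


def rewrite_in_terms_of_others(
        metrics: List[Tuple[str, Tree]]) -> Dict[str, Tree]:
    """Return the metrics whose trees got shorter, keyed by name."""
    updates: Dict[str, Tree] = {}
    for outer_name, outer_tree in metrics:
        current = outer_tree
        for inner_name, inner_tree in metrics:
            # Never rewrite a metric in terms of itself; names are
            # compared case insensitively.
            if inner_name.lower() == outer_name.lower():
                continue
            current = _substitute(current, inner_name, inner_tree)
        if current != outer_tree:
            updates[outer_name] = current
    return updates


metrics = [
    ('ipc', ('/', 'inst', 'cyc')),
    ('ipc_x2', ('*', ('/', 'inst', 'cyc'), '2')),
]
updates = rewrite_in_terms_of_others(metrics)
```

Only changed metrics appear in the result, so a caller can cheaply test
whether anything needs updating.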

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/metric.py  | 73 +++-
 tools/perf/pmu-events/metric_test.py | 10 
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 2f2fd220e843..ed13efac7389 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -4,7 +4,7 @@ import ast
 import decimal
 import json
 import re
-from typing import Dict, List, Optional, Set, Union
+from typing import Dict, List, Optional, Set, Tuple, Union
 
 
 class Expression:
@@ -26,6 +26,9 @@ class Expression:
 """Returns true when two expressions are the same."""
 raise NotImplementedError()
 
+  def Substitute(self, name: str, expression: 'Expression') -> 'Expression':
+raise NotImplementedError()
+
   def __str__(self) -> str:
 return self.ToPerfJson()
 
@@ -186,6 +189,15 @@ class Operator(Expression):
   other.lhs) and self.rhs.Equals(other.rhs)
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+lhs = self.lhs.Substitute(name, expression)
+rhs = None
+if self.rhs:
+  rhs = self.rhs.Substitute(name, expression)
+return Operator(self.operator, lhs, rhs)
+
 
 class Select(Expression):
   """Represents a select ternary in the parse tree."""
@@ -225,6 +237,14 @@ class Select(Expression):
   other.false_val) and self.true_val.Equals(other.true_val)
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+true_val = self.true_val.Substitute(name, expression)
+cond = self.cond.Substitute(name, expression)
+false_val = self.false_val.Substitute(name, expression)
+return Select(true_val, cond, false_val)
+
 
 class Function(Expression):
   """A function in an expression like min, max, d_ratio."""
@@ -267,6 +287,15 @@ class Function(Expression):
   return result
 return False
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+if self.Equals(expression):
+  return Event(name)
+lhs = self.lhs.Substitute(name, expression)
+rhs = None
+if self.rhs:
+  rhs = self.rhs.Substitute(name, expression)
+return Function(self.fn, lhs, rhs)
+
 
 def _FixEscapes(s: str) -> str:
   s = re.sub(r'([^\\]),', r'\1\\,', s)
@@ -293,6 +322,9 @@ class Event(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Event) and self.name == other.name
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 class Constant(Expression):
   """A constant within the expression tree."""
@@ -317,6 +349,9 @@ class Constant(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Constant) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 class Literal(Expression):
   """A runtime literal within the expression tree."""
@@ -336,6 +371,9 @@ class Literal(Expression):
   def Equals(self, other: Expression) -> bool:
 return isinstance(other, Literal) and self.value == other.value
 
+  def Substitute(self, name: str, expression: Expression) -> Expression:
+return self
+
 
 def min(lhs: Union[int, float, Expression], rhs: Union[int, float,
Expression]) -> 
Function:
@@ -461,6 +499,7 @@ class MetricGroup:
 
 
 class _RewriteIfExpToSelect(ast.NodeTransformer):
+  """Transformer to convert if-else nodes to Select expressions."""
 
   def visit_IfExp(self, node):
 # pylint: disable=invalid-name
@@ -498,7 +537,37 @@ def ParsePerfJson(orig: str) -> Expression:
   for kw in keywords:
 py = re.sub(rf'Event\(r"{kw}"\)', kw, py)
 
-  parsed = ast.parse(py, mode='eval')
+  try:
+parsed = ast.parse(py, mode='eval')
+  except SyntaxError as e:
+raise SyntaxError(f'Parsing expression:\n{orig}') from e
   _RewriteIfExpToSelect().visit(parsed)
   parsed = ast.fix_missing_locations(parsed)
   return _Constify(eval(compile(parsed, orig, 'eval')))
+
+
+def RewriteMetricsInTermsOfOthers(metrics: list[Tuple[str, Expression]]
+  )-> Dict[str, Expression]:
+  """Shorten metrics by rewriting in terms of others.
+
+  Args:
+metrics 

[PATCH v2 3/9] perf jevents: Rewrite metrics in the same file with each other

2022-12-21 Thread Ian Rogers
Rewrite metrics within the same file in terms of each other. For
example, on Power8 other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL 
- PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / 
PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi 
- ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.

To avoid recomputation, decorate read_json_events with lru_cache.

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 4c398e0eeb2f..229402565425 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -3,6 +3,7 @@
 """Convert directories of JSON events to C code."""
 import argparse
 import csv
+from functools import lru_cache
 import json
 import metric
 import os
@@ -337,18 +338,28 @@ class JsonEvent:
 s = self.build_c_string()
 return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
-
+@lru_cache(maxsize=None)
 def read_json_events(path: str, topic: str) -> Sequence[JsonEvent]:
   """Read json events from the specified file."""
-
   try:
-result = json.load(open(path), object_hook=JsonEvent)
+events = json.load(open(path), object_hook=JsonEvent)
   except BaseException as err:
 print(f"Exception processing {path}")
 raise
-  for event in result:
+  metrics: list[Tuple[str, metric.Expression]] = []
+  for event in events:
 event.topic = topic
-  return result
+if event.metric_name and '-' not in event.metric_name:
+  metrics.append((event.metric_name, event.metric_expr))
+  updates = metric.RewriteMetricsInTermsOfOthers(metrics)
+  if updates:
+for event in events:
+  if event.metric_name in updates:
+# print(f'Updated {event.metric_name} from\n"{event.metric_expr}"\n'
+#   f'to\n"{updates[event.metric_name]}"')
+event.metric_expr = updates[event.metric_name]
+
+  return events
 
 def preprocess_arch_std_files(archpath: str) -> None:
   """Read in all architecture standard events."""
-- 
2.39.0.314.g84b9a713c41-goog



[PATCH v2 4/9] perf pmu-events: Separate metric out of pmu_event

2022-12-21 Thread Ian Rogers
rnel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/stat-shadow.c?id=01b8957b738f42f96a130079bc951b3cc78c5b8a#n425

Signed-off-by: Ian Rogers 
---
 tools/perf/arch/powerpc/util/header.c|   4 +-
 tools/perf/builtin-list.c|  20 +--
 tools/perf/pmu-events/empty-pmu-events.c |  73 --
 tools/perf/pmu-events/jevents.py |  82 +++-
 tools/perf/pmu-events/pmu-events.h   |  20 ++-
 tools/perf/tests/pmu-events.c|  62 +++--
 tools/perf/util/metricgroup.c| 161 +++
 tools/perf/util/metricgroup.h|   2 +-
 tools/perf/util/parse-events.c   |   2 -
 tools/perf/util/pmu.c|  35 +
 tools/perf/util/pmu.h|   9 --
 tools/perf/util/print-events.c   |  32 ++---
 tools/perf/util/print-events.h   |   3 +-
 13 files changed, 266 insertions(+), 239 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c 
b/tools/perf/arch/powerpc/util/header.c
index e8fe36b10d20..78eef77d8a8d 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -40,11 +40,11 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
return bufp;
 }
 
-int arch_get_runtimeparam(const struct pmu_event *pe)
+int arch_get_runtimeparam(const struct pmu_metric *pm)
 {
int count;
char path[PATH_MAX] = "/devices/hv_24x7/interface/";
 
-   atoi(pe->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, 
"coresperchip");
+   atoi(pm->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, 
"coresperchip");
return sysfs__read_int(path, &count) < 0 ? 1 : count;
 }
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 137d73edb541..791f513ae5b4 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -99,8 +99,7 @@ static void default_print_event(void *ps, const char 
*pmu_name, const char *topi
const char *scale_unit __maybe_unused,
bool deprecated, const char *event_type_desc,
const char *desc, const char *long_desc,
-   const char *encoding_desc,
-   const char *metric_name, const char 
*metric_expr)
+   const char *encoding_desc)
 {
struct print_state *print_state = ps;
int pos;
@@ -159,10 +158,6 @@ static void default_print_event(void *ps, const char 
*pmu_name, const char *topi
if (print_state->detailed && encoding_desc) {
printf("%*s", 8, "");
wordwrap(encoding_desc, 8, pager_get_columns(), 0);
-   if (metric_name)
-   printf(" MetricName: %s", metric_name);
-   if (metric_expr)
-   printf(" MetricExpr: %s", metric_expr);
putchar('\n');
}
 }
@@ -308,8 +303,7 @@ static void json_print_event(void *ps, const char 
*pmu_name, const char *topic,
 const char *scale_unit,
 bool deprecated, const char *event_type_desc,
 const char *desc, const char *long_desc,
-const char *encoding_desc,
-const char *metric_name, const char *metric_expr)
+const char *encoding_desc)
 {
struct json_print_state *print_state = ps;
bool need_sep = false;
@@ -366,16 +360,6 @@ static void json_print_event(void *ps, const char 
*pmu_name, const char *topic,
  encoding_desc);
need_sep = true;
}
-   if (metric_name) {
-   fix_escape_printf(&buf, "%s\t\"MetricName\": \"%S\"", need_sep 
? ",\n" : "",
- metric_name);
-   need_sep = true;
-   }
-   if (metric_expr) {
-   fix_escape_printf(&buf, "%s\t\"MetricExpr\": \"%S\"", need_sep 
? ",\n" : "",
- metric_expr);
-   need_sep = true;
-   }
printf("%s}", need_sep ? "\n" : "");
strbuf_release(&buf);
 }
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 480e8f0d30c8..5572a4d1eddb 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-static const struct pmu_event pme_test_soc_cpu[] = {
+static const struct pmu_event pmu_events__test_soc_cpu[] = {
{
.name = "l3_cache_rd",
.event = "event=0x40",
@@ -105,

[PATCH v2 5/9] perf stat: Remove evsel metric_name/expr

2022-12-21 Thread Ian Rogers
Metrics are now their own unit. These variables previously held broken
metrics and now only ever hold NULL. Remove the code that used them.

Signed-off-by: Ian Rogers 
---
 tools/perf/builtin-stat.c |   1 -
 tools/perf/util/cgroup.c  |   1 -
 tools/perf/util/evsel.c   |   2 -
 tools/perf/util/evsel.h   |   2 -
 tools/perf/util/python.c  |   7 ---
 tools/perf/util/stat-shadow.c | 112 --
 tools/perf/util/stat.h|   1 -
 7 files changed, 126 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9f3e4b257516..5d18a5a6f662 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2524,7 +2524,6 @@ int cmd_stat(int argc, const char **argv)
&stat_config.metric_events);
zfree(&metrics);
}
-   perf_stat__collect_metric_expr(evsel_list);
perf_stat__init_shadow_stats();
 
if (add_default_attributes())
diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index e99b41f9be45..dc2db0ff7ab4 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -468,7 +468,6 @@ int evlist__expand_cgroup(struct evlist *evlist, const char 
*str,
nr_cgroups++;
 
if (metric_events) {
-   perf_stat__collect_metric_expr(tmp_list);
if (metricgroup__copy_metric_events(tmp_list, cgrp,
metric_events,

&orig_metric_events) < 0)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 999dd1700502..4d198529911a 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -285,8 +285,6 @@ void evsel__init(struct evsel *evsel,
evsel->sample_size = __evsel__sample_size(attr->sample_type);
evsel__calc_id_pos(evsel);
evsel->cmdline_group_boundary = false;
-   evsel->metric_expr   = NULL;
-   evsel->metric_name   = NULL;
evsel->metric_events = NULL;
evsel->per_pkg_mask  = NULL;
evsel->collect_stat  = false;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d572be41b960..24cb807ef6ce 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -105,8 +105,6 @@ struct evsel {
 * metric fields are similar, but needs more care as they can have
 * references to other metric (evsel).
 */
-   const char *metric_expr;
-   const char *metric_name;
struct evsel**metric_events;
struct evsel*metric_leader;
 
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 212031b97910..8c450b3d0080 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -75,13 +75,6 @@ const char *perf_env__arch(struct perf_env *env 
__maybe_unused)
return NULL;
 }
 
-/*
- * Add this one here not to drag util/stat-shadow.c
- */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-}
-
 /*
  * This one is needed not to drag the PMU bandwagon, jevents generated
  * pmu_sys_event_tables, etc and evsel__find_pmu() is used so far just for
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index cadb2df23c87..35ea4813f468 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -346,114 +346,6 @@ static const char *get_ratio_color(enum grc_type type, 
double ratio)
return color;
 }
 
-static struct evsel *perf_stat__find_event(struct evlist *evsel_list,
-   const char *name)
-{
-   struct evsel *c2;
-
-   evlist__for_each_entry (evsel_list, c2) {
-   if (!strcasecmp(c2->name, name) && !c2->collect_stat)
-   return c2;
-   }
-   return NULL;
-}
-
-/* Mark MetricExpr target events and link events using them to them. */
-void perf_stat__collect_metric_expr(struct evlist *evsel_list)
-{
-   struct evsel *counter, *leader, **metric_events, *oc;
-   bool found;
-   struct expr_parse_ctx *ctx;
-   struct hashmap_entry *cur;
-   size_t bkt;
-   int i;
-
-   ctx = expr__ctx_new();
-   if (!ctx) {
-   pr_debug("expr__ctx_new failed");
-   return;
-   }
-   evlist__for_each_entry(evsel_list, counter) {
-   bool invalid = false;
-
-   leader = evsel__leader(counter);
-   if (!counter->metric_expr)
-   continue;
-
-   expr__ctx_clear(ctx);
-   metric_events = counter->metric_events;
-   if (!metric_events) {
-   if (expr__find_ids(counter->metric_expr,
-  counter->name,
- 

[PATCH v2 6/9] perf jevents: Combine table prefix and suffix writing

2022-12-21 Thread Ian Rogers
Combine the table prefix and suffix writing into a single function.
This simplifies writing metrics separately in a later change.
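
The pattern is to collect entries and remember the table name, then emit
prefix, entries and suffix together when the table is flushed. A
self-contained sketch of that pattern, with a hypothetical TableWriter
class in place of the module-level globals used by jevents.py:

```python
import io
from typing import List, Optional, TextIO


class TableWriter:
    """Hypothetical helper: buffers entries and writes a whole table."""

    def __init__(self, out: TextIO) -> None:
        self.out = out
        self.pending: List[str] = []
        self.tblname: Optional[str] = None

    def start(self, tblname: str) -> None:
        """Flush the previous table, if any, then remember the new name."""
        self.flush()
        self.tblname = tblname

    def add(self, entry: str) -> None:
        self.pending.append(entry)

    def flush(self) -> None:
        """Write prefix, sorted entries and suffix in one place."""
        if not self.pending:
            return
        self.out.write(
            f'static const struct compact_pmu_event {self.tblname}[] = {{\n')
        for entry in sorted(self.pending):
            self.out.write(f'{{ {entry} }},\n')
        self.pending = []
        self.out.write('};\n\n')


out = io.StringIO()
writer = TableWriter(out)
writer.start('pme_a')
writer.add('2')
writer.add('1')
writer.start('pme_b')  # flushes pme_a; pme_b has no entries yet
writer.flush()         # nothing pending, so nothing is written
```

Because the prefix is only written at flush time, an empty table emits
nothing and there is no open/closed state to keep consistent.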

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 36 +---
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index ee3d4cdf01be..7b9714b25d0a 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -19,10 +19,10 @@ _sys_event_tables = []
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
-# Track whether an events table is currently being defined and needs closing.
-_close_table = False
 # Events to write out when the table is closed
 _pending_events = []
+# Name of table to be written out
+_pending_events_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
@@ -376,24 +376,13 @@ def preprocess_arch_std_files(archpath: str) -> None:
   _arch_std_events[event.name.lower()] = event
 
 
-def print_events_table_prefix(tblname: str) -> None:
-  """Called when a new events table is started."""
-  global _close_table
-  if _close_table:
-raise IOError('Printing table prefix but last table has no suffix')
-  _args.output_file.write(f'static const struct compact_pmu_event {tblname}[] 
= {{\n')
-  _close_table = True
-
-
 def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
   """Add contents of file to _pending_events table."""
-  if not _close_table:
-raise IOError('Table entries missing prefix')
   for e in read_json_events(item.path, topic):
 _pending_events.append(e)
 
 
-def print_events_table_suffix() -> None:
+def print_pending_events() -> None:
   """Optionally close events table."""
 
   def event_cmp_key(j: JsonEvent) -> Tuple[bool, str, str, str, str]:
@@ -405,17 +394,19 @@ def print_events_table_suffix() -> None:
 return (j.desc is not None, fix_none(j.topic), fix_none(j.name), 
fix_none(j.pmu),
 fix_none(j.metric_name))
 
-  global _close_table
-  if not _close_table:
+  global _pending_events
+  if not _pending_events:
 return
 
-  global _pending_events
+  global _pending_events_tblname
+  _args.output_file.write(
+  f'static const struct compact_pmu_event {_pending_events_tblname}[] = 
{{\n')
+
   for event in sorted(_pending_events, key=event_cmp_key):
 _args.output_file.write(event.to_c_string())
-_pending_events = []
+  _pending_events = []
 
   _args.output_file.write('};\n\n')
-  _close_table = False
 
 def get_topic(topic: str) -> str:
   if topic.endswith('metrics.json'):
@@ -453,12 +444,13 @@ def process_one_file(parents: Sequence[str], item: 
os.DirEntry) -> None:
 
   # model directory, reset topic
   if item.is_dir() and is_leaf_dir(item.path):
-print_events_table_suffix()
+print_pending_events()
 
 tblname = file_name_to_table_name(parents, item.name)
 if item.name == 'sys':
   _sys_event_tables.append(tblname)
-print_events_table_prefix(tblname)
+global _pending_events_tblname
+_pending_events_tblname = tblname
 return
 
   # base dir or too deep
@@ -802,7 +794,7 @@ struct compact_pmu_event {
   for arch in archs:
 arch_path = f'{_args.starting_dir}/{arch}'
 ftw(arch_path, [], process_one_file)
-print_events_table_suffix()
+print_pending_events()
 
   print_mapping_table(archs)
   print_system_mapping_table()
-- 
2.39.0.314.g84b9a713c41-goog



[PATCH v2 7/9] perf pmu-events: Introduce pmu_metrics_table

2022-12-21 Thread Ian Rogers
Add a metrics table that is just a cast from pmu_events_table. This
changes the APIs so that event and metric usage of the underlying
table is different. Later changes will separate the tables.

This introduction fixes a NO_JEVENTS=1 regression on:
 68: Parse and process metrics   : Ok
 70: Event expansion for cgroups : Ok
caused by the necessary test metrics not being found.

Signed-off-by: Ian Rogers 
---
 tools/perf/arch/arm64/util/pmu.c | 23 ++-
 tools/perf/pmu-events/empty-pmu-events.c | 52 
 tools/perf/pmu-events/jevents.py | 24 ---
 tools/perf/pmu-events/pmu-events.h   | 10 +++--
 tools/perf/tests/expand-cgroup.c |  4 +-
 tools/perf/tests/parse-metric.c  |  4 +-
 tools/perf/tests/pmu-events.c|  5 ++-
 tools/perf/util/metricgroup.c| 50 +++
 tools/perf/util/metricgroup.h|  2 +-
 tools/perf/util/pmu.c|  9 +++-
 tools/perf/util/pmu.h|  1 +
 11 files changed, 133 insertions(+), 51 deletions(-)

diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
index 477e513972a4..f8ae479a06db 100644
--- a/tools/perf/arch/arm64/util/pmu.c
+++ b/tools/perf/arch/arm64/util/pmu.c
@@ -19,7 +19,28 @@ const struct pmu_events_table *pmu_events_table__find(void)
if (pmu->cpus->nr != cpu__max_cpu().cpu)
return NULL;
 
-   return perf_pmu__find_table(pmu);
+   return perf_pmu__find_events_table(pmu);
+   }
+
+   return NULL;
+}
+
+const struct pmu_metrics_table *pmu_metrics_table__find(void)
+{
+   struct perf_pmu *pmu = NULL;
+
+   while ((pmu = perf_pmu__scan(pmu))) {
+   if (!is_pmu_core(pmu->name))
+   continue;
+
+   /*
+* The cpumap should cover all CPUs. Otherwise, some CPUs may
+* not support some events or have different event IDs.
+*/
+   if (pmu->cpus->nr != cpu__max_cpu().cpu)
+   return NULL;
+
+   return perf_pmu__find_metrics_table(pmu);
}
 
return NULL;
diff --git a/tools/perf/pmu-events/empty-pmu-events.c 
b/tools/perf/pmu-events/empty-pmu-events.c
index 5572a4d1eddb..d50f60a571dd 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -278,14 +278,12 @@ int pmu_events_table_for_each_event(const struct 
pmu_events_table *table, pmu_ev
return 0;
 }
 
-int pmu_events_table_for_each_metric(const struct pmu_events_table *etable, 
pmu_metric_iter_fn fn,
-void *data)
+int pmu_metrics_table_for_each_metric(const struct pmu_metrics_table *table, 
pmu_metric_iter_fn fn,
+ void *data)
 {
-   struct pmu_metrics_table *table = (struct pmu_metrics_table *)etable;
-
for (const struct pmu_metric *pm = &table->entries[0]; pm->metric_group 
|| pm->metric_name;
 pm++) {
-   int ret = fn(pm, etable, data);
+   int ret = fn(pm, table, data);
 
if (ret)
return ret;
@@ -293,7 +291,7 @@ int pmu_events_table_for_each_metric(const struct 
pmu_events_table *etable, pmu_
return 0;
 }
 
-const struct pmu_events_table *perf_pmu__find_table(struct perf_pmu *pmu)
+const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu 
*pmu)
 {
const struct pmu_events_table *table = NULL;
char *cpuid = perf_pmu__getcpuid(pmu);
@@ -321,6 +319,34 @@ const struct pmu_events_table *perf_pmu__find_table(struct 
perf_pmu *pmu)
return table;
 }
 
+const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu 
*pmu)
+{
+   const struct pmu_metrics_table *table = NULL;
+   char *cpuid = perf_pmu__getcpuid(pmu);
+   int i;
+
+   /* on some platforms which uses cpus map, cpuid can be NULL for
+* PMUs other than CORE PMUs.
+*/
+   if (!cpuid)
+   return NULL;
+
+   i = 0;
+   for (;;) {
+   const struct pmu_events_map *map = &pmu_events_map[i++];
+
+   if (!map->cpuid)
+   break;
+
+   if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
+   table = &map->metric_table;
+   break;
+   }
+   }
+   free(cpuid);
+   return table;
+}
+
 const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid)
 {
 	for (const struct pmu_events_map *tables = &pmu_events_map[0];
@@ -332,6 +358,17 @@ const struct pmu_events_table *find_core_events_table(const char *arch, const ch
 	return NULL;
 }
 
+const struct pmu_metrics_table *find_core_metrics_table(con

[PATCH v2 8/9] perf jevents: Generate metrics and events as separate tables

2022-12-21 Thread Ian Rogers
Turn a perf json event into an event, a metric, or both. This reduces
the number of entries that must be scanned to find an event or metric.
As events no longer need the relatively seldom used metric fields, 4
bytes are saved per event. This reduces the big C string's size by
335kb (14.8%) on x86.
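
A minimal sketch of the split described above, not the jevents.py code
itself. It assumes an entry is treated as an event when it has a name and
as a metric when it has a metric_name; split_entry, EVENT_ATTRS and
METRIC_ATTRS are hypothetical names, and the metric attribute list is an
assumption.

```python
from typing import Dict, List, Optional, Tuple

Entry = Dict[str, str]

# Subset of the event attributes kept by the patch in _json_event_attributes.
EVENT_ATTRS: List[str] = ['name', 'pmu', 'topic', 'desc', 'event', 'long_desc']
# Assumed metric attributes (hypothetical list, for illustration only).
METRIC_ATTRS: List[str] = ['metric_name', 'metric_group', 'metric_expr',
                           'metric_constraint', 'desc']


def split_entry(entry: Entry) -> Tuple[Optional[Entry], Optional[Entry]]:
  """Return (event, metric) views of one json entry; either may be None."""

  def pick(attrs: List[str]) -> Entry:
    # Keep only the attributes that belong in the given table.
    return {a: entry[a] for a in attrs if a in entry}

  event = pick(EVENT_ATTRS) if entry.get('name') else None
  metric = pick(METRIC_ATTRS) if entry.get('metric_name') else None
  return event, metric


# An entry carrying both an event name and a metric name lands in both tables.
both_event, both_metric = split_entry({
    'name': 'cycles', 'event': 'event=0x3c',
    'metric_name': 'IPC', 'metric_expr': 'instructions / cycles',
})
# A metric-only entry produces no event.
only_event, only_metric = split_entry({'metric_name': 'IPC',
                                       'metric_expr': 'instructions / cycles'})
```

With a split like this the event records never carry the metric fields,
which is where the per-event saving would come from.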

Signed-off-by: Ian Rogers 
---
 tools/perf/pmu-events/jevents.py | 244 +++
 tools/perf/tests/pmu-events.c|   3 +-
 2 files changed, 189 insertions(+), 58 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index be2cf8a8779c..c98443319145 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -13,28 +13,40 @@ import collections
 
 # Global command line arguments.
 _args = None
+# List of regular event tables.
+_event_tables = []
 # List of event tables generated from "/sys" directories.
 _sys_event_tables = []
+# List of regular metric tables.
+_metric_tables = []
+# List of metric tables generated from "/sys" directories.
+_sys_metric_tables = []
+# Mapping between sys event table names and sys metric table names.
+_sys_event_table_to_metric_table_mapping = {}
 # Map from an event name to an architecture standard
 # JsonEvent. Architecture standard events are in json files in the top
 # f'{_args.starting_dir}/{_args.arch}' directory.
 _arch_std_events = {}
 # Events to write out when the table is closed
 _pending_events = []
-# Name of table to be written out
+# Name of events table to be written out
 _pending_events_tblname = None
+# Metrics to write out when the table is closed
+_pending_metrics = []
+# Name of metrics table to be written out
+_pending_metrics_tblname = None
 # Global BigCString shared by all structures.
 _bcs = None
 # Order specific JsonEvent attributes will be visited.
 _json_event_attributes = [
 # cmp_sevent related attributes.
-'name', 'pmu', 'topic', 'desc', 'metric_name', 'metric_group',
+'name', 'pmu', 'topic', 'desc',
 # Seems useful, put it early.
 'event',
 # Short things in alphabetical order.
 'aggr_mode', 'compat', 'deprecated', 'perpkg', 'unit',
 # Longer things (the last won't be iterated over during decompress).
-'metric_constraint', 'metric_expr', 'long_desc'
+'long_desc'
 ]
 
 # Attributes that are in pmu_metric rather than pmu_event.
@@ -52,14 +64,16 @@ def removesuffix(s: str, suffix: str) -> str:
   return s[0:-len(suffix)] if s.endswith(suffix) else s
 
 
-def file_name_to_table_name(parents: Sequence[str], dirname: str) -> str:
+def file_name_to_table_name(prefix: str, parents: Sequence[str],
+                            dirname: str) -> str:
   """Generate a C table name from directory names."""
-  tblname = 'pme'
+  tblname = prefix
   for p in parents:
     tblname += '_' + p
   tblname += '_' + dirname
   return tblname.replace('-', '_')
 
+
 def c_len(s: str) -> int:
   """Return the length of s a C string
 
@@ -277,7 +291,7 @@ class JsonEvent:
 self.metric_constraint = jd.get('MetricConstraint')
 self.metric_expr = None
 if 'MetricExpr' in jd:
-   self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
+  self.metric_expr = metric.ParsePerfJson(jd['MetricExpr']).Simplify()
 
 arch_std = jd.get('ArchStdEvent')
 if precise and self.desc and '(Precise Event)' not in self.desc:
@@ -326,23 +340,24 @@ class JsonEvent:
 s += f'\t{attr} = {value},\n'
 return s + '}'
 
-  def build_c_string(self) -> str:
+  def build_c_string(self, metric: bool) -> str:
     s = ''
-    for attr in _json_event_attributes:
+    for attr in _json_metric_attributes if metric else _json_event_attributes:
       x = getattr(self, attr)
-      if x and attr == 'metric_expr':
+      if metric and x and attr == 'metric_expr':
         # Convert parsed metric expressions into a string. Slashes
         # must be doubled in the file.
         x = x.ToPerfJson().replace('\\', '\\\\')
       s += f'{x}\\000' if x else '\\000'
     return s
 
-  def to_c_string(self) -> str:
+  def to_c_string(self, metric: bool) -> str:
     """Representation of the event as a C struct initializer."""
 
-    s = self.build_c_string()
+    s = self.build_c_string(metric)
     return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'
 
+
 @lru_cache(maxsize=None)
 def read_json_events(path: str, topic: str) -> Sequence[JsonEvent]:
   """Read json events

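
The reworked file_name_to_table_name from the diff above can be tried in
isolation. The function body below is copied from the patch; the 'pme'
prefix is the value the old code hard-coded, and the directory names are
made up for illustration. Which prefixes the patch passes for event and
metric tables is not shown in this excerpt.

```python
from typing import Sequence


def file_name_to_table_name(prefix: str, parents: Sequence[str],
                            dirname: str) -> str:
  """Generate a C table name from directory names."""
  tblname = prefix
  for p in parents:
    tblname += '_' + p
  tblname += '_' + dirname
  # '-' is not valid in a C identifier, so map it to '_'.
  return tblname.replace('-', '_')


# Hypothetical directory layout, used only to exercise the helper.
example = file_name_to_table_name('pme', ['arm64', 'arm'], 'cortex-a53')
```

Taking the prefix as a parameter lets the same helper name both kinds of
table from one directory walk.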