[PATCH 0/7] Split Coresight decode by aux records

2021-02-12 Thread James Clark
or if they can be pulled from elsewhere. I've also tested perf inject which is now working with troublesome files. Thanks James James Clark (7): perf cs-etm: Split up etm queue setup function perf cs-etm: Only search timestamp in current sample's queue. perf cs-etm: Save aux

[PATCH 1/7] perf cs-etm: Split up etm queue setup function

2021-02-12 Thread James Clark
Refactor the function into separate allocation and timestamp search parts. Later the timestamp search will be done multiple times. Signed-off-by: James Clark --- tools/perf/util/cs-etm.c | 60 +--- 1 file changed, 31 insertions(+), 29 deletions(-) diff --git

[PATCH 4/7] perf cs-etm: don't process queues until cs_etm__flush_events

2021-02-12 Thread James Clark
processing. Signed-off-by: James Clark --- tools/perf/util/cs-etm.c | 4 1 file changed, 4 deletions(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 88b541b2a804..5ab037c2dabe 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -2398,10 +2398,6

[PATCH 2/7] perf cs-etm: Only search timestamp in current sample's queue.

2021-02-12 Thread James Clark
didn't synthesise any events start working and generating events. I'm not sure of the reason for that. I'd expect this change to only affect the ordering of events. Signed-off-by: James Clark --- tools/perf/util/cs-etm.c | 30 ++ 1 file changed, 14 insertions(+

[PATCH 3/7] perf cs-etm: Save aux records in each etm queue

2021-02-12 Thread James Clark
: James Clark --- tools/perf/util/cs-etm.c | 32 +--- 1 file changed, 29 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8f8b448632fb..88b541b2a804 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c

[PATCH 6/7] perf cs-etm: Use existing decode code path for --dump-raw-trace

2021-02-12 Thread James Clark
printing has to be suppressed around each call to reset. Signed-off-by: James Clark --- tools/perf/util/cs-etm.c | 91 1 file changed, 36 insertions(+), 55 deletions(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 3026fcf50b5d

[PATCH 5/7] perf cs-etm: split decode by aux records.

2021-02-12 Thread James Clark
The trace data between aux records is not continuous, so the decoder must be reset between each record to ensure that parsing happens correctly and without any early exits. Signed-off-by: James Clark --- tools/perf/util/cs-etm.c | 109 +++ 1 file changed, 64
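The idea in the snippet above (reset the decoder at every aux record boundary because the trace is not continuous across records) can be sketched roughly as follows. This is a toy model and not the perf or OpenCSD code: `struct aux_range`, `struct toy_decoder`, the `0x00` sync byte, and all function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one aux record: a byte range in the trace buffer. */
struct aux_range {
	size_t offset;
	size_t size;
};

/* Hypothetical toy decoder: it must see a sync byte before it can decode. */
struct toy_decoder {
	int synced;        /* nonzero once a sync byte has been seen */
	unsigned resets;   /* number of resets performed */
	unsigned decoded;  /* number of bytes decoded after sync */
};

/* Drop all state so that stale context from the previous record is not reused. */
static void toy_decoder_reset(struct toy_decoder *d)
{
	d->synced = 0;
	d->resets++;
}

/* Decode one contiguous block; bytes before the first 0x00 sync byte are skipped. */
static void toy_decode_block(struct toy_decoder *d, const uint8_t *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (!d->synced) {
			if (buf[i] == 0x00)
				d->synced = 1;
			continue;
		}
		d->decoded++;
	}
}

/* Treat each aux record as an independent block: reset, then decode. */
static void decode_by_aux_records(struct toy_decoder *d, const uint8_t *trace,
				  const struct aux_range *recs, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		toy_decoder_reset(d);
		toy_decode_block(d, trace + recs[i].offset, recs[i].size);
	}
}
```

In this model, a decoder that was not reset would carry its synced state over a gap in the trace and misread the start of the next record; resetting forces it to resynchronise inside each record.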

[PATCH 7/7] perf cs-etm: Suppress printing when resetting decoder

2021-02-12 Thread James Clark
The decoder is quite noisy when being reset. Now that dump-raw-trace uses a code path that resets the decoder rather than creating a new one, printing has to be suppressed to not flood the output. Signed-off-by: James Clark --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 +++--- 1

Re: [PATCH 8/8] perf arm-spe: Set thread TID

2021-02-10 Thread James Clark
On 09/02/2021 17:36, James Clark wrote: > > > On 04/02/2021 12:27, Leo Yan wrote: >> On Mon, Feb 01, 2021 at 07:40:45PM +0200, James Clark wrote: >>> >>> On 31/01/2021 14:01, Leo Yan wrote: >>>> Option 1: by merging patches 07/08 and 08/08, we

[PATCH v2 2/6] perf arm-spe: Store memory address in packet

2021-02-11 Thread James Clark
From: Leo Yan This patch is to store virtual and physical memory addresses in packet, which will be used for memory samples. Signed-off-by: Leo Yan Signed-off-by: James Clark Reviewed-by: James Clark Tested-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo

[PATCH v2 1/6] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC

2021-02-11 Thread James Clark
From: Leo Yan This patch is to enable sample type PERF_SAMPLE_DATA_SRC for Arm SPE in the perf data, when output the tracing data, it tells tools that it contains data source in the memory event. Signed-off-by: Leo Yan Signed-off-by: James Clark Reviewed-by: James Clark Tested-by: James

[PATCH v2 3/6] perf arm-spe: Store operation type in packet

2021-02-11 Thread James Clark
From: Leo Yan This patch is to store operation type in packet structure. Signed-off-by: Leo Yan Signed-off-by: James Clark Reviewed-by: James Clark Tested-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri

[PATCH v2 4/6] perf arm-spe: Fill address info for samples

2021-02-11 Thread James Clark
virtual and physical address through packets, the address info is stored into the synthesize samples in the function arm_spe__synth_mem_sample(). Signed-off-by: Leo Yan Signed-off-by: James Clark Reviewed-by: James Clark Tested-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo

[PATCH v2 5/6] perf arm-spe: Synthesize memory event

2021-02-11 Thread James Clark
itrace option '--itrace=M' to filter out other events and only output memory events, this can significantly reduce the overhead caused by generating samples. This patch is to enable memory event for Arm SPE. Signed-off-by: Leo Yan Signed-off-by: James Clark Reviewed-by: James Clark

[PATCH v2 6/6] perf arm-spe: Set sample's data source field

2021-02-11 Thread James Clark
N/A Walker hit 0.42% 322 0 L1 miss [.] 0x09d8 serial_c [.] 0x80794580 anon N/A Walker hit Signed-off-by: Leo Yan Signed-off-by: James Clark Reviewed-by: James Clark Tested-by: James Clark Cc: Pet

Re: [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC

2021-02-11 Thread James Clark
On 22/01/2021 14:51, Arnaldo Carvalho de Melo wrote: > Em Tue, Jan 19, 2021 at 04:46:51PM +0200, James Clark escreveu: >> From: Leo Yan >> >> This patch is to enable sample type PERF_SAMPLE_DATA_SRC for Arm SPE in >> the perf data, when output the tracing d

Re: [PATCH 8/8] perf arm-spe: Set thread TID

2021-02-01 Thread James Clark
and it can always be added on top of option 1 or replace what is there. But I don't know when I would get to it or how long it will take. James > >> Signed-off-by: Leo Yan >> Signed-off-by: James Clark > > Besides for technical question, you could add your "

Re: [PATCH 4/4] perf tools: determine if LR is the return address

2021-02-08 Thread James Clark
On 22/01/2021 18:18, Alexandre Truong wrote: > +} > + > +static int add_entry(struct unwind_entry *entry, void *arg) > +{ > + struct entries *entries = arg; > + > + entries->stack[entries->i++] = entry->ip; > + return 0; > +} > + > +u64 get_leaf_frame_caller_aarch64(struct perf_samp

[RFC PATCH 1/5] perf cs-etm: Split up etm queue setup function

2021-02-09 Thread James Clark
Refactor the function into separate allocation and timestamp search parts. Later the timestamp search will be done multiple times. Signed-off-by: James Clark --- tools/perf/util/cs-etm.c | 60 +--- 1 file changed, 31 insertions(+), 29 deletions(-) diff --git

[RFC PATCH 3/5] perf cs-etm: Save aux records in each etm queue

2021-02-09 Thread James Clark
: James Clark --- tools/perf/util/cs-etm.c | 24 +++- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 9ebe43d60d1e..efe418a7c82e 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -92,12

[RFC PATCH 0/5] Split Coresight decode by aux records

2021-02-09 Thread James Clark
be pulled from elsewhere? I also have some further changes to make to make per-thread mode work where the cpu field of the sample is set to -1. And when there are no timestamps cs_etm__process_timeless_queues() is used, which is a completely different code path. Thanks James James Clark (5): pe

[RFC PATCH 4/5] perf cs-etm: don't process queues until cs_etm__flush_events

2021-02-09 Thread James Clark
processing. Signed-off-by: James Clark --- tools/perf/util/cs-etm.c | 4 1 file changed, 4 deletions(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index efe418a7c82e..0aaa1f6d2822 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -2394,10 +2394,6

[RFC PATCH 2/5] perf cs-etm: Only search timestamp in current sample's queue.

2021-02-09 Thread James Clark
rther change will be required. Also this change makes some files that had coresight data but didn't synthesise any events start working and generating events. I'm not sure of the reason for that. I'd expect this change to only affect the ordering of events. Signed-off-by: James Clark ---

[RFC PATCH 5/5] perf cs-etm: split decode by aux records.

2021-02-09 Thread James Clark
The trace data between aux records is not continuous, so the decoder must be reset between each record to ensure that parsing happens correctly and without any early exits. Signed-off-by: James Clark --- tools/perf/util/cs-etm.c | 108 --- 1 file changed, 66

Re: [PATCH 8/8] perf arm-spe: Set thread TID

2021-02-09 Thread James Clark
On 04/02/2021 12:27, Leo Yan wrote: > On Mon, Feb 01, 2021 at 07:40:45PM +0200, James Clark wrote: >> >> On 31/01/2021 14:01, Leo Yan wrote: >>> Option 1: by merging patches 07/08 and 08/08, we can firstly support PID >>> tracing for root namespace, and l

[PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC

2021-01-19 Thread James Clark
From: Leo Yan This patch is to enable sample type PERF_SAMPLE_DATA_SRC for Arm SPE in the perf data, when output the tracing data, it tells tools that it contains data source in the memory event. Signed-off-by: Leo Yan Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc

[PATCH 2/8] perf arm-spe: Store memory address in packet

2021-01-19 Thread James Clark
From: Leo Yan This patch is to store virtual and physical memory addresses in packet, which will be used for memory samples. Signed-off-by: Leo Yan Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri

[PATCH 3/8] perf arm-spe: Store operation type in packet

2021-01-19 Thread James Clark
From: Leo Yan This patch is to store operation type in packet structure. Signed-off-by: Leo Yan Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: John Garry Cc: Will

[PATCH 4/8] perf arm-spe: Fill address info for samples

2021-01-19 Thread James Clark
virtual and physical address through packets, the address info is stored into the synthesize samples in the function arm_spe__synth_mem_sample(). Signed-off-by: Leo Yan Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander

[PATCH 6/8] perf arm-spe: Set sample's data source field

2021-01-19 Thread James Clark
N/A Walker hit 0.42% 322 0 L1 miss [.] 0x09d8 serial_c [.] 0x80794580 anon N/A Walker hit Signed-off-by: Leo Yan Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalh

[PATCH 7/8] perf arm-spe: Save context ID in record

2021-01-19 Thread James Clark
From: Leo Yan This patch is to save context ID in record, this will be used to set TID for samples. Signed-off-by: Leo Yan Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim

[PATCH 8/8] perf arm-spe: Set thread TID

2021-01-19 Thread James Clark
first process is assigned to each SPE sample. Signed-off-by: Leo Yan Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: John Garry Cc: Will Deacon Cc: Mathieu Poirier Cc: Al

[PATCH 5/8] perf arm-spe: Synthesize memory event

2021-01-19 Thread James Clark
itrace option '--itrace=M' to filter out other events and only output memory events, this can significantly reduce the overhead caused by generating samples. This patch is to enable memory event for Arm SPE. Signed-off-by: Leo Yan Signed-off-by: James Clark Cc: Peter Zijlstra Cc: I

[PATCH] perf tools: Update OpenCSD to v1.0.0

2021-01-08 Thread James Clark
. Increase the minimum version number to v1.0.0 now that new enum values are used that are only present in this version. Signed-off-by: James Clark Cc: John Garry Cc: Will Deacon Cc: Mathieu Poirier Cc: Leo Yan Cc: Suzuki K Poulose Cc: Mike Leach Cc: Al Grant Cc: Peter Zijlstra Cc: Ingo Molnar

Re: [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address

2021-03-05 Thread James Clark
t the cost of not showing better frame pointer stacks by default. Tested-by: James Clark On 04/03/2021 18:32, Alexandre Truong wrote: > On arm64 and frame pointer mode (e.g: perf record --callgraph fp), > use dwarf unwind info to check if the link register is the return > address in ord

Re: [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address

2021-03-26 Thread James Clark
gs_mask = ((1ULL << PERF_REG_ARM64_MAX) - > 1); >     return callchain_param.record_mode == CALLCHAIN_FP && > sample->user_regs.regs > -       && sample->user_regs.mask == PERF_REGS_MASK; > +   && sample->user_regs.mask

Re: [PATCH 2/2] perf cs-etm: Set time on synthesised samples to preserve ordering

2021-04-15 Thread James Clark
On 15/04/2021 15:39, Leo Yan wrote: > On Wed, Apr 14, 2021 at 05:41:46PM +0300, James Clark wrote: >> Hi, >> >> For this change, I also tried removing the setting of PERF_SAMPLE_TIME in >> cs_etm__synth_events(). In theory, this would remove the sorting when

Re: [PATCH v4 1/6] perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS

2021-04-15 Thread James Clark
On 12/04/2021 12:10, Leo Yan wrote: > The enum value 'ARM_SPE_PER_CPU_MMAPS' is never used so remove it. Hi Leo, I think this causes an error when attempting to open a newly recorded file with an old version of perf. The value ARM_SPE_AUXTRACE_PRIV_MAX is used here: size_t min_sz = si

Re: [PATCH v4 0/6] perf arm-spe: Enable timestamp

2021-04-15 Thread James Clark
Hi Leo, I was looking at testing this on N1SDP and I thought I would try the round trip with perf inject and then perf report but saw that perf inject with SPE always results in an error (unrelated to your change) -> ./perf report -i per-thread-spe-time.inject.data 0x1328 [0x8]

Re: [PATCH v4 4/6] perf arm-spe: Assign kernel time to synthesized event

2021-04-15 Thread James Clark
On 12/04/2021 12:10, Leo Yan wrote: > In current code, it assigns the arch timer counter to the synthesized > samples Arm SPE trace, thus the samples don't contain the kernel time > but only contain the raw counter value. > > To fix the issue, this patch converts the timer counter to kernel tim

Re: [PATCH v4 1/6] perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS

2021-04-15 Thread James Clark
On 15/04/2021 17:41, Leo Yan wrote: > Hi James, > > On Thu, Apr 15, 2021 at 05:13:36PM +0300, James Clark wrote: >> On 12/04/2021 12:10, Leo Yan wrote: >>> The enum value 'ARM_SPE_PER_CPU_MMAPS' is never used so remove it. >> >> Hi Leo, >>

Re: [PATCH 2/2] perf cs-etm: Set time on synthesised samples to preserve ordering

2021-04-16 Thread James Clark
On 15/04/2021 17:33, Leo Yan wrote: > Hi James, > > On Thu, Apr 15, 2021 at 03:51:46PM +0300, James Clark wrote: > > [...] > >>> For the original perf data file with "--per-thread" option, the decoder >>> runs into the condition for "etm-

Re: [PATCH 2/2] perf cs-etm: Set time on synthesised samples to preserve ordering

2021-04-16 Thread James Clark
On 15/04/2021 22:54, Mathieu Poirier wrote: > On Wed, Apr 14, 2021 at 05:39:19PM +0300, James Clark wrote: >> The following attribute is set when synthesising samples in >> timed decoding mode: >> >> attr.sample_type |= PERF_SAMPLE_TIME; >> >> This res

[PATCH v2 0/2] perf cs-etm: Set time on synthesised samples to preserve ordering

2021-04-16 Thread James Clark
Changes since v1: * Improved variable name from etm_timestamp -> cs_timestamp * Fixed ordering of Signed-off-by James Clark (2): perf cs-etm: Refactor timestamp variable names perf cs-etm: Set time on synthesised samples to preserve ordering .../perf/util/cs-etm-decoder/cs-etm-decode

[PATCH v2 2/2] perf cs-etm: Set time on synthesised samples to preserve ordering

2021-04-16 Thread James Clark
tm__process_queues(). Co-developed-by: Al Grant Signed-off-by: Al Grant Signed-off-by: James Clark --- tools/perf/util/cs-etm.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 533f6f2f0685..e5c1a1b22a2a 100

[PATCH v2 1/2] perf cs-etm: Refactor timestamp variable names

2021-04-16 Thread James Clark
refers to sample kernel timestamps, and the /timestamp/ event modifier refers to CS timestamps, so the term is overloaded. Signed-off-by: James Clark --- .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 18 tools/perf/util/cs-etm.c | 42 +-- tools/perf

Re: [PATCH v4 4/6] perf arm-spe: Assign kernel time to synthesized event

2021-04-16 Thread James Clark
On 15/04/2021 18:23, Leo Yan wrote: > On Thu, Apr 15, 2021 at 05:46:31PM +0300, James Clark wrote: >> >> >> On 12/04/2021 12:10, Leo Yan wrote: >>> In current code, it assigns the arch timer counter to the synthesized >>> samples Arm SPE trace, thus th

[PATCH 1/2] perf cs-etm: Refactor timestamp variable names

2021-04-14 Thread James Clark
refers to sample kernel timestamps, and the /timestamp/ event modifier refers to etm timestamps, so the term is overloaded. Signed-off-by: James Clark --- .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 18 tools/perf/util/cs-etm.c | 42 +-- tools

[PATCH 2/2] perf cs-etm: Set time on synthesised samples to preserve ordering

2021-04-14 Thread James Clark
to cs_etm__process_queues(). Signed-off-by: James Clark Co-developed-by: Al Grant Signed-off-by: Al Grant --- tools/perf/util/cs-etm.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index c25da2ffa8f3..d0fa9dce47f1 100

Re: [PATCH 2/2] perf cs-etm: Set time on synthesised samples to preserve ordering

2021-04-14 Thread James Clark
or maybe just checking the options, although that's not how it's done in cs_etm__is_timeless_decoding() currently). Or, we could force /time/ and /timestamp/ options to always be enabled together in the record stage. Thanks James On 14/04/2021 17:39, James Clark wrote: > The follow

Re: [PATCH v2 0/2] perf cs-etm: Set time on synthesised samples to preserve ordering

2021-04-19 Thread James Clark
On 16/04/2021 18:16, Arnaldo Carvalho de Melo wrote: > Em Fri, Apr 16, 2021 at 09:07:09AM -0600, Mathieu Poirier escreveu: >> Hi James, >> >> On Fri, Apr 16, 2021 at 01:56:30PM +0300, James Clark wrote: >>> Changes since v1: >>> * Improved variable

[PATCH v5 01/12] perf tools: Improve topology test

2020-11-17 Thread James Clark
Improve the topology test to check all aggregation types. This is to lock down the behaviour before 'id' is changed into a struct in later commits. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin

[PATCH v5 06/12] perf tools: drop in cpu_aggr_map struct

2020-11-17 Thread James Clark
Replace usages of perf_cpu_map with cpu_aggr_map in places that are involved with perf stat aggregation. This will then later be changed to be a map of cpu_aggr_id rather than an int so that more data can be stored. No functional changes. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo

[PATCH v5 07/12] perf tools: Start using cpu_aggr_id in map

2020-11-17 Thread James Clark
Use the new cpu_aggr_id struct in the cpu map instead of int so that it can store more data. No functional changes. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas

[PATCH v5 11/12] perf tools: Add separate core member

2020-11-17 Thread James Clark
Add core as a separate member so that it doesn't have to be packed into the int value. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc: John

[PATCH v5 09/12] perf tools: Add separate socket member

2020-11-17 Thread James Clark
socket ID: ./perf stat --per-die -a Performance counter stats for 'system wide': S36-D0 128 169,869.39 msec cpu-clock # 127.501 CPUs utilized ... S3612-D0 128 169,733.05 msec cpu-clock # 127.398 CPUs ut

[PATCH v5 05/12] perf tools: add new map type for aggregation

2020-11-17 Thread James Clark
Currently this is a duplicate of perf_cpu_map so that it can be used as a drop in replacement. In a later commit it will be changed from a map of ints to use the new cpu_aggr_id struct. No functional changes. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho

[PATCH v5 00/12] perf tools: fix perf stat with large socket IDs

2020-11-17 Thread James Clark
't have to be changed to static in a separate commit James Clark (12): perf tools: Improve topology test perf tools: Use allocator for perf_cpu_map perf tools: Add new struct for cpu aggregation perf tools: Replace aggregation ID with a struct perf tools: add new map type for aggreg

[PATCH v5 12/12] perf tools: Add separate thread member

2020-11-17 Thread James Clark
ch is now no longer used. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc: John Garry --- tools/perf/tests/topology.c| 8 tools/perf

Re: [PATCH 02/13 v4] perf tools: Use allocator for perf_cpu_map

2020-11-17 Thread James Clark
On 15/11/2020 23:17, Jiri Olsa wrote: > On Fri, Nov 13, 2020 at 07:26:43PM +0200, James Clark wrote: >> Use the existing allocator for perf_cpu_map to avoid use >> of raw malloc. This could cause an issue in later commits >> where the size of perf_cpu_map is changed. >>

Re: [PATCH 04/13 v4] perf tools: Replace aggregation ID with a struct

2020-11-17 Thread James Clark
On 15/11/2020 23:17, Jiri Olsa wrote: > On Fri, Nov 13, 2020 at 07:26:45PM +0200, James Clark wrote: > > SNIP > >> @@ -754,7 +766,7 @@ static void print_aggr_thread(struct perf_stat_config >> *config, >> FILE *output = config->output; >> int

[PATCH v5 03/12] perf tools: Add new struct for cpu aggregation

2020-11-17 Thread James Clark
This struct currently has only a single int member so that it can be used as a drop in replacement for the existing behaviour. Comparison and constructor functions have also been added that will replace usages of '==' and '= -1'. No functional changes. Signed-off-by: Ja
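A rough sketch of what such a wrapper could look like is given below. The struct name `cpu_aggr_id` comes from the snippets in this thread, but the member name `id` and the helper names `aggr_id_empty()`, `aggr_id_equal()` and `aggr_id_is_empty()` are assumptions made for illustration, not the names used in the patch.

```c
#include <stdbool.h>

/* Single int member, so it can stand in for the old plain-int ID. */
struct cpu_aggr_id {
	int id; /* member name assumed for illustration */
};

/* Hypothetical constructor: replaces assignments of the form "= -1". */
static struct cpu_aggr_id aggr_id_empty(void)
{
	struct cpu_aggr_id a = { .id = -1 };
	return a;
}

/* Hypothetical comparison: replaces "==" on the old int IDs. */
static bool aggr_id_equal(struct cpu_aggr_id a, struct cpu_aggr_id b)
{
	return a.id == b.id;
}

/* Convenience check built from the two helpers above. */
static bool aggr_id_is_empty(struct cpu_aggr_id a)
{
	return aggr_id_equal(a, aggr_id_empty());
}
```

Routing every comparison and every "empty" assignment through helpers like these means that members added to the struct later only require changes inside the helpers, not at each call site.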

[PATCH v5 02/12] perf tools: Use allocator for perf_cpu_map

2020-11-17 Thread James Clark
Use the existing allocator for perf_cpu_map to avoid use of raw malloc. This could cause an issue in later commits where the size of perf_cpu_map is changed. No functional changes. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc

[PATCH v5 08/12] perf tools: Add separate node member

2020-11-17 Thread James Clark
Add node as a separate member so that it doesn't have to be packed into the int value. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc: John

[PATCH v5 10/12] perf tools: Add separate die member

2020-11-17 Thread James Clark
Add die as a separate member so that it doesn't have to be packed into the int value. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc: John
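To illustrate why separate members help with large IDs, the sketch below contrasts a packed integer with a struct. The bit widths (8 bits for socket, 8 for die, 16 for core) and all names are assumptions chosen for the example; they are not taken from the patch. The value 3612 is one of the socket IDs that appears in the sample output quoted elsewhere in this series.

```c
#include <stdint.h>

/* Assumed packed layout for illustration: socket[31:24] die[23:16] core[15:0]. */
static uint32_t pack_id(uint32_t socket, uint32_t die, uint32_t core)
{
	return ((socket & 0xffu) << 24) | ((die & 0xffu) << 16) | (core & 0xffffu);
}

/* Recover the socket field from the packed value. */
static uint32_t packed_socket(uint32_t id)
{
	return id >> 24;
}

/* With separate members, each field keeps its full int range. */
struct split_id {
	int socket;
	int die;
	int core;
};
```

Under these assumed widths a socket ID of 3612 does not fit in 8 bits and is silently truncated to 28 when packed, whereas `struct split_id` stores it unchanged.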

[PATCH v5 04/12] perf tools: Replace aggregation ID with a struct

2020-11-17 Thread James Clark
Replace all occurrences of the usage of int with the new struct cpu_aggr_id. No functional changes. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc

Re: [PATCH 07/13 v4] perf tools: restrict visibility of functions

2020-11-17 Thread James Clark
On 15/11/2020 23:17, Jiri Olsa wrote: > On Fri, Nov 13, 2020 at 07:26:48PM +0200, James Clark wrote: >> These cpu_aggr_map refcounting functions are only used in >> builtin-stat.c so their visibility can be reduced to just >> that file. >> >> No functional ch

Re: [PATCH 13/13 v4] perf tools: add thread field

2020-11-17 Thread James Clark
On 15/11/2020 23:17, Jiri Olsa wrote: > On Fri, Nov 13, 2020 at 07:26:54PM +0200, James Clark wrote: >> A separate field isn't strictly required. The core >> field could be re-used for thread IDs as a single >> field was used previously. >> >> But separati

Re: [PATCH] perf tools: add aarch64 registers to --user-regs

2020-11-30 Thread James Clark
x40x006c > x50x00100101 >... thread: ls:51956 > .. dso: /usr/lib64/ld-2.17.so > Checked that the registers can be listed with =? and that recording different combinations of registers works as expected. Tested-by: James Clark

[PATCH] drivers/perf: Enable PID_IN_CONTEXTIDR with SPE

2020-11-30 Thread James Clark
small performance overhead when enabling PID_IN_CONTEXTIDR, but SPE itself is optional and not enabled by default so the impact is minimised. Cc: Will Deacon Cc: Mark Rutland Cc: Al Grant Cc: Leo Yan Cc: John Garry Cc: Suzuki K Poulose Signed-off-by: James Clark --- drivers/perf/Kconfig | 1 + 1

Re: [PATCH v5 01/12] perf tools: Improve topology test

2020-11-26 Thread James Clark
On 18/11/2020 13:21, Namhyung Kim wrote: > Hello, > > On Tue, Nov 17, 2020 at 11:49 PM James Clark wrote: >> >> Improve the topology test to check all aggregation >> types. This is to lock down the behaviour before >> 'id' is changed into a struct i

[PATCH v6 01/12] perf tools: Improve topology test

2020-11-26 Thread James Clark
Improve the topology test to check all aggregation types. This is to lock down the behaviour before 'id' is changed into a struct in later commits. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin

[PATCH v6 04/12] perf tools: Replace aggregation ID with a struct

2020-11-26 Thread James Clark
Replace all occurrences of the usage of int with the new struct cpu_aggr_id. No functional changes. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc

[PATCH v6 05/12] perf tools: add new map type for aggregation

2020-11-26 Thread James Clark
Currently this is a duplicate of perf_cpu_map so that it can be used as a drop in replacement. In a later commit it will be changed from a map of ints to use the new cpu_aggr_id struct. No functional changes. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho

[PATCH v6 00/12] perf tools: fix perf stat with large socket IDs

2020-11-26 Thread James Clark
Changes since v5: * Fix test for cpu_map__get_die() by shifting id before testing. * Fix test for cpu_map__get_socket() by not using cpu_map__id_to_socket() which is only valid in CPU aggregation mode. James Clark (12): perf tools: Improve topology test perf tools: Use allocator for

[PATCH v6 02/12] perf tools: Use allocator for perf_cpu_map

2020-11-26 Thread James Clark
Use the existing allocator for perf_cpu_map to avoid use of raw malloc. This could cause an issue in later commits where the size of perf_cpu_map is changed. No functional changes. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc

[PATCH v6 03/12] perf tools: Add new struct for cpu aggregation

2020-11-26 Thread James Clark
This struct currently has only a single int member so that it can be used as a drop in replacement for the existing behaviour. Comparison and constructor functions have also been added that will replace usages of '==' and '= -1'. No functional changes. Signed-off-by: Ja

[PATCH v6 07/12] perf tools: Start using cpu_aggr_id in map

2020-11-26 Thread James Clark
Use the new cpu_aggr_id struct in the cpu map instead of int so that it can store more data. No functional changes. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas

[PATCH v6 08/12] perf tools: Add separate node member

2020-11-26 Thread James Clark
Add node as a separate member so that it doesn't have to be packed into the int value. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc: John

[PATCH v6 12/12] perf tools: Add separate thread member

2020-11-26 Thread James Clark
ch is now no longer used. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc: John Garry --- tools/perf/tests/topology.c| 8 tools/perf

[PATCH v6 11/12] perf tools: Add separate core member

2020-11-26 Thread James Clark
Add core as a separate member so that it doesn't have to be packed into the int value. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc: John

[PATCH v6 06/12] perf tools: drop in cpu_aggr_map struct

2020-11-26 Thread James Clark
Replace usages of perf_cpu_map with cpu_aggr_map in places that are involved with perf stat aggregation. This will then later be changed to be a map of cpu_aggr_id rather than an int so that more data can be stored. No functional changes. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo

[PATCH v6 10/12] perf tools: Add separate die member

2020-11-26 Thread James Clark
Add die as a separate member so that it doesn't have to be packed into the int value. Signed-off-by: James Clark Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc: John

[PATCH v6 09/12] perf tools: Add separate socket member

2020-11-26 Thread James Clark
socket ID: ./perf stat --per-die -a Performance counter stats for 'system wide': S36-D0 128 169,869.39 msec cpu-clock # 127.501 CPUs utilized ... S3612-D0 128 169,733.05 msec cpu-clock # 127.398 CPUs ut

Re: [PATCH 0/7] Split Coresight decode by aux records

2021-03-01 Thread James Clark
> Thanks, > Mathieu > > On Fri, Feb 12, 2021 at 04:45:06PM +0200, James Clark wrote: >> Hi All, >> >> Since my previous RFC, I've fixed --per-thread mode and solved >> most of the open questions. I've also changed --dump-raw-trace >> to use the

Re: [PATCH 2/7] perf cs-etm: Only search timestamp in current sample's queue.

2021-03-01 Thread James Clark
On 20/02/2021 13:50, Leo Yan wrote: > On Fri, Feb 12, 2021 at 04:45:08PM +0200, James Clark wrote: >> Change initial timestamp search to only operate on the queue >> related to the current event. In a later change the bounds >> of the aux record will also be used to reset t

Re: [PATCH 3/7] perf cs-etm: Save aux records in each etm queue

2021-03-01 Thread James Clark
On 27/02/2021 09:10, Leo Yan wrote: > On Fri, Feb 12, 2021 at 04:45:09PM +0200, James Clark wrote: >> The aux records will be used to set the bounds of decoding in a >> later commit. In the future we may also want to use the flags >> of each record to control decoding. >

Re: [PATCH 4/4] perf tools: determine if LR is the return address

2021-01-26 Thread James Clark
On 24/01/2021 02:05, Jiri Olsa wrote: > On Fri, Jan 22, 2021 at 04:18:54PM +, Alexandre Truong wrote: >> On arm64 and frame pointer mode (e.g: perf record --callgraph fp), >> use dwarf unwind info to check if the link register is the return >> address in order to inject it to the frame point

[PATCH v2] drivers/perf: Enable PID_IN_CONTEXTIDR with SPE

2020-12-14 Thread James Clark
small performance overhead when enabling PID_IN_CONTEXTIDR, but SPE itself is optional and not enabled by default so the impact is minimised. Cc: Will Deacon Cc: Mark Rutland Cc: Al Grant Cc: Leo Yan Cc: John Garry Cc: Suzuki K Poulose Cc: Mathieu Poirier Cc: Catalin Marinas Signed-off-by: James

Re: [PATCH] drivers/perf: Enable PID_IN_CONTEXTIDR with SPE

2020-12-14 Thread James Clark
On 02/12/2020 01:09, Will Deacon wrote: > On Tue, Dec 01, 2020 at 12:10:40PM +0800, Leo Yan wrote: >> On Mon, Nov 30, 2020 at 04:46:51PM +, Will Deacon wrote: >>> On Mon, Nov 30, 2020 at 06:24:54PM +0200, James Clark wrote: >>>> Enable PID_IN_CONTEXTIDR by d

Re: [RFC PATCH 3/3] perf report: add --spe options for arm-spe

2019-08-21 Thread James Clark
Hi, I also had a look at this and had a question about the --spe option. It seems that whatever options I give it, the output is the same: perf report And perf report --spe=t Both give the same result: # Samples: 4 of event 'llc-miss' # Event count (approx.): 4

[PATCH] Fixes hang in zstd compression test by changing the source of random data.

2019-08-22 Thread James Clark
running it. Signed-off-by: James Clark --- tools/perf/tests/shell/record+zstd_comp_decomp.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/tests/shell/record+zstd_comp_decomp.sh b/tools/perf/tests/shell/record+zstd_comp_decomp.sh index 899604d1..63a91ec 100755 --

[PATCH v2 0/1] perf tools: Add PMU event JSON files for ARM Cortex-A76 and, Neoverse N1.

2019-09-02 Thread James Clark
Resubmitting due to the previous patch having a disclaimer appended. I've tested that this applies cleanly with git am. James Clark (1): perf tools: Add PMU event JSON files for ARM Cortex-A76 and, Neoverse N1. .../arch/arm64/arm/cortex-a76-n1/branch.json | 14 ++ .../arch/arm6

[PATCH v2 1/1] perf tools: Add PMU event JSON files for ARM Cortex-A76 and, Neoverse N1.

2019-09-02 Thread James Clark
: https://static.docs.arm.com/100798/0400/cortex_a76_trm_100798_0400_00_en.pdf Signed-off-by: James Clark Cc: Jeremy Linton Cc: Suzuki Poulose Cc: Mark Rutland Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim

Re: [PATCH] Fixes hang in zstd compression test by changing the source of random data.

2019-08-23 Thread James Clark
Sorry about that, I will look into it. Thanks James On 22/08/2019 22:24, Arnaldo Carvalho de Melo wrote: > Em Thu, Aug 22, 2019 at 06:24:07PM -0300, Arnaldo Carvalho de Melo escreveu: >> Em Thu, Aug 22, 2019 at 01:55:15PM +0000, James Clark escreveu: >>> Running 'perf test

Re: [RFC PATCH 2/3] perf tools: Add support for "report" for some spe events

2019-10-09 Thread James Clark
Hi Xiaojun, > By the way, you mentioned before that you want the spe event to be in the > form of "event:pp" like pebs. Is that the whole framework should be made > similar to pebs? Or is it just a modification to the command format? We're currently still investigating if it makes sense to mod

Re: [RFC PATCH 2/3] perf tools: Add support for "report" for some spe events

2019-10-04 Thread James Clark
Hi Xiaojun, I wanted to ask if you are still working on this? I've noticed that it doesn't apply cleanly to perf/core anymore and I was working on re-basing it. Would you be interested in me posting my progress? I was also interested in decoding the "data source" of events and displaying that

Re: [RFC PATCH 2/3] perf tools: Add support for "report" for some spe events

2019-10-16 Thread James Clark
Hi Xiaojun, >> >> What do you mean when the user specifies "event:pp", if the SPE is >> available, configure and record the spe data directly via the perf event >> open syscall? >> (perf.data itself is the same as using -e arm_spe_0//xxx?) > > I mean, for the perf record, if the user does not a

[PATCH 0/1] perf tools: Add PMU event JSON files for ARM Cortex-A76 and, Neoverse N1.

2019-07-26 Thread James Clark
Hi, I'm a developer at ARM and I'm new to upstreaming here. I'd like to submit this patch for event counters for two new ARM CPUs. At some point in the future I will also continue work around arm-spe.c to improve SPE support. Thanks James IMPORTANT NOTICE: The contents of this email and any a

[PATCH 1/1] perf tools: Add PMU event JSON files for ARM Cortex-A76 and, Neoverse N1.

2019-07-26 Thread James Clark
: https://static.docs.arm.com/100798/0400/cortex_a76_trm_100798_0400_00_en.pdf Signed-off-by: James Clark --- .../arch/arm64/arm/cortex-a76-n1/branch.json | 14 ++ .../arch/arm64/arm/cortex-a76-n1/bus.json | 24 +++ .../arch/arm64/arm/cortex-a76-n1/cache.json| 207
