or if they can be pulled
from elsewhere.
I've also tested perf inject which is now working with troublesome
files.
Thanks
James
James Clark (7):
perf cs-etm: Split up etm queue setup function
perf cs-etm: Only search timestamp in current sample's queue.
perf cs-etm: Save aux
Refactor the function into separate allocation and
timestamp search parts. Later the timestamp search
will be done multiple times.
Signed-off-by: James Clark
---
tools/perf/util/cs-etm.c | 60 +---
1 file changed, 31 insertions(+), 29 deletions(-)
diff --git
processing.
Signed-off-by: James Clark
---
tools/perf/util/cs-etm.c | 4
1 file changed, 4 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 88b541b2a804..5ab037c2dabe 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2398,10 +2398,6
't synthesise any events start working and generating
events. I'm not sure of the reason for that. I'd expect this
change to only affect the ordering of events.
Signed-off-by: James Clark
---
tools/perf/util/cs-etm.c | 30 ++
1 file changed, 14 insertions(+
: James Clark
---
tools/perf/util/cs-etm.c | 32 +---
1 file changed, 29 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 8f8b448632fb..88b541b2a804 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
printing has to
be suppressed around each call to reset.
Signed-off-by: James Clark
---
tools/perf/util/cs-etm.c | 91
1 file changed, 36 insertions(+), 55 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 3026fcf50b5d
The trace data between aux records is not continuous, so the decoder
must be reset between each record to ensure that parsing happens
correctly and without any early exits.
Signed-off-by: James Clark
---
tools/perf/util/cs-etm.c | 109 +++
1 file changed, 64
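The reset-per-record idea described in this commit message can be sketched in a small, self-contained form. This is not the cs-etm code: `struct aux_record`, `struct toy_decoder`, `toy_decoder_reset()`, `toy_decode()` and the `TOY_SYNC` marker are all hypothetical names invented for illustration, and the "trace format" is a toy one where bytes only count as packets after a sync marker has been seen.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one aux record: a byte range in the trace buffer. */
struct aux_record {
	size_t offset;
	size_t size;
};

/* Hypothetical decoder state; a real trace decoder carries far more. */
struct toy_decoder {
	bool synced;    /* true once a sync marker has been seen */
	size_t packets; /* packets decoded so far */
};

#define TOY_SYNC 0xff /* invented sync marker for this sketch */

static void toy_decoder_reset(struct toy_decoder *d)
{
	/* Drop sync so state from the previous record cannot leak across the gap. */
	d->synced = false;
}

static void toy_decode(struct toy_decoder *d, const uint8_t *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (buf[i] == TOY_SYNC) {
			d->synced = true;
			continue;
		}
		/* Bytes before the first sync marker are skipped, not decoded. */
		if (d->synced)
			d->packets++;
	}
}

/* Decode each record on its own, resetting the decoder between records. */
static size_t decode_records(const uint8_t *trace,
			     const struct aux_record *recs, size_t nr)
{
	struct toy_decoder d = { .synced = false, .packets = 0 };

	for (size_t i = 0; i < nr; i++) {
		toy_decoder_reset(&d);
		toy_decode(&d, trace + recs[i].offset, recs[i].size);
	}
	return d.packets;
}
```

Without the reset, bytes at the start of the second record would be interpreted using sync state left over from the first, which is the kind of misparse the commit message is guarding against.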
The decoder is quite noisy when being reset. Now that dump-raw-trace
uses a code path that resets the decoder rather than creating a new
one, printing has to be suppressed to not flood the output.
Signed-off-by: James Clark
---
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 +++---
1
On 09/02/2021 17:36, James Clark wrote:
>
>
> On 04/02/2021 12:27, Leo Yan wrote:
>> On Mon, Feb 01, 2021 at 07:40:45PM +0200, James Clark wrote:
>>>
>>> On 31/01/2021 14:01, Leo Yan wrote:
>>>> Option 1: by merging patches 07/08 and 08/08, we
From: Leo Yan
This patch is to store virtual and physical memory addresses in packet,
which will be used for memory samples.
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Reviewed-by: James Clark
Tested-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
From: Leo Yan
This patch enables the sample type PERF_SAMPLE_DATA_SRC for Arm SPE in
the perf data. When the tracing data is output, it tells tools that the
memory event contains a data source.
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Reviewed-by: James Clark
Tested-by: James
From: Leo Yan
This patch is to store operation type in packet structure.
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Reviewed-by: James Clark
Tested-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri
virtual and physical address through
packets, the address info is stored into the synthesize samples in the
function arm_spe__synth_mem_sample().
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Reviewed-by: James Clark
Tested-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo
itrace option '--itrace=M' to filter out other events and only
output memory events, this can significantly reduce the overhead
caused by generating samples.
This patch is to enable memory event for Arm SPE.
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Reviewed-by: James Clark
N/A
Walker hit
0.42% 322 0 L1 miss [.]
0x09d8 serial_c [.] 0x80794580 anon N/A
Walker hit
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Reviewed-by: James Clark
Tested-by: James Clark
Cc: Pet
On 22/01/2021 14:51, Arnaldo Carvalho de Melo wrote:
> Em Tue, Jan 19, 2021 at 04:46:51PM +0200, James Clark escreveu:
>> From: Leo Yan
>>
>> This patch is to enable sample type PERF_SAMPLE_DATA_SRC for Arm SPE in
>> the perf data, when output the tracing d
and it can always be added on top of
option 1 or replace what is there. But I don't know when I would get to it or
how long it will take.
James
>
>> Signed-off-by: Leo Yan
>> Signed-off-by: James Clark
>
> Besides for technical question, you could add your "
On 22/01/2021 18:18, Alexandre Truong wrote:
> +}
> +
> +static int add_entry(struct unwind_entry *entry, void *arg)
> +{
> + struct entries *entries = arg;
> +
> + entries->stack[entries->i++] = entry->ip;
> + return 0;
> +}
> +
> +u64 get_leaf_frame_caller_aarch64(struct perf_samp
Refactor the function into separate allocation and
timestamp search parts. Later the timestamp search
will be done multiple times.
Signed-off-by: James Clark
---
tools/perf/util/cs-etm.c | 60 +---
1 file changed, 31 insertions(+), 29 deletions(-)
diff --git
: James Clark
---
tools/perf/util/cs-etm.c | 24 +++-
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 9ebe43d60d1e..efe418a7c82e 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -92,12
be pulled from elsewhere?
I also have some further changes to make so that per-thread mode works
where the cpu field of the sample is set to -1. And when there are
no timestamps cs_etm__process_timeless_queues() is used, which is a
completely different code path.
Thanks
James
James Clark (5):
pe
processing.
Signed-off-by: James Clark
---
tools/perf/util/cs-etm.c | 4
1 file changed, 4 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index efe418a7c82e..0aaa1f6d2822 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2394,10 +2394,6
rther change will be required.
Also this change makes some files that had coresight data
but didn't synthesise any events start working and generating
events. I'm not sure of the reason for that. I'd expect this
change to only affect the ordering of events.
Signed-off-by: James Clark
---
The trace data between aux records is not continuous, so the decoder
must be reset between each record to ensure that parsing happens
correctly and without any early exits.
Signed-off-by: James Clark
---
tools/perf/util/cs-etm.c | 108 ---
1 file changed, 66
On 04/02/2021 12:27, Leo Yan wrote:
> On Mon, Feb 01, 2021 at 07:40:45PM +0200, James Clark wrote:
>>
>> On 31/01/2021 14:01, Leo Yan wrote:
>>> Option 1: by merging patches 07/08 and 08/08, we can firstly support PID
>>> tracing for root namespace, and l
From: Leo Yan
This patch enables the sample type PERF_SAMPLE_DATA_SRC for Arm SPE in
the perf data. When the tracing data is output, it tells tools that the
memory event contains a data source.
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc
From: Leo Yan
This patch is to store virtual and physical memory addresses in packet,
which will be used for memory samples.
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri
From: Leo Yan
This patch is to store operation type in packet structure.
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: John Garry
Cc: Will
virtual and physical address through
packets, the address info is stored into the synthesize samples in the
function arm_spe__synth_mem_sample().
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander
N/A
Walker hit
0.42% 322 0 L1 miss [.]
0x09d8 serial_c [.] 0x80794580 anon N/A
Walker hit
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalh
From: Leo Yan
This patch saves the context ID in the record; this will be used to set
the TID for samples.
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
first process is assigned
to each SPE sample.
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: John Garry
Cc: Will Deacon
Cc: Mathieu Poirier
Cc: Al
itrace option '--itrace=M' to filter out other events and only
output memory events, this can significantly reduce the overhead
caused by generating samples.
This patch is to enable memory event for Arm SPE.
Signed-off-by: Leo Yan
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: I
.
Increase the minimum version number to v1.0.0 now
that new enum values are used that are only present
in this version.
Signed-off-by: James Clark
Cc: John Garry
Cc: Will Deacon
Cc: Mathieu Poirier
Cc: Leo Yan
Cc: Suzuki K Poulose
Cc: Mike Leach
Cc: Al Grant
Cc: Peter Zijlstra
Cc: Ingo Molnar
t the cost of not showing
better frame pointer
stacks by default.
Tested-by: James Clark
On 04/03/2021 18:32, Alexandre Truong wrote:
> On arm64 and frame pointer mode (e.g: perf record --callgraph fp),
> use dwarf unwind info to check if the link register is the return
> address in ord
gs_mask = ((1ULL << PERF_REG_ARM64_MAX) -
> 1);
> return callchain_param.record_mode == CALLCHAIN_FP &&
> sample->user_regs.regs
> - && sample->user_regs.mask == PERF_REGS_MASK;
> + && sample->user_regs.mask
On 15/04/2021 15:39, Leo Yan wrote:
> On Wed, Apr 14, 2021 at 05:41:46PM +0300, James Clark wrote:
>> Hi,
>>
>> For this change, I also tried removing the setting of PERF_SAMPLE_TIME in
>> cs_etm__synth_events(). In theory, this would remove the sorting when
>>
On 12/04/2021 12:10, Leo Yan wrote:
> The enum value 'ARM_SPE_PER_CPU_MMAPS' is never used so remove it.
Hi Leo,
I think this causes an error when attempting to open a newly recorded file
with an old version of perf. The value ARM_SPE_AUXTRACE_PRIV_MAX is used here:
size_t min_sz = si
Hi Leo,
I was looking at testing this on N1SDP and I thought I would try the round trip
with perf inject and
then perf report but saw that perf inject with SPE always results in an error
(unrelated to your change)
-> ./perf report -i per-thread-spe-time.inject.data
0x1328 [0x8]
On 12/04/2021 12:10, Leo Yan wrote:
> In current code, it assigns the arch timer counter to the synthesized
> samples Arm SPE trace, thus the samples don't contain the kernel time
> but only contain the raw counter value.
>
> To fix the issue, this patch converts the timer counter to kernel tim
On 15/04/2021 17:41, Leo Yan wrote:
> Hi James,
>
> On Thu, Apr 15, 2021 at 05:13:36PM +0300, James Clark wrote:
>> On 12/04/2021 12:10, Leo Yan wrote:
>>> The enum value 'ARM_SPE_PER_CPU_MMAPS' is never used so remove it.
>>
>> Hi Leo,
>>
On 15/04/2021 17:33, Leo Yan wrote:
> Hi James,
>
> On Thu, Apr 15, 2021 at 03:51:46PM +0300, James Clark wrote:
>
> [...]
>
>>> For the original perf data file with "--per-thread" option, the decoder
>>> runs into the condition for "etm-
On 15/04/2021 22:54, Mathieu Poirier wrote:
> On Wed, Apr 14, 2021 at 05:39:19PM +0300, James Clark wrote:
>> The following attribute is set when synthesising samples in
>> timed decoding mode:
>>
>> attr.sample_type |= PERF_SAMPLE_TIME;
>>
>> This res
Changes since v1:
* Improved variable name from etm_timestamp -> cs_timestamp
* Fixed ordering of Signed-off-by
James Clark (2):
perf cs-etm: Refactor timestamp variable names
perf cs-etm: Set time on synthesised samples to preserve ordering
.../perf/util/cs-etm-decoder/cs-etm-decode
tm__process_queues().
Co-developed-by: Al Grant
Signed-off-by: Al Grant
Signed-off-by: James Clark
---
tools/perf/util/cs-etm.c | 10 --
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 533f6f2f0685..e5c1a1b22a2a 100
refers to
sample kernel timestamps, and the /timestamp/ event modifier
refers to CS timestamps, so the term is overloaded.
Signed-off-by: James Clark
---
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 18
tools/perf/util/cs-etm.c | 42 +--
tools/perf
On 15/04/2021 18:23, Leo Yan wrote:
> On Thu, Apr 15, 2021 at 05:46:31PM +0300, James Clark wrote:
>>
>>
>> On 12/04/2021 12:10, Leo Yan wrote:
>>> In current code, it assigns the arch timer counter to the synthesized
>>> samples Arm SPE trace, thus th
refers to
sample kernel timestamps, and the /timestamp/ event modifier
refers to etm timestamps, so the term is overloaded.
Signed-off-by: James Clark
---
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 18
tools/perf/util/cs-etm.c | 42 +--
tools
to cs_etm__process_queues().
Signed-off-by: James Clark
Co-developed-by: Al Grant
Signed-off-by: Al Grant
---
tools/perf/util/cs-etm.c | 10 --
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index c25da2ffa8f3..d0fa9dce47f1 100
or maybe just checking the options, although that's not how it's done
in cs_etm__is_timeless_decoding() currently).
Or, we could force /time/ and /timestamp/ options to always be enabled together
in the record stage.
Thanks
James
On 14/04/2021 17:39, James Clark wrote:
> The follow
On 16/04/2021 18:16, Arnaldo Carvalho de Melo wrote:
> Em Fri, Apr 16, 2021 at 09:07:09AM -0600, Mathieu Poirier escreveu:
>> Hi James,
>>
>> On Fri, Apr 16, 2021 at 01:56:30PM +0300, James Clark wrote:
>>> Changes since v1:
>>> * Improved variable
Improve the topology test to check all aggregation
types. This is to lock down the behaviour before
'id' is changed into a struct in later commits.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Replace usages of perf_cpu_map with cpu_aggr map in
places that are involved with perf stat aggregation.
This will then later be changed to be a map of
cpu_aggr_id rather than an int so that more data can
be stored.
No functional changes.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo
Use the new cpu_aggr_id struct in the cpu map
instead of int so that it can store more data.
No functional changes.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas
Add core as a separate member so that it doesn't have to be
packed into the int value.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Richter
Cc: John
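A rough sketch of why separate members are easier to work with than a packed int follows. The bit layout here is invented for illustration (it is not taken from the patch), and `pack_core_id()`, `unpack_socket()` and `struct aggr_id_sketch` are hypothetical names. The point is only that a packed field has a fixed width, so a large socket ID such as 3612 cannot survive a round trip through an 8-bit field, whereas a struct member stores it unchanged.

```c
#include <stdint.h>

/*
 * Invented layout for this sketch only:
 * socket in bits 16-23, core in bits 0-15 of a single int.
 */
static int pack_core_id(int socket, int core)
{
	return ((socket & 0xff) << 16) | (core & 0xffff);
}

static int unpack_socket(int packed)
{
	return (packed >> 16) & 0xff;
}

/* Hypothetical struct form: each value gets its own full-width member. */
struct aggr_id_sketch {
	int socket;
	int core;
};

static struct aggr_id_sketch make_aggr_id(int socket, int core)
{
	struct aggr_id_sketch id = { .socket = socket, .core = core };

	return id;
}
```

With this layout, `unpack_socket(pack_core_id(3612, 5))` yields `3612 & 0xff` rather than 3612, while `make_aggr_id(3612, 5).socket` keeps the full value.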
socket ID:
./perf stat --per-die -a
Performance counter stats for 'system wide':
S36-D0 128 169,869.39 msec cpu-clock #
127.501 CPUs utilized
...
S3612-D0 128 169,733.05 msec cpu-clock #
127.398 CPUs ut
Currently this is a duplicate of perf_cpu_map so that
it can be used as a drop in replacement.
In a later commit it will be changed from a map of ints
to use the new cpu_aggr_id struct.
No functional changes.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho
't have
to be changed to static in a separate commit
James Clark (12):
perf tools: Improve topology test
perf tools: Use allocator for perf_cpu_map
perf tools: Add new struct for cpu aggregation
perf tools: Replace aggregation ID with a struct
perf tools: add new map type for aggreg
ch is now
no longer used.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Richter
Cc: John Garry
---
tools/perf/tests/topology.c| 8
tools/perf
On 15/11/2020 23:17, Jiri Olsa wrote:
> On Fri, Nov 13, 2020 at 07:26:43PM +0200, James Clark wrote:
>> Use the existing allocator for perf_cpu_map to avoid use
>> of raw malloc. This could cause an issue in later commits
>> where the size of perf_cpu_map is changed.
>>
On 15/11/2020 23:17, Jiri Olsa wrote:
> On Fri, Nov 13, 2020 at 07:26:45PM +0200, James Clark wrote:
>
> SNIP
>
>> @@ -754,7 +766,7 @@ static void print_aggr_thread(struct perf_stat_config
>> *config,
>> FILE *output = config->output;
>> int
This struct currently has only a single int member so that
it can be used as a drop in replacement for the existing
behaviour.
Comparison and constructor functions have also been added
that will replace usages of '==' and '= -1'.
No functional changes.
Signed-off-by: Ja
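As a minimal sketch of the drop-in replacement described above: the struct name `cpu_aggr_id` comes from this series, but the helper names below (`cpu_aggr_id_equal()`, `cpu_aggr_id_empty()`, `cpu_aggr_id_is_empty()`) are hypothetical and may not match what the patch actually adds.

```c
#include <stdbool.h>

/* Wrapper with a single int member, so it can stand in for a bare int ID. */
struct cpu_aggr_id {
	int id;
};

/* Hypothetical helper replacing 'a == b' on the old int IDs. */
static bool cpu_aggr_id_equal(struct cpu_aggr_id a, struct cpu_aggr_id b)
{
	return a.id == b.id;
}

/* Hypothetical constructor replacing '= -1', meaning "no ID assigned". */
static struct cpu_aggr_id cpu_aggr_id_empty(void)
{
	struct cpu_aggr_id ret = { .id = -1 };

	return ret;
}

/* Convenience check built from the two helpers above. */
static bool cpu_aggr_id_is_empty(struct cpu_aggr_id a)
{
	return cpu_aggr_id_equal(a, cpu_aggr_id_empty());
}
```

Routing comparisons and initialisation through helpers like these means that callers would not need to change again when more members are added to the struct in later commits.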
Use the existing allocator for perf_cpu_map to avoid use
of raw malloc. This could cause an issue in later commits
where the size of perf_cpu_map is changed.
No functional changes.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc
Add node as a separate member so that it doesn't have to be
packed into the int value.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Richter
Cc: John
Add die as a separate member so that it doesn't have to be
packed into the int value.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Richter
Cc: John
Replace all occurrences of the usage of int with the new struct
cpu_aggr_id.
No functional changes.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Richter
Cc
On 15/11/2020 23:17, Jiri Olsa wrote:
> On Fri, Nov 13, 2020 at 07:26:48PM +0200, James Clark wrote:
>> These cpu_aggr_map refcounting functions are only used in
>> builtin-stat.c so their visibility can be reduced to just
>> that file.
>>
>> No functional ch
On 15/11/2020 23:17, Jiri Olsa wrote:
> On Fri, Nov 13, 2020 at 07:26:54PM +0200, James Clark wrote:
>> A separate field isn't strictly required. The core
>> field could be re-used for thread IDs as a single
>> field was used previously.
>>
>> But separati
x40x006c
> x50x00100101
>... thread: ls:51956
> .. dso: /usr/lib64/ld-2.17.so
>
Checked that the registers can be listed with =? and that recording different
combinations of registers works as expected.
Tested-by: James Clark
small performance overhead when enabling
PID_IN_CONTEXTIDR, but SPE itself is optional and not enabled by
default so the impact is minimised.
Cc: Will Deacon
Cc: Mark Rutland
Cc: Al Grant
Cc: Leo Yan
Cc: John Garry
Cc: Suzuki K Poulose
Signed-off-by: James Clark
---
drivers/perf/Kconfig | 1 +
1
On 18/11/2020 13:21, Namhyung Kim wrote:
> Hello,
>
> On Tue, Nov 17, 2020 at 11:49 PM James Clark wrote:
>>
>> Improve the topology test to check all aggregation
>> types. This is to lock down the behaviour before
>> 'id' is changed into a struct i
Improve the topology test to check all aggregation
types. This is to lock down the behaviour before
'id' is changed into a struct in later commits.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Replace all occurrences of the usage of int with the new struct
cpu_aggr_id.
No functional changes.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Richter
Cc
Currently this is a duplicate of perf_cpu_map so that
it can be used as a drop in replacement.
In a later commit it will be changed from a map of ints
to use the new cpu_aggr_id struct.
No functional changes.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho
Changes since v5:
* Fix test for cpu_map__get_die() by shifting id before testing.
* Fix test for cpu_map__get_socket() by not using cpu_map__id_to_socket()
which is only valid in CPU aggregation mode.
James Clark (12):
perf tools: Improve topology test
perf tools: Use allocator for
Use the existing allocator for perf_cpu_map to avoid use
of raw malloc. This could cause an issue in later commits
where the size of perf_cpu_map is changed.
No functional changes.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc
This struct currently has only a single int member so that
it can be used as a drop in replacement for the existing
behaviour.
Comparison and constructor functions have also been added
that will replace usages of '==' and '= -1'.
No functional changes.
Signed-off-by: Ja
Use the new cpu_aggr_id struct in the cpu map
instead of int so that it can store more data.
No functional changes.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas
Add node as a separate member so that it doesn't have to be
packed into the int value.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Richter
Cc: John
ch is now
no longer used.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Richter
Cc: John Garry
---
tools/perf/tests/topology.c| 8
tools/perf
Add core as a separate member so that it doesn't have to be
packed into the int value.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Richter
Cc: John
Replace usages of perf_cpu_map with cpu_aggr map in
places that are involved with perf stat aggregation.
This will then later be changed to be a map of
cpu_aggr_id rather than an int so that more data can
be stored.
No functional changes.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo
Add die as a separate member so that it doesn't have to be
packed into the int value.
Signed-off-by: James Clark
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Richter
Cc: John
socket ID:
./perf stat --per-die -a
Performance counter stats for 'system wide':
S36-D0 128 169,869.39 msec cpu-clock #
127.501 CPUs utilized
...
S3612-D0 128 169,733.05 msec cpu-clock #
127.398 CPUs ut
> Thanks,
> Mathieu
>
> On Fri, Feb 12, 2021 at 04:45:06PM +0200, James Clark wrote:
>> Hi All,
>>
>> Since my previous RFC, I've fixed --per-thread mode and solved
>> most of the open questions. I've also changed --dump-raw-trace
>> to use the
On 20/02/2021 13:50, Leo Yan wrote:
> On Fri, Feb 12, 2021 at 04:45:08PM +0200, James Clark wrote:
>> Change initial timestamp search to only operate on the queue
>> related to the current event. In a later change the bounds
>> of the aux record will also be used to reset t
On 27/02/2021 09:10, Leo Yan wrote:
> On Fri, Feb 12, 2021 at 04:45:09PM +0200, James Clark wrote:
>> The aux records will be used to set the bounds of decoding in a
>> later commit. In the future we may also want to use the flags
>> of each record to control decoding.
>
On 24/01/2021 02:05, Jiri Olsa wrote:
> On Fri, Jan 22, 2021 at 04:18:54PM +, Alexandre Truong wrote:
>> On arm64 and frame pointer mode (e.g: perf record --callgraph fp),
>> use dwarf unwind info to check if the link register is the return
>> address in order to inject it to the frame point
small performance overhead when enabling
PID_IN_CONTEXTIDR, but SPE itself is optional and not enabled by
default so the impact is minimised.
Cc: Will Deacon
Cc: Mark Rutland
Cc: Al Grant
Cc: Leo Yan
Cc: John Garry
Cc: Suzuki K Poulose
Cc: Mathieu Poirier
Cc: Catalin Marinas
Signed-off-by: James
On 02/12/2020 01:09, Will Deacon wrote:
> On Tue, Dec 01, 2020 at 12:10:40PM +0800, Leo Yan wrote:
>> On Mon, Nov 30, 2020 at 04:46:51PM +, Will Deacon wrote:
>>> On Mon, Nov 30, 2020 at 06:24:54PM +0200, James Clark wrote:
>>>> Enable PID_IN_CONTEXTIDR by d
Hi,
I also had a look at this and had a question about the --spe option.
It seems that whatever options I give it, the output is the same:
perf report
And
perf report --spe=t
Both give the same result:
# Samples: 4 of event 'llc-miss'
# Event count (approx.): 4
running it.
Signed-off-by: James Clark
---
tools/perf/tests/shell/record+zstd_comp_decomp.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/tests/shell/record+zstd_comp_decomp.sh
b/tools/perf/tests/shell/record+zstd_comp_decomp.sh
index 899604d1..63a91ec 100755
--
Resubmitting due to the previous patch having a disclaimer appended.
I've tested that this applies cleanly with git am.
James Clark (1):
perf tools: Add PMU event JSON files for ARM Cortex-A76 and, Neoverse
N1.
.../arch/arm64/arm/cortex-a76-n1/branch.json | 14 ++
.../arch/arm6
:
https://static.docs.arm.com/100798/0400/cortex_a76_trm_100798_0400_00_en.pdf
Signed-off-by: James Clark
Cc: Jeremy Linton
Cc: Suzuki Poulose
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Sorry about that, I will look into it.
Thanks
James
On 22/08/2019 22:24, Arnaldo Carvalho de Melo wrote:
> Em Thu, Aug 22, 2019 at 06:24:07PM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Thu, Aug 22, 2019 at 01:55:15PM +0000, James Clark escreveu:
>>> Running 'perf test
Hi Xiaojun,
> By the way, you mentioned before that you want the spe event to be in the
> form of "event:pp" like pebs. Is that the whole framework should be made
> similar to pebs? Or is it just a modification to the command format?
We're currently still investigating if it makes sense to mod
Hi Xiaojun,
I wanted to ask if you are still working on this?
I've noticed that it doesn't apply cleanly to perf/core anymore and I was
working on re-basing it.
Would you be interested in me posting my progress?
I was also interested in decoding the "data source" of events and displaying
that
Hi Xiaojun,
>>
>> What do you mean when the user specifies "event:pp", if the SPE is
>> available, configure and record the spe data directly via the perf event
>> open syscall?
>> (perf.data itself is the same as using -e arm_spe_0//xxx?)
>
> I mean, for the perf record, if the user does not a
Hi,
I'm a developer at ARM and I'm new to upstreaming here. I'd like to submit this
patch for
event counters for two new ARM CPUs.
At some point in the future I will also continue work around arm-spe.c to
improve SPE support.
Thanks
James
IMPORTANT NOTICE: The contents of this email and any a
:
https://static.docs.arm.com/100798/0400/cortex_a76_trm_100798_0400_00_en.pdf
Signed-off-by: James Clark
---
.../arch/arm64/arm/cortex-a76-n1/branch.json | 14 ++
.../arch/arm64/arm/cortex-a76-n1/bus.json | 24 +++
.../arch/arm64/arm/cortex-a76-n1/cache.json| 207