Re: [V2 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section

2023-09-14 Thread Disha Goel

On 07/09/23 10:15 pm, Athira Rajeev wrote:


The testcase "Object code reading" fails in somecases
for "fs_something" sub test as below:

 Reading object code for memory address: 0xc00807f0142c
 File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
 On file address is: 0x1114cc
 Objdump command is: objdump -z -d --start-address=0x11142c 
--stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
 objdump read too few bytes: 128
 test child finished with -1

This can alo be reproduced when running perf record with
workload that exercises fs_something() code. In the test
setup, this is exercising xfs code since root is xfs.

 # perf record ./a.out
 # perf report -v |grep "xfs.ko"
   0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807de5efc B [k] xlog_cil_commit
   0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807d5ae18 B [k] xfs_btree_key_offset
   0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807e11fd4 B [k] 0x00112074

Here addr "0xc00807e11fd4" is not resolved. since this is a
kernel module, its offset is from the DSO. Xfs module is loaded
at 0xc00807d0

# cat /proc/modules | grep xfs
 xfs 2228224 3 - Live 0xc00807d0

And size is 0x22. So its loaded between  0xc00807d0
and 0xc00807f2. From objdump, text section is:
 text 0010f7bc    00a0 2**4

Hence perf captured ip maps to 0x112074 which is:
( ip - start of module ) + a0

This offset 0x112074 falls out .text section which is up to 0x10f7bc
In this case for module, the address 0xc00807e11fd4 is pointing
to stub instructions. This address range represents the module stubs
which is allocated on module load and hence is not part of DSO offset.

To address this issue in "object code reading", skip the sample if
address falls out of text section and is within the module end.
Use the "text_end" member of "struct dso" to do this check.

To address this issue in "perf report", exploring an option of
having stubs range as part of the /proc/kallsyms, so that perf
report can resolve addresses in stubs range

However this patch uses text_end to skip the stub range for
Object code reading testcase.

Reported-by: Disha Goel
Signed-off-by: Athira Rajeev
---
Changelog:
  v1 -> v2:
  Updated comment to add description on which arch has stub and
  reason for skipping as suggested by Adrian


With this patch applied perf Object code reading test works correctly.

 26: Object code reading : Ok

Tested-by: Disha Goel


  tools/perf/tests/code-reading.c | 12 
  1 file changed, 12 insertions(+)

diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index ed3815163d1b..3cf6c2d42416 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -269,6 +269,18 @@ static int read_object_code(u64 addr, size_t len, u8 
cpumode,
if (addr + len > map__end(al.map))
len = map__end(al.map) - addr;
  
+	/*

+* Some architectures (ex: powerpc) have stubs (trampolines) in kernel
+* modules to manage long jumps. Check if the ip offset falls in stubs
+* sections for kernel modules. And skip module address after text end
+*/
+   if (strstr(dso->long_name, ".ko")) {
+   if (al.addr > dso->text_end) {
+   pr_debug("skipping the module address %#"PRIx64" after text 
end\n", al.addr);
+   goto out;
+   }
+   }
+
/* Read the object code using perf */
ret_len = dso__data_read_offset(dso, 
maps__machine(thread__maps(thread)),
al.addr, buf1, len);

Re: [PATCH] tools/perf/arch/powerpc: Fix the CPU ID const char* value by adding 0x prefix

2023-10-10 Thread Disha Goel

On 09/10/23 10:30 am, Athira Rajeev wrote:


Simple expression parser test fails in powerpc as below:

 4: Simple expression parser
 test child forked, pid 170385
 Using CPUID 004e2102
 division by zero
 syntax error
 syntax error
 FAILED tests/expr.c:65 parse test failed
 test child finished with -1
 Simple expression parser: FAILED!

This is observed after commit:
'commit 9d5da30e4ae9 ("perf jevents: Add a new expression builtin 
strcmp_cpuid_str()")'

With this commit, a new expression builtin strcmp_cpuid_str
got added. This function takes an 'ID' type value, which is
a string. So expression parse for strcmp_cpuid_str expects
const char * as cpuid value type. In case of powerpc, CPU IDs
are numbers. Hence it doesn't get interpreted correctly by
bison parser. Example in case of power9, cpuid string returns
as: 004e2102

cpuid of string type is expected in two cases:
1. char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused);

Testcase "tests/expr.c" uses "perf_pmu__getcpuid" which calls
get_cpuid_str to get the cpuid string.

2. cpuid field in  :struct pmu_events_map

struct pmu_events_map {
const char *arch;
   const char *cpuid;

Here cpuid field is used in "perf_pmu__find_events_table"
function as "strcmp_cpuid_str(map->cpuid, cpuid)". The
value for cpuid field is picked from mapfile.csv.

Fix the mapfile.csv and get_cpuid_str function to prefix
cpuid with 0x so that it gets correctly interpreted by
the bison parser

Signed-off-by: Athira Rajeev


Tested the patch on Power10 machine, now its correctly able to interpret cpuid
value and perf Simple expression parser test passed.

# ./perf test -v 7
  7: Simple expression parser:
--- start ---
test child forked, pid 87922
Using CPUID 0x00800200
division by zero
syntax error
test child finished with 0
 end 
Simple expression parser: Ok

Tested-by: Disha Goel


---
  tools/perf/arch/powerpc/util/header.c  | 2 +-
  tools/perf/pmu-events/arch/powerpc/mapfile.csv | 6 +++---
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c 
b/tools/perf/arch/powerpc/util/header.c
index c8d0dc775e5d..6b00efd53638 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -34,7 +34,7 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
  {
char *bufp;

-   if (asprintf(&bufp, "%.8lx", mfspr(SPRN_PVR)) < 0)
+   if (asprintf(&bufp, "0x%.8lx", mfspr(SPRN_PVR)) < 0)
bufp = NULL;

return bufp;
diff --git a/tools/perf/pmu-events/arch/powerpc/mapfile.csv 
b/tools/perf/pmu-events/arch/powerpc/mapfile.csv
index a534ff6db14b..f4908af7ad66 100644
--- a/tools/perf/pmu-events/arch/powerpc/mapfile.csv
+++ b/tools/perf/pmu-events/arch/powerpc/mapfile.csv
@@ -13,6 +13,6 @@
  #

  # Power8 entries
-004[bcd][[:xdigit:]]{4},1,power8,core
-004e[[:xdigit:]]{4},1,power9,core
-0080[[:xdigit:]]{4},1,power10,core
+0x004[bcd][[:xdigit:]]{4},1,power8,core
+0x004e[[:xdigit:]]{4},1,power9,core
+0x0080[[:xdigit:]]{4},1,power10,core

Re: [PATCH V2 2/2] tools/perf/tests: perf all metricgroups test fails when perf_event access is restricted

2023-08-07 Thread Disha Goel

On 04/08/23 10:30 am, Athira Rajeev wrote:

Perf all metricgroups test fails as below when perf_event access
is restricted.

 ./perf test -v "perf all metricgroups test"
 Testing Memory_BW
 Error:
 Access to performance monitoring and observability operations is limited.
 Enforced MAC policy settings (SELinux) can limit access to performance
 access to performance monitoring and observability operations for processes
 without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.

 test child finished with -1
  end 
 perf all metricgroups test: FAILED!

Fix the testcase to skip those metric events which needs perf_event access
explicitly. The exit code of the testcase is based on return code of
the perf stat command ( enabled by set -e option ). Hence save the
exit status in a variable and use that to decide success or fail for the
testcase.

Signed-off-by: Athira Rajeev


With this patch applied(on power) perf metricgroups test works correctly when 
perf_event access is restricted.

 # ./perf test "perf all metricgroups test"
 96: perf all metricgroups test      : Ok

Tested-by: Disha Goel


---
Changelog:
v1 -> v2:
  Changed the condition to use "echo" and "grep" so it works on
  Posix shell as well.

  tools/perf/tests/shell/stat_all_metricgroups.sh | 14 +++---
  1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/tools/perf/tests/shell/stat_all_metricgroups.sh 
b/tools/perf/tests/shell/stat_all_metricgroups.sh
index cb35e488809a..eaa5e1172294 100755
--- a/tools/perf/tests/shell/stat_all_metricgroups.sh
+++ b/tools/perf/tests/shell/stat_all_metricgroups.sh
@@ -2,11 +2,19 @@
  # perf all metricgroups test
  # SPDX-License-Identifier: GPL-2.0

-set -e
-
  for m in $(perf list --raw-dump metricgroups); do
echo "Testing $m"
-  perf stat -M "$m" -a true
+  result=$(perf stat -M "$m" -a true 2>&1)
+  rc=$?
+  # Skip if there is no access to perf_events monitoring
+  # Otherwise exit based on the return code of perf comamnd.
+  if echo "$result" | grep -q "Access to performance monitoring and observability 
operations is limited";
+  then
+  continue
+  else
+  [ $rc -ne 0 ] && exit $rc
+  fi
+
  done

  exit 0

Re: [PATCH] perf test: Skip perf bench breakpoint run if no breakpoints available

2023-08-23 Thread Disha Goel

On 23/08/23 1:21 pm, Kajol Jain wrote:


Based on commit 7d54a4acd8c1 ("perf test: Skip watchpoint
tests if no watchpoints available"), hardware breakpoints
are not available for power9 platform and because of that
perf bench breakpoint run fails on power9 platform.
Add code to check for the return value of perf_event_open()
in breakpoint run and skip the perf bench breakpoint run,
if hardware breakpoints are not available.

Result on power9 system before patch changes:
[command]# perf bench breakpoint thread
perf_event_open: No such device

Result on power9 system after patch changes:
[command]# ./perf bench breakpoint thread
Skipping perf bench breakpoint thread: No hardware support

Reported-by: Disha Goel
Signed-off-by: Kajol Jain


With this patch applied perf bench breakpoint test works correctly on Power9 
and Power10.

Result on Power9 system with patch changes:
[root@]# ./perf bench breakpoint all
# Running breakpoint/thread benchmark...
Skipping perf bench breakpoint thread: No hardware support

# Running breakpoint/enable benchmark...
Skipping perf bench breakpoint enable: No hardware support

Result on Power10 system with patch changes:
[root@]# ./perf bench breakpoint all
# Running breakpoint/thread benchmark...
# Created/joined 10 threads with 1 breakpoints and 1 parallelism
 Total time: 0.001 [sec]

 115.10 usecs/op
 115.10 usecs/op/cpu

# Running breakpoint/enable benchmark...
# Enabled/disabled breakpoint 10 time with 0 passive and 0 active threads
 Total time: 0.000 [sec]

   2.80 usecs/op

Tested-by: Disha Goel


---
  tools/perf/bench/breakpoint.c | 24 +---
  1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/tools/perf/bench/breakpoint.c b/tools/perf/bench/breakpoint.c
index 41385f89ffc7..dfd18f5db97d 100644
--- a/tools/perf/bench/breakpoint.c
+++ b/tools/perf/bench/breakpoint.c
@@ -47,6 +47,7 @@ struct breakpoint {
  static int breakpoint_setup(void *addr)
  {
struct perf_event_attr attr = { .size = 0, };
+   int fd;
  
  	attr.type = PERF_TYPE_BREAKPOINT;

attr.size = sizeof(attr);
@@ -56,7 +57,12 @@ static int breakpoint_setup(void *addr)
attr.bp_addr = (unsigned long)addr;
attr.bp_type = HW_BREAKPOINT_RW;
attr.bp_len = HW_BREAKPOINT_LEN_1;
-   return syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
+   fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
+
+   if (fd < 0)
+   fd = -errno;
+
+   return fd;
  }
  
  static void *passive_thread(void *arg)

@@ -122,8 +128,14 @@ int bench_breakpoint_thread(int argc, const char **argv)
  
  	for (i = 0; i < thread_params.nbreakpoints; i++) {

breakpoints[i].fd = breakpoint_setup(&breakpoints[i].watched);
-   if (breakpoints[i].fd == -1)
+
+   if (breakpoints[i].fd < 0) {
+   if (breakpoints[i].fd == -ENODEV) {
+   printf("Skipping perf bench breakpoint thread: No 
hardware support\n");
+   return 0;
+   }
exit((perror("perf_event_open"), EXIT_FAILURE));
+   }
}
gettimeofday(&start, NULL);
for (i = 0; i < thread_params.nparallel; i++) {
@@ -196,8 +208,14 @@ int bench_breakpoint_enable(int argc, const char **argv)
exit(EXIT_FAILURE);
}
fd = breakpoint_setup(&watched);
-   if (fd == -1)
+
+   if (fd < 0) {
+   if (fd == -ENODEV) {
+   printf("Skipping perf bench breakpoint enable: No hardware 
support\n");
+   return 0;
+   }
exit((perror("perf_event_open"), EXIT_FAILURE));
+   }
nthreads = enable_params.npassive + enable_params.nactive;
threads = calloc(nthreads, sizeof(threads[0]));
if (!threads)

Re: [PATCH] powerpc/perf/hv-24x7: Update domain value check

2023-08-25 Thread Disha Goel

On 25/08/23 11:26 am, Kajol Jain wrote:


Valid domain value is in range 1 to HV_PERF_DOMAIN_MAX.
Current code has check for domain value greater than or
equal to HV_PERF_DOMAIN_MAX. But the check for domain value 0
is missing.
Fix this issue by adding check for domain value 0.

Fixes: ebd4a5a3ebd9 ("powerpc/perf/hv-24x7: Minor improvements")
Reported-by: Krishan Gopal Sarawast  
Signed-off-by: Kajol Jain


Tested the patch on power, with domain=0 not getting hcall failure after patch 
changes.

Before patch changes:
# ./perf stat -v -e hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ sleep 1
Using CPUID 00800200
Control descriptor is not initialized
Error:
The sys_perf_event_open() syscall returned with 5 (Input/output error) for
event (hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/).
/bin/dmesg | grep -i perf may provide additional information.

Result from dmesg:
[   37.819387] hv-24x7: hcall failed: [0 0x6004 0x100 0] => ret
0xfffc (-4) detail=0x200 failing ix=0

After patch changes:
# ./perf stat -v -e hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ sleep 1
Using CPUID 00800200
Control descriptor is not initialized
Warning:
hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ event is not supported by the kernel.
failed to read counter hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/

 Performance counter stats for 'system wide':

 hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/

   1.001352771 seconds time elapsed

Tested-by: Disha Goel


---
  arch/powerpc/perf/hv-24x7.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 317175791d23..644881cc1c00 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1418,7 +1418,7 @@ static int h_24x7_event_init(struct perf_event *event)
}
  
  	domain = event_get_domain(event);

-   if (domain >= HV_PERF_DOMAIN_MAX) {
+   if (domain  == 0 || domain >= HV_PERF_DOMAIN_MAX) {
pr_devel("invalid domain %d\n", domain);
return -EINVAL;
}

[PATCH 0/4] Remove unused macros from asm-offset

2022-09-13 Thread Disha Goel
There were commits which did code refactoring and converting some of kvm
assembly routines to C. When doing it, many of the asm-offset macro
definitions were missed to remove. Patchset here removes those.

Patch1 removes usage of KVM_TLB_SETS macro from the asm-offset

Patch2 removes KVM_RADIX macro from the asm-offset.c

Patch3 removes a set of unused kvm vcpu and hstate macros from the
asm-offset.c

Patch4 removes unused HSTATE/host_mmcr references for MMCR3/SIER2/SIER3

Link to the script used to get unused macro:
https://github.com/maddy-kerneldev/scripts/blob/master/check_asm-offset.sh

Link to linux-ci job result:
https://github.com/disgoel/linux-ci/actions

Disha Goel (3):
  powerpc/asm-offset: Remove unused KVM_TLB_SETS macros
  powerpc/asm-offset: Remove unused KVM_RADIX macros
  powerpc/kvm: Remove unused macros from asm-offset

Kajol Jain (1):
  powerpc/kvm: Remove unused references for MMCR3/SIER2/SIER3 registers

 arch/powerpc/include/asm/kvm_book3s_asm.h |  2 +-
 arch/powerpc/kernel/asm-offsets.c | 25 ---
 2 files changed, 1 insertion(+), 26 deletions(-)

-- 
2.31.1



[PATCH 1/4] powerpc/asm-offset: Remove unused KVM_TLB_SETS macros

2022-09-13 Thread Disha Goel
Commit 2e1ae9cd56f8 ("KVM: PPC: Book3S HV: Implement radix
prefetch workaround by disabling MMU") removed usage of
KVM_TLB_SETS macro as part of fixing radix prefetch workaround.
But the patch missed to remove the asm-offset macro definition
which was used in the assembly files and now this is un-used.
Patch fixes by removing this.

Fixes: 2e1ae9cd56f8 ("KVM: PPC: Book3S HV: Implement radix prefetch workaround 
by disabling MMU")
Signed-off-by: Disha Goel 
---
 arch/powerpc/kernel/asm-offsets.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 8c10f536e478..d5d9c1dcef1c 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -418,7 +418,6 @@ int main(void)
 
/* book3s */
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
-   OFFSET(KVM_TLB_SETS, kvm, arch.tlb_sets);
OFFSET(KVM_SDR1, kvm, arch.sdr1);
OFFSET(KVM_HOST_LPID, kvm, arch.host_lpid);
OFFSET(KVM_HOST_LPCR, kvm, arch.host_lpcr);
-- 
2.31.1



[PATCH 2/4] powerpc/asm-offset: Remove unused KVM_RADIX macros

2022-09-13 Thread Disha Goel
Commit 9769a7fd79b6 ("KVM: PPC: Book3S HV: Remove radix
guest support from P7/8 path") removed the radix guest
support from P7/8 path since the P9 path now supports of
all the supported radix guest. But missed to remove
KVM_RADIX from the asm-offset. Patch fixes by removing it.

Fixes: 9769a7fd79b6 ("KVM: PPC: Book3S HV: Remove radix guest support from P7/8 
path")
Signed-off-by: Disha Goel 
---
 arch/powerpc/kernel/asm-offsets.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index d5d9c1dcef1c..88a329d74d23 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -424,7 +424,6 @@ int main(void)
OFFSET(KVM_HOST_SDR1, kvm, arch.host_sdr1);
OFFSET(KVM_ENABLED_HCALLS, kvm, arch.enabled_hcalls);
OFFSET(KVM_VRMA_SLB_V, kvm, arch.vrma_slb_v);
-   OFFSET(KVM_RADIX, kvm, arch.radix);
OFFSET(KVM_SECURE_GUEST, kvm, arch.secure_guest);
OFFSET(VCPU_DSISR, kvm_vcpu, arch.shregs.dsisr);
OFFSET(VCPU_DAR, kvm_vcpu, arch.shregs.dar);
-- 
2.31.1



[PATCH 3/4] powerpc/kvm: Remove unused macros from asm-offset

2022-09-13 Thread Disha Goel
Commit fae5c9f3664b ("KVM: PPC: Book3S HV: remove ISA v3.0 and v3.1
support from P7/8 path") moved code path for the kvm guest entry/exit
for p7/8 from aseembly to C. But the patch missed to remove the
asm-offset macro definitions which were used in the assembly files
and now they are un-used. Patch fixes by removing those.

Fixes: fae5c9f3664b ("KVM: PPC: Book3S HV: remove ISA v3.0 and v3.1 support 
from P7/8 path")
Signed-off-by: Disha Goel 
---
 arch/powerpc/kernel/asm-offsets.c | 20 
 1 file changed, 20 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 88a329d74d23..90235288cc1a 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -430,7 +430,6 @@ int main(void)
OFFSET(VCPU_VPA, kvm_vcpu, arch.vpa.pinned_addr);
OFFSET(VCPU_VPA_DIRTY, kvm_vcpu, arch.vpa.dirty);
OFFSET(VCPU_HEIR, kvm_vcpu, arch.emul_inst);
-   OFFSET(VCPU_NESTED, kvm_vcpu, arch.nested);
OFFSET(VCPU_CPU, kvm_vcpu, cpu);
OFFSET(VCPU_THREAD_CPU, kvm_vcpu, arch.thread_cpu);
 #endif
@@ -447,16 +446,12 @@ int main(void)
OFFSET(VCPU_DABRX, kvm_vcpu, arch.dabrx);
OFFSET(VCPU_DAWR0, kvm_vcpu, arch.dawr0);
OFFSET(VCPU_DAWRX0, kvm_vcpu, arch.dawrx0);
-   OFFSET(VCPU_DAWR1, kvm_vcpu, arch.dawr1);
-   OFFSET(VCPU_DAWRX1, kvm_vcpu, arch.dawrx1);
OFFSET(VCPU_CIABR, kvm_vcpu, arch.ciabr);
OFFSET(VCPU_HFLAGS, kvm_vcpu, arch.hflags);
OFFSET(VCPU_DEC_EXPIRES, kvm_vcpu, arch.dec_expires);
OFFSET(VCPU_PENDING_EXC, kvm_vcpu, arch.pending_exceptions);
OFFSET(VCPU_CEDED, kvm_vcpu, arch.ceded);
OFFSET(VCPU_PRODDED, kvm_vcpu, arch.prodded);
-   OFFSET(VCPU_IRQ_PENDING, kvm_vcpu, arch.irq_pending);
-   OFFSET(VCPU_DBELL_REQ, kvm_vcpu, arch.doorbell_request);
OFFSET(VCPU_MMCR, kvm_vcpu, arch.mmcr);
OFFSET(VCPU_MMCRA, kvm_vcpu, arch.mmcra);
OFFSET(VCPU_MMCRS, kvm_vcpu, arch.mmcrs);
@@ -484,8 +479,6 @@ int main(void)
OFFSET(VCPU_TCSCR, kvm_vcpu, arch.tcscr);
OFFSET(VCPU_ACOP, kvm_vcpu, arch.acop);
OFFSET(VCPU_WORT, kvm_vcpu, arch.wort);
-   OFFSET(VCPU_TID, kvm_vcpu, arch.tid);
-   OFFSET(VCPU_PSSCR, kvm_vcpu, arch.psscr);
OFFSET(VCPU_HFSCR, kvm_vcpu, arch.hfscr);
OFFSET(VCORE_ENTRY_EXIT, kvmppc_vcore, entry_exit_map);
OFFSET(VCORE_IN_GUEST, kvmppc_vcore, in_guest);
@@ -580,8 +573,6 @@ int main(void)
HSTATE_FIELD(HSTATE_HWTHREAD_STATE, hwthread_state);
HSTATE_FIELD(HSTATE_KVM_VCPU, kvm_vcpu);
HSTATE_FIELD(HSTATE_KVM_VCORE, kvm_vcore);
-   HSTATE_FIELD(HSTATE_XIVE_TIMA_PHYS, xive_tima_phys);
-   HSTATE_FIELD(HSTATE_XIVE_TIMA_VIRT, xive_tima_virt);
HSTATE_FIELD(HSTATE_HOST_IPI, host_ipi);
HSTATE_FIELD(HSTATE_PTID, ptid);
HSTATE_FIELD(HSTATE_FAKE_SUSPEND, fake_suspend);
@@ -670,17 +661,6 @@ int main(void)
OFFSET(VCPU_HOST_MAS6, kvm_vcpu, arch.host_mas6);
 #endif
 
-#ifdef CONFIG_KVM_XICS
-   DEFINE(VCPU_XIVE_SAVED_STATE, offsetof(struct kvm_vcpu,
-  arch.xive_saved_state));
-   DEFINE(VCPU_XIVE_CAM_WORD, offsetof(struct kvm_vcpu,
-   arch.xive_cam_word));
-   DEFINE(VCPU_XIVE_PUSHED, offsetof(struct kvm_vcpu, arch.xive_pushed));
-   DEFINE(VCPU_XIVE_ESC_ON, offsetof(struct kvm_vcpu, arch.xive_esc_on));
-   DEFINE(VCPU_XIVE_ESC_RADDR, offsetof(struct kvm_vcpu, 
arch.xive_esc_raddr));
-   DEFINE(VCPU_XIVE_ESC_VADDR, offsetof(struct kvm_vcpu, 
arch.xive_esc_vaddr));
-#endif
-
 #ifdef CONFIG_KVM_EXIT_TIMING
OFFSET(VCPU_TIMING_EXIT_TBU, kvm_vcpu, arch.timing_exit.tv32.tbu);
OFFSET(VCPU_TIMING_EXIT_TBL, kvm_vcpu, arch.timing_exit.tv32.tbl);
-- 
2.31.1



[PATCH 4/4] powerpc/kvm: Remove unused references for MMCR3/SIER2/SIER3 registers

2022-09-13 Thread Disha Goel
From: Kajol Jain 

Commit 57dc0eed73ca ("KVM: PPC: Book3S HV P9: Implement PMU save/restore
in C") removed the PMU save/restore functions from assembly code and
implemented these functions in C, for power9 and later platforms.

After the code refactoring, Performance Monitoring Unit (PMU) registers
became part of "p9_host_os_sprs" structure, and now this structure is
used to save/restore pmu host registers for power9 and later platfroms.
But we still have old unused registers references, patch removes unused
HSTATE/host_mmcr references for Monitor Mode Control Register 3 (MMCR3)/
Sampled Instruction Event Register 2 (SIER2)/ SIER3 registers from
"struct kvmppc_host_state".

Fixes: 57dc0eed73ca ("KVM: PPC: Book3S HV P9: Implement PMU save/restore in C")
Signed-off-by: Kajol Jain 
---
 arch/powerpc/include/asm/kvm_book3s_asm.h | 2 +-
 arch/powerpc/kernel/asm-offsets.c | 3 ---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h 
b/arch/powerpc/include/asm/kvm_book3s_asm.h
index c8882d9b86c2..a36797938620 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -105,7 +105,7 @@ struct kvmppc_host_state {
void __iomem *xive_tima_virt;
u32 saved_xirr;
u64 dabr;
-   u64 host_mmcr[10];  /* MMCR 0,1,A, SIAR, SDAR, MMCR2, SIER, MMCR3, 
SIER2/3 */
+   u64 host_mmcr[7];   /* MMCR 0,1,A, SIAR, SDAR, MMCR2, SIER */
u32 host_pmc[8];
u64 host_purr;
u64 host_spurr;
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 90235288cc1a..95744e7c056d 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -583,9 +583,6 @@ int main(void)
HSTATE_FIELD(HSTATE_SDAR, host_mmcr[4]);
HSTATE_FIELD(HSTATE_MMCR2, host_mmcr[5]);
HSTATE_FIELD(HSTATE_SIER, host_mmcr[6]);
-   HSTATE_FIELD(HSTATE_MMCR3, host_mmcr[7]);
-   HSTATE_FIELD(HSTATE_SIER2, host_mmcr[8]);
-   HSTATE_FIELD(HSTATE_SIER3, host_mmcr[9]);
HSTATE_FIELD(HSTATE_PMC1, host_pmc[0]);
HSTATE_FIELD(HSTATE_PMC2, host_pmc[1]);
HSTATE_FIELD(HSTATE_PMC3, host_pmc[2]);
-- 
2.31.1



Re: [PATCH 0/4] Remove unused macros from asm-offset

2022-09-14 Thread Disha Goel



On 9/13/22 1:45 PM, Christophe Leroy wrote:


Le 13/09/2022 à 09:40, Disha Goel a écrit :

There were commits which did code refactoring and converting some of kvm
assembly routines to C. When doing it, many of the asm-offset macro
definitions were missed to remove. Patchset here removes those.

Patch1 removes usage of KVM_TLB_SETS macro from the asm-offset

Patch2 removes KVM_RADIX macro from the asm-offset.c

Patch3 removes a set of unused kvm vcpu and hstate macros from the
asm-offset.c

Patch4 removes unused HSTATE/host_mmcr references for MMCR3/SIER2/SIER3

I think you can squash all changes to asm-offsets.c into a single patch.
The Fixes: tags are unrelevant, you are not fixing a real bug, it's just
a cleanup.

Then have a second patch that reduces the size of host_mmcr[] in
kvmppc_host_state struct.


Hi Christophe,

Thanks for reviewing the patchset. I will send v2 patchset with 
mentioned changes.



Thanks

Disha


Thanks
Christophe


Link to the script used to get unused macro:
https://github.com/maddy-kerneldev/scripts/blob/master/check_asm-offset.sh

Link to linux-ci job result:
https://github.com/disgoel/linux-ci/actions

Disha Goel (3):
powerpc/asm-offset: Remove unused KVM_TLB_SETS macros
powerpc/asm-offset: Remove unused KVM_RADIX macros
powerpc/kvm: Remove unused macros from asm-offset

Kajol Jain (1):
powerpc/kvm: Remove unused references for MMCR3/SIER2/SIER3 registers

   arch/powerpc/include/asm/kvm_book3s_asm.h |  2 +-
   arch/powerpc/kernel/asm-offsets.c | 25 ---
   2 files changed, 1 insertion(+), 26 deletions(-)


[PATCH v2 0/2] Remove unused macros from asm-offset

2022-09-16 Thread Disha Goel
The kvm code was refactored to convert some of kvm assembly routines to C.
This includes commits which moved code path for the kvm guest entry/exit
for p7/8 from aseembly to C. As part of the code changes, usage of some of
the macros were removed. But definitions still exist in the assembly files.
Commits are listed below:

Commit 2e1ae9cd56f8 ("KVM: PPC: Book3S HV: Implement radix prefetch workaround 
by disabling MMU")
Commit 9769a7fd79b6 ("KVM: PPC: Book3S HV: Remove radix guest support from P7/8 
path")
Commit fae5c9f3664b ("KVM: PPC: Book3S HV: remove ISA v3.0 and v3.1 support 
from P7/8 path")
Commit 57dc0eed73ca ("KVM: PPC: Book3S HV P9: Implement PMU save/restore in C")

Many of the asm-offset macro definitions were missed to remove. This patchset
fixes by removing the unused macro definitions.

Patch1 removes a set of unused kvm vcpu and hstate macros from the
asm-offset.c

Patch2 removes unused host_mmcr references for MMCR3/SIER2/SIER3

Changelog:
v1 -> v2:
Merge all the macro removal part from asm-offset.c file into a single patch
as suggested by Christophe Leroy.

Link to patchset v1:
http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20220913074025.132160-2-disg...@linux.vnet.ibm.com/

Link to the script used to get unused macro:
https://github.com/maddy-kerneldev/scripts/blob/master/check_asm-offset.sh

Link to linux-ci job result:
https://github.com/disgoel/linux-ci/actions?query=branch%3Akvmmacro.v2

Disha Goel (1):
  powerpc/kvm: Remove unused macros from asm-offset

Kajol Jain (1):
  powerpc/kvm: Remove unused references for MMCR3/SIER2/SIER3 registers

 arch/powerpc/include/asm/kvm_book3s_asm.h |  2 +-
 arch/powerpc/kernel/asm-offsets.c | 25 ---
 2 files changed, 1 insertion(+), 26 deletions(-)

-- 
2.31.1



[PATCH v2 1/2] powerpc/kvm: Remove unused macros from asm-offset

2022-09-16 Thread Disha Goel
The kvm code was refactored to convert some of kvm assembly routines to C.
This includes commits which moved code path for the kvm guest entry/exit
for p7/8 from aseembly to C. As part of the code changes, usage of some of
the macros were removed. But definitions still exist in the assembly files.
Commits are listed below:

Commit 2e1ae9cd56f8 ("KVM: PPC: Book3S HV: Implement radix prefetch workaround 
by disabling MMU")
Commit 9769a7fd79b6 ("KVM: PPC: Book3S HV: Remove radix guest support from P7/8 
path")
Commit fae5c9f3664b ("KVM: PPC: Book3S HV: remove ISA v3.0 and v3.1 support 
from P7/8 path")
Commit 57dc0eed73ca ("KVM: PPC: Book3S HV P9: Implement PMU save/restore in C")

Many of the asm-offset macro definitions were missed to remove. Patch
fixes by removing the unused macros.

Signed-off-by: Disha Goel 
---
 arch/powerpc/kernel/asm-offsets.c | 25 -
 1 file changed, 25 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 8c10f536e478..95744e7c056d 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -418,21 +418,18 @@ int main(void)
 
/* book3s */
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
-   OFFSET(KVM_TLB_SETS, kvm, arch.tlb_sets);
OFFSET(KVM_SDR1, kvm, arch.sdr1);
OFFSET(KVM_HOST_LPID, kvm, arch.host_lpid);
OFFSET(KVM_HOST_LPCR, kvm, arch.host_lpcr);
OFFSET(KVM_HOST_SDR1, kvm, arch.host_sdr1);
OFFSET(KVM_ENABLED_HCALLS, kvm, arch.enabled_hcalls);
OFFSET(KVM_VRMA_SLB_V, kvm, arch.vrma_slb_v);
-   OFFSET(KVM_RADIX, kvm, arch.radix);
OFFSET(KVM_SECURE_GUEST, kvm, arch.secure_guest);
OFFSET(VCPU_DSISR, kvm_vcpu, arch.shregs.dsisr);
OFFSET(VCPU_DAR, kvm_vcpu, arch.shregs.dar);
OFFSET(VCPU_VPA, kvm_vcpu, arch.vpa.pinned_addr);
OFFSET(VCPU_VPA_DIRTY, kvm_vcpu, arch.vpa.dirty);
OFFSET(VCPU_HEIR, kvm_vcpu, arch.emul_inst);
-   OFFSET(VCPU_NESTED, kvm_vcpu, arch.nested);
OFFSET(VCPU_CPU, kvm_vcpu, cpu);
OFFSET(VCPU_THREAD_CPU, kvm_vcpu, arch.thread_cpu);
 #endif
@@ -449,16 +446,12 @@ int main(void)
OFFSET(VCPU_DABRX, kvm_vcpu, arch.dabrx);
OFFSET(VCPU_DAWR0, kvm_vcpu, arch.dawr0);
OFFSET(VCPU_DAWRX0, kvm_vcpu, arch.dawrx0);
-   OFFSET(VCPU_DAWR1, kvm_vcpu, arch.dawr1);
-   OFFSET(VCPU_DAWRX1, kvm_vcpu, arch.dawrx1);
OFFSET(VCPU_CIABR, kvm_vcpu, arch.ciabr);
OFFSET(VCPU_HFLAGS, kvm_vcpu, arch.hflags);
OFFSET(VCPU_DEC_EXPIRES, kvm_vcpu, arch.dec_expires);
OFFSET(VCPU_PENDING_EXC, kvm_vcpu, arch.pending_exceptions);
OFFSET(VCPU_CEDED, kvm_vcpu, arch.ceded);
OFFSET(VCPU_PRODDED, kvm_vcpu, arch.prodded);
-   OFFSET(VCPU_IRQ_PENDING, kvm_vcpu, arch.irq_pending);
-   OFFSET(VCPU_DBELL_REQ, kvm_vcpu, arch.doorbell_request);
OFFSET(VCPU_MMCR, kvm_vcpu, arch.mmcr);
OFFSET(VCPU_MMCRA, kvm_vcpu, arch.mmcra);
OFFSET(VCPU_MMCRS, kvm_vcpu, arch.mmcrs);
@@ -486,8 +479,6 @@ int main(void)
OFFSET(VCPU_TCSCR, kvm_vcpu, arch.tcscr);
OFFSET(VCPU_ACOP, kvm_vcpu, arch.acop);
OFFSET(VCPU_WORT, kvm_vcpu, arch.wort);
-   OFFSET(VCPU_TID, kvm_vcpu, arch.tid);
-   OFFSET(VCPU_PSSCR, kvm_vcpu, arch.psscr);
OFFSET(VCPU_HFSCR, kvm_vcpu, arch.hfscr);
OFFSET(VCORE_ENTRY_EXIT, kvmppc_vcore, entry_exit_map);
OFFSET(VCORE_IN_GUEST, kvmppc_vcore, in_guest);
@@ -582,8 +573,6 @@ int main(void)
HSTATE_FIELD(HSTATE_HWTHREAD_STATE, hwthread_state);
HSTATE_FIELD(HSTATE_KVM_VCPU, kvm_vcpu);
HSTATE_FIELD(HSTATE_KVM_VCORE, kvm_vcore);
-   HSTATE_FIELD(HSTATE_XIVE_TIMA_PHYS, xive_tima_phys);
-   HSTATE_FIELD(HSTATE_XIVE_TIMA_VIRT, xive_tima_virt);
HSTATE_FIELD(HSTATE_HOST_IPI, host_ipi);
HSTATE_FIELD(HSTATE_PTID, ptid);
HSTATE_FIELD(HSTATE_FAKE_SUSPEND, fake_suspend);
@@ -594,9 +583,6 @@ int main(void)
HSTATE_FIELD(HSTATE_SDAR, host_mmcr[4]);
HSTATE_FIELD(HSTATE_MMCR2, host_mmcr[5]);
HSTATE_FIELD(HSTATE_SIER, host_mmcr[6]);
-   HSTATE_FIELD(HSTATE_MMCR3, host_mmcr[7]);
-   HSTATE_FIELD(HSTATE_SIER2, host_mmcr[8]);
-   HSTATE_FIELD(HSTATE_SIER3, host_mmcr[9]);
HSTATE_FIELD(HSTATE_PMC1, host_pmc[0]);
HSTATE_FIELD(HSTATE_PMC2, host_pmc[1]);
HSTATE_FIELD(HSTATE_PMC3, host_pmc[2]);
@@ -672,17 +658,6 @@ int main(void)
OFFSET(VCPU_HOST_MAS6, kvm_vcpu, arch.host_mas6);
 #endif
 
-#ifdef CONFIG_KVM_XICS
-   DEFINE(VCPU_XIVE_SAVED_STATE, offsetof(struct kvm_vcpu,
-  arch.xive_saved_state));
-   DEFINE(VCPU_XIVE_CAM_WORD, offsetof(struct kvm_vcpu,
-   arch.xive_cam_word));
-   DEFINE(VCPU_XIVE_PUSHED, offsetof(struct kvm_vcpu, arch.xive_pushed));
-   DEFINE(VCPU_XIVE_ESC_ON, offsetof(struct kvm_

[PATCH v2 2/2] powerpc/kvm: Remove unused references for MMCR3/SIER2/SIER3 registers

2022-09-16 Thread Disha Goel
From: Kajol Jain" 

Commit 57dc0eed73ca ("KVM: PPC: Book3S HV P9: Implement PMU save/restore
in C") removed the PMU save/restore functions from assembly code and
implemented these functions in C, for power9 and later platforms.

After the code refactoring, Performance Monitoring Unit (PMU) registers
became part of "p9_host_os_sprs" structure and now this structure is
used to save/restore pmu host registers, for power9 and later platfroms.
But we still have old unused registers references. Patch removes unused
host_mmcr references for Monitor Mode Control Register 3 (MMCR3)/
Sampled Instruction Event Register 2 (SIER2)/ SIER3 registers from
"struct kvmppc_host_state".

Fixes: 57dc0eed73ca ("KVM: PPC: Book3S HV P9: Implement PMU save/restore in C")
Signed-off-by: Kajol Jain 
---
 arch/powerpc/include/asm/kvm_book3s_asm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h 
b/arch/powerpc/include/asm/kvm_book3s_asm.h
index c8882d9b86c2..a36797938620 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -105,7 +105,7 @@ struct kvmppc_host_state {
void __iomem *xive_tima_virt;
u32 saved_xirr;
u64 dabr;
-   u64 host_mmcr[10];  /* MMCR 0,1,A, SIAR, SDAR, MMCR2, SIER, MMCR3, 
SIER2/3 */
+   u64 host_mmcr[7];   /* MMCR 0,1,A, SIAR, SDAR, MMCR2, SIER */
u32 host_pmc[8];
u64 host_purr;
u64 host_spurr;
-- 
2.31.1



Re: [PATCH V2] tools/perf/tests: Fix perf probe error log check in skip_if_no_debuginfo

2022-09-16 Thread Disha Goel


On 9/16/22 4:19 PM, Athira Rajeev wrote:

The perf probe related tests like probe_vfs_getname.sh which
is in "tools/perf/tests/shell" directory have dependency on
debuginfo information in the kernel. Currently debuginfo
check is handled by skip_if_no_debuginfo function in the
file "lib/probe_vfs_getname.sh". skip_if_no_debuginfo function
looks for this specific error log from perf probe to skip
the testcase:

<<>>
Failed to find the path for the kernel|Debuginfo-analysis is
not supported
<>>

But in some case, like this one in powerpc, while running this
test, observed error logs is:

<<>>
The /lib/modules//build/vmlinux file has no debug information.
Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo
package.
   Error: Failed to add events.
<<>>

Update the skip_if_no_debuginfo function to include the above
error, to skip the test in these scenarios too.

Reported-by: Disha Goel
Signed-off-by: Athira Rajeev


Tested-by: Disha Goel


---
changelog:
  v1 -> v2:
  Corrected formatting of spaces in error log.
  With spaces in v1 of the patch, the egrep search was
  considering spaces also.

  tools/perf/tests/shell/lib/probe_vfs_getname.sh | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/lib/probe_vfs_getname.sh 
b/tools/perf/tests/shell/lib/probe_vfs_getname.sh
index 5b17d916c555..b616d42bd19d 100644
--- a/tools/perf/tests/shell/lib/probe_vfs_getname.sh
+++ b/tools/perf/tests/shell/lib/probe_vfs_getname.sh
@@ -19,6 +19,6 @@ add_probe_vfs_getname() {
  }

  skip_if_no_debuginfo() {
-   add_probe_vfs_getname -v 2>&1 | egrep -q "^(Failed to find the path for the 
kernel|Debuginfo-analysis is not supported)" && return 2
+   add_probe_vfs_getname -v 2>&1 | egrep -q "^(Failed to find the path for the 
kernel|Debuginfo-analysis is not supported)|(file has no debug information)" && return 2
return 1
  }

Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value

2022-09-16 Thread Disha Goel


On 9/13/22 5:27 PM, Athira Rajeev wrote:

perf stat includes option to specify aggr_mode to display
per-socket, per-core, per-die, per-node counter details.
Also there is option -A ( AGGR_NONE, -no-aggr ), where the
counter values are displayed for each cpu along with "CPU"
value in one field of the output.

Each of the aggregate mode uses the information fetched
from "/sys/devices/system/cpu/cpuX/topology" like core_id,
physical_package_id. Utility functions in "cpumap.c" fetches
this information and populates the socket id, core id, cpu etc.
If the platform does not expose the topology information,
these values will be set to -1. Example, in case of powerpc,
details like physical_package_id is restricted to be exposed
in pSeries platform. So id.socket, id.core, id.cpu all will
be set as -1.

In case of displaying socket or die value, there is no check
done in the "aggr_printout" function to see if it points to
valid socket id or die. But for displaying "cpu" value, there
is a check for "if (id.core > -1)". In case of powerpc pSeries
where detail like physical_package_id is restricted to be
exposed, id.core will be set to -1. Hence the column or field
itself for CPU won't be displayed in the output.

Result for per-socket:

<<>>
perf stat -e branches --per-socket -a true

  Performance counter stats for 'system wide':

S-1  32416,851  branches
<<>>

Here S has -1 in above result. But with -A option which also
expects CPU in one column in the result, below is observed.

<<>>
  /bin/perf stat -e instructions -A -a true

  Performance counter stats for 'system wide':

 47,146  instructions
 45,226  instructions
 43,354  instructions
 45,184  instructions
<<>>

If the cpu id value is pointing to -1 also, it makes sense
to display the column in the output to replicate the behaviour
or to be in precedence with other aggr options(like per-socket,
per-core). Remove the check "id.core" so that CPU field gets
displayed in the output.

After the fix:

<<>>
perf stat -e instructions -A -a true

  Performance counter stats for 'system wide':

CPU-1  64,034  instructions
CPU-1  68,941  instructions
CPU-1  59,418  instructions
CPU-1  70,478  instructions
CPU-1  65,201  instructions
CPU-1  63,704  instructions
<<>>

This is caught while running "perf test" for
"stat+json_output.sh" and "stat+csv_output.sh".

Reported-by: Disha Goel
Signed-off-by: Athira Rajeev


Tested-by: Disha Goel


---
  tools/perf/util/stat-display.c | 6 ++
  1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index b82844cb0ce7..1b751a730271 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
id.socket,
id.die,
id.core);
-   } else if (id.core > -1) {
+   } else
fprintf(config->output, "\"cpu\" : \"%d\", ",
id.cpu.cpu);
-   }
} else {
if (evsel->percore && !config->percore_show_thread) {
fprintf(config->output, "S%d-D%d-C%*d%s",
@@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
id.die,
config->csv_output ? 0 : -3,
id.core, config->csv_sep);
-   } else if (id.core > -1) {
+   } else
fprintf(config->output, "CPU%*d%s",
config->csv_output ? 0 : -7,
id.cpu.cpu, config->csv_sep);
-   }
}
break;
case AGGR_THREAD:

Re: [PATCH 1/3] powerpc/perf: Fix branch_filter support for multiple filters in powerpc

2022-09-16 Thread Disha Goel


On 9/13/22 5:25 PM, Athira Rajeev wrote:

For PERF_SAMPLE_BRANCH_STACK sample type, different branch_sample_type
ie branch filters are supported. The branch filters are requested via
event attribute "branch_sample_type". Multiple branch filters can be
passed in event attribute.

Example:
perf record -b -o- -B --branch-filter any,ind_call true

Powerpc does not support having multiple branch filters at
sametime. In powerpc, branch filters for branch stack sampling
is set via MMCRA IFM bits [32:33]. But currently when requesting
for multiple filter types, the "perf record" command does not
report any error.

Example:
perf record -b -o- -B --branch-filter any,save_type true
perf record -b -o- -B --branch-filter any,ind_call true

The "bhrb_filter_map" function in PMU driver code does the
validity check for supported branch filters. But this check
is done for single filter. Hence "perf record" will proceed
here without reporting any error.

Fix power_pmu_event_init to return EOPNOTSUPP when multiple
branch filters are requested in the event attr.

After the fix:
perf record --branch-filter any,ind_call -- ls
Error:
cycles: PMU Hardware doesn't support sampling/overflow-interrupts.
Try 'perf stat'

Reported-by: Disha Goel
Signed-off-by: Athira Rajeev
Reviewed-by: Madhavan Srinivasan


Tested-by: Disha Goel


---
  arch/powerpc/perf/core-book3s.c | 15 +++
  1 file changed, 15 insertions(+)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 13919eb96931..2c2d235cb8d8 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2131,6 +2131,21 @@ static int power_pmu_event_init(struct perf_event *event)
if (has_branch_stack(event)) {
u64 bhrb_filter = -1;

+   /*
+* powerpc does not support having multiple branch filters
+* at sametime. Branch filters are set via MMCRA IFM[32:33]
+* bits for power8 and above. Return EOPNOTSUPP when multiple
+* branch filters are requested in the event attr.
+*
+* When opening event via perf_event_open, branch_sample_type
+* gets adjusted in perf_copy_attr function. Kernel will
+* automatically adjust the branch_sample_type based on the
+* event modifier settings to include 
PERF_SAMPLE_BRANCH_PLM_ALL.
+* Hence drop the check for PERF_SAMPLE_BRANCH_PLM_ALL.
+*/
+   if (hweight64(event->attr.branch_sample_type & 
~PERF_SAMPLE_BRANCH_PLM_ALL) > 1)
+   return -EOPNOTSUPP;
+
if (ppmu->bhrb_filter_map)
bhrb_filter = ppmu->bhrb_filter_map(
event->attr.branch_sample_type);

Re: [PATCH 1/2] tools/perf/tests/shell: Update stat+csv_output.sh to include sanity check for topology

2022-10-17 Thread Disha Goel



On 10/6/22 9:21 PM, Athira Rajeev wrote:

Testcase stat+csv_output.sh fails in powerpc:

84: perf stat CSV output linter: FAILED!

The testcase "stat+csv_output.sh" verifies perf stat
CSV output. The test covers aggregation modes like
per-socket, per-core, per-die, -A (no_aggr mode) along
with few other tests. It counts expected fields for
various commands. For example say -A (i.e, AGGR_NONE
mode), expects 7 fields in the output having "CPU" as
first field. Same way, for per-socket, it expects the
first field in result to point to socket id. The testcases
compares the result with expected count.

The values for socket, die, core and cpu are fetched
from topology directory:
/sys/devices/system/cpu/cpu*/topology.
For example, socket value is fetched from
"physical_package_id" file of topology directory.
(cpu__get_topology_int() in util/cpumap.c)

If a platform fails to fetch the topology information,
values will be set to -1. For example, incase of pSeries
platform of powerpc, value for  "physical_package_id" is
restricted and not exposed. So, -1 will be assigned.

Perf code has a checks for valid cpu id in "aggr_printout"
(stat-display.c), which displays the fields. So, in cases
where topology values not exposed, first field of the
output displaying will be empty. This cause the testcase
to fail, as it counts  number of fields in the output.

Incase of -A (AGGR_NONE mode,), testcase expects 7 fields
in the output, becos of -1 value obtained from topology
files for some, only 6 fields are printed. Hence a testcase
failure reported due to mismatch in number of fields in
the output.

Patch here adds a sanity check in the testcase for topology.
Check will help to skip the test if -1 value found.

Fixes: 7473ee56dbc9 ("perf test: Add checking for perf stat CSV output.")
Reported-by: Disha Goel 
Signed-off-by: Athira Rajeev 
Suggested-by: James Clark 
Suggested-by: Ian Rogers 


For the patchset,

Tested-by: Disha Goel 


---
  tools/perf/tests/shell/stat+csv_output.sh | 43 ---
  1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/tools/perf/tests/shell/stat+csv_output.sh 
b/tools/perf/tests/shell/stat+csv_output.sh
index eb5196f58190..b7f050aa6210 100755
--- a/tools/perf/tests/shell/stat+csv_output.sh
+++ b/tools/perf/tests/shell/stat+csv_output.sh
@@ -6,6 +6,8 @@

  set -e

+skip_test=0
+
  function commachecker()
  {
local -i cnt=0
@@ -156,14 +158,47 @@ check_per_socket()
echo "[Success]"
  }

+# The perf stat options for per-socket, per-core, per-die
+# and -A ( no_aggr mode ) uses the info fetched from this
+# directory: "/sys/devices/system/cpu/cpu*/topology". For
+# example, socket value is fetched from "physical_package_id"
+# file in topology directory.
+# Reference: cpu__get_topology_int in util/cpumap.c
+# If the platform doesn't expose topology information, values
+# will be set to -1. For example, incase of pSeries platform
+# of powerpc, value for  "physical_package_id" is restricted
+# and set to -1. Check here validates the socket-id read from
+# topology file before proceeding further
+
+FILE_LOC="/sys/devices/system/cpu/cpu*/topology/"
+FILE_NAME="physical_package_id"
+
+check_for_topology()
+{
+   if ! ParanoidAndNotRoot 0
+   then
+   socket_file=`ls $FILE_LOC/$FILE_NAME | head -n 1`
+   [ -z $socket_file ] && return 0
+   socket_id=`cat $socket_file`
+   [ $socket_id == -1 ] && skip_test=1
+   return 0
+   fi
+}
+
+check_for_topology
  check_no_args
  check_system_wide
-check_system_wide_no_aggr
  check_interval
  check_event
-check_per_core
  check_per_thread
-check_per_die
  check_per_node
-check_per_socket
+if [ $skip_test -ne 1 ]
+then
+   check_system_wide_no_aggr
+   check_per_core
+   check_per_die
+   check_per_socket
+else
+   echo "[Skip] Skipping tests for system_wide_no_aggr, per_core, per_die and 
per_socket since socket id exposed via topology is invalid"
+fi
  exit 0


Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events

2023-11-27 Thread Disha Goel

On 23/11/23 9:31 pm, Athira Rajeev wrote:


Running "perf list" on powerpc fails with segfault
as below:

./perf list
Segmentation fault (core dumped)

This happens because of duplicate events in the json list.
The powerpc Json event list contains some event with same
event name, but different event code. They are:
- PM_INST_FROM_L3MISS (Present in datasource and frontend)
- PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
- PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
- PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)

pmu_events_table__num_events uses the value from
table_pmu->num_entries which includes duplicate events as
well. This causes issue during "perf list" and results in
segmentation fault.

Since both event codes are valid, append _DSRC to the Data
Source events (datasource.json), so that they would have a
unique name. Also add PM_DATA_FROM_L2MISS_DSRC and
PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list
works as expected.

Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events")
Signed-off-by: Athira Rajeev 


I have tested the patch on Power10 machine. Perf list works correctly without 
any segfault now.

# ./perf list

List of pre-defined events (to be used in -e or -M):

  branch-instructions OR branches[Hardware event]
  branch-misses  [Hardware event]

Tested-by: Disha Goel 


---
  .../arch/powerpc/power10/datasource.json   | 18 ++
  1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json 
b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
index 6b0356f2d301..0eeaaf1a95b8 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
@@ -99,6 +99,11 @@
  "EventName": "PM_INST_FROM_L2MISS",
  "BriefDescription": "The processor's instruction cache was reloaded from a 
source beyond the local core's L2 due to a demand miss."
},
+  {
+"EventCode": "0x0003C000C040",
+"EventName": "PM_DATA_FROM_L2MISS_DSRC",
+"BriefDescription": "The processor's L1 data cache was reloaded from a source 
beyond the local core's L2 due to a demand miss."
+  },
{
  "EventCode": "0x00038010C040",
  "EventName": "PM_INST_FROM_L2MISS_ALL",
@@ -161,9 +166,14 @@
},
{
  "EventCode": "0x00078000C040",
-"EventName": "PM_INST_FROM_L3MISS",
+"EventName": "PM_INST_FROM_L3MISS_DSRC",
  "BriefDescription": "The processor's instruction cache was reloaded from 
beyond the local core's L3 due to a demand miss."
},
+  {
+"EventCode": "0x0007C000C040",
+"EventName": "PM_DATA_FROM_L3MISS_DSRC",
+"BriefDescription": "The processor's L1 data cache was reloaded from beyond the 
local core's L3 due to a demand miss."
+  },
{
  "EventCode": "0x00078010C040",
  "EventName": "PM_INST_FROM_L3MISS_ALL",
@@ -981,7 +991,7 @@
},
{
  "EventCode": "0x0003C000C142",
-"EventName": "PM_MRK_DATA_FROM_L2MISS",
+"EventName": "PM_MRK_DATA_FROM_L2MISS_DSRC",
  "BriefDescription": "The processor's L1 data cache was reloaded from a source 
beyond the local core's L2 due to a demand miss for a marked instruction."
},
{
@@ -1046,12 +1056,12 @@
},
{
  "EventCode": "0x00078000C142",
-"EventName": "PM_MRK_INST_FROM_L3MISS",
+"EventName": "PM_MRK_INST_FROM_L3MISS_DSRC",
  "BriefDescription": "The processor's instruction cache was reloaded from 
beyond the local core's L3 due to a demand miss for a marked instruction."
},
{
  "EventCode": "0x0007C000C142",
-"EventName": "PM_MRK_DATA_FROM_L3MISS",
+"EventName": "PM_MRK_DATA_FROM_L3MISS_DSRC",
  "BriefDescription": "The processor's L1 data cache was reloaded from beyond 
the local core's L3 due to a demand miss for a marked instruction."
},
{


Re: [PATCH] perf test record+probe_libc_inet_pton: Fix call chain match on powerpc

2023-11-28 Thread Disha Goel

On 26/11/23 12:39 pm, Likhitha Korrapati wrote:


The perf test "probe libc's inet_pton & backtrace it with ping" fails on
powerpc as below:

root@xxx perf]# perf test -v "probe libc's inet_pton & backtrace it with
ping"
  85: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 96028
ping 96056 [002] 127271.101961: probe_libc:inet_pton: (7fffa1779a60)
7fffa1779a60 __GI___inet_pton+0x0
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
7fffa172a73c getaddrinfo+0x121c
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
FAIL: expected backtrace entry
"gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$"
got "7fffa172a73c getaddrinfo+0x121c
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)"
test child finished with -1
 end 
probe libc's inet_pton & backtrace it with ping: FAILED!

This test installs a probe on libc's inet_pton function, which will use
uprobes and then uses perf trace on a ping to localhost. It gets 3
levels deep backtrace and checks whether it is what we expected or not.

The test started failing from RHEL 9.4 where as it works in previous
distro version (RHEL 9.2). Test expects gaih_inet function to be part of
backtrace. But in the glibc version (2.34-86) which is part of distro
where it fails, this function is missing and hence the test is failing.

 From nm and ping command output we can confirm that gaih_inet function
is not present in the expected backtrace for glibc version glibc-2.34-86

[root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet
001273e0 t gaih_inet_serv
001cd8d8 r gaih_inet_typeproto

[root@xxx perf]# perf script -i /tmp/perf.data.6E8
ping  104048 [000] 128582.508976: probe_libc:inet_pton: (7fff83779a60)
 7fff83779a60 __GI___inet_pton+0x0
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
 7fff8372a73c getaddrinfo+0x121c
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
11dc73534 [unknown] (/usr/bin/ping)
 7fff8362a8c4 __libc_start_call_main+0x84
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)

FAIL: expected backtrace entry
"gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$"
got "7fff9d52a73c getaddrinfo+0x121c
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)"

With version glibc-2.34-60 gaih_inet function is present as part of the
expected backtrace. So we cannot just remove the gaih_inet function from
the backtrace.

[root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet
00130490 t gaih_inet.constprop.0
0012e830 t gaih_inet_serv
001d45e4 r gaih_inet_typeproto

[root@xxx perf]# ./perf script -i /tmp/perf.data.b6S
ping   67906 [000] 22699.591699: probe_libc:inet_pton_3: (7fffbdd80820)
 7fffbdd80820 __GI___inet_pton+0x0
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
 7fffbdd31160 gaih_inet.constprop.0+0xcd0
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
 7fffbdd31c7c getaddrinfo+0x14c
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
1140d3558 [unknown] (/usr/bin/ping)

This patch solves this issue by doing a conditional skip. If there is a
gaih_inet function present in the libc then it will be added to the
expected backtrace else the function will be skipped from being added
to the expected backtrace.

Output with the patch

[root@xxx perf]# ./perf test -v "probe libc's inet_pton & backtrace it
with ping"
  83: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 102662
ping 102692 [000] 127935.549973: probe_libc:inet_pton: (7fff93379a60)
7fff93379a60 __GI___inet_pton+0x0
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
7fff9332a73c getaddrinfo+0x121c
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
11ef03534 [unknown] (/usr/bin/ping)
test child finished with 0
 end 
probe libc's inet_pton & backtrace it with ping: Ok

Signed-off-by: Likhitha Korrapati 
Reported-by: Disha Goel 


Thanks for the fix patch.
I have tested on a Power10 machine, "probe libc's inet_pton & backtrace it with 
ping"
perf test passes with the patch applied.

Output where gaih_inet function is not present

# perf test -v "probe libc's inet_pton & backtrace it with ping"
 85: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 4622
ping 4652 [011] 58.987631: probe_libc:inet_pton: (7fff91b79a60)
7fff91b79a60 __GI___inet_pton+0x0 
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
7fff91b2a73c getaddrinfo+0x121c 
(/usr/lib64/glibc-hwcaps/power10/libc.so.6)
119e53534 [unknown] (/usr/bin/ping)
test child finished with 0
 end 
probe libc's inet_pton & backtrace it 

Re: [PATCH V3] tools/perf/tests: Fix string substitutions in build id test

2023-01-20 Thread Disha Goel

Environment with /bin/sh
# readlink -f $(which sh)
/bin/dash

Running perf test from tmp.perf/urgent

# ./perf test -v "build id cache operations"
 73: build id cache operations   :
--- start ---
test child forked, pid 71063
WARNING: wine not found. PE binaries will not be run.
test binaries: /tmp/perf.ex.SHA1.RNm /tmp/perf.ex.MD5.Flx 
./tests/shell/../pe-file.exe
DEBUGINFOD_URLS=
Adding 4abd406f041feb4f10ecde3fc30fd0639e1a91cb /tmp/perf.ex.SHA1.RNm: Ok
build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
./tests/shell/buildid.sh: 69: ./tests/shell/buildid.sh: Bad substitution
test child finished with -2
 end 

After applying the patch in same environment, error is not seen and test runs 
fine.

Tested the same patch in bash as well.
# readlink -f $(which sh)
/usr/bin/bash

The test works fine with the changes.

Tested-by: Disha Goel

On 1/19/23 7:57 PM, Athira Rajeev wrote:

The perf test named “build id cache operations” skips with below
error on some distros:

<<>>
  78: build id cache operations   :
test child forked, pid 01
WARNING: wine not found. PE binaries will not be run.
test binaries: /tmp/perf.ex.SHA1.PKz /tmp/perf.ex.MD5.Gt3 
./tests/shell/../pe-file.exe
DEBUGINFOD_URLS=
Adding 4abd406f041feb4f10ecde3fc30fd0639e1a91cb /tmp/perf.ex.SHA1.PKz: Ok
build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
./tests/shell/buildid.sh: 69: ./tests/shell/buildid.sh: Bad substitution
test child finished with -2
build id cache operations: Skip
<<>>

The test script "tests/shell/buildid.sh" uses some of the
string substitution ways which are supported in bash, but not in
"sh" or other shells. Above error on line number 69 that reports
"Bad substitution" is:

<<>>
link=${build_id_dir}/.build-id/${id:0:2}/${id:2}
<<>>

Here the way of getting first two characters from id ie,
${id:0:2} and similarly expressions like ${id:2} is not
recognised in "sh". So the line errors and instead of
hitting failure, the test gets skipped as shown in logs.
So the syntax issue causes test not to be executed in
such cases. Similarly usage : "${@: -1}" [ to pick last
argument passed to a function] in “test_record” doesn’t
work in all distros.

Fix this by using alternative way with shell substitution
to pick required characters from the string. Also fix the
usage of “${@: -1}” to work in all cases.

Another usage in “test_record” is:
<<>>
${perf} record --buildid-all -o ${data} $@ &> ${log}
<<>>

This causes the perf record to start in background and
Results in the data file not being created by the time
"check" function is invoked. Below log shows perf record
result getting displayed after the call to "check" function.

<<>>
running: perf record /tmp/perf.ex.SHA1.EAU
build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
link: /tmp/perf.debug.mLT/.build-id/4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb
failed: link 
/tmp/perf.debug.mLT/.build-id/4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb does 
not exist
test child finished with -1
build id cache operations: FAILED!
root@machine:~/athira/linux/tools/perf# Couldn't synthesize bpf events.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.010 MB /tmp/perf.data.bFF ]
<<>>

Fix this by redirecting output instead of using “&” which
starts the command in background.

Signed-off-by: Athira Rajeev
Acked-by: Ian Rogers
---
Changelog:
 From v2 -> v3
- Addressed review comments from David Laight
   for using shell substitutions.

 From v1 -> v2
- Added Acked-by from Ian.
- Rebased to tmp.perf/urgent of:
  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git

  tools/perf/tests/shell/buildid.sh | 18 +++---
  1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/tools/perf/tests/shell/buildid.sh 
b/tools/perf/tests/shell/buildid.sh
index aaf851108ca3..0ce22ea0a7f1 100755
--- a/tools/perf/tests/shell/buildid.sh
+++ b/tools/perf/tests/shell/buildid.sh
@@ -66,7 +66,9 @@ check()
esac
echo "build id: ${id}"
  
-	link=${build_id_dir}/.build-id/${id:0:2}/${id:2}

+   id_file=${id#??}
+   id_dir=${id%$id_file}
+   link=$build_id_dir/.build-id/$id_dir/$id_file
  	echo "link: ${link}" if [ ! -h $link ]; then @@ -74,7 +76,7 @@ check() exit 1 fi - 
file=${build_id_dir}/.build-id/${id:0:2}/`readlink ${link}`/elf + 
file=${build_id_dir}/.build-id/$id_dir/`readlink ${link}`/elf echo "file: ${file}"
  
  	# Check for file permission of original file

@@ -130,20 +132,22 @@ test_record()
  {
data=$(mktemp /tmp/perf.data.XXX)
build_id_dir=$(mktemp -d /tmp/perf.debug.XXX)
-   log=$(mktemp /tmp/perf.log.XXX)
+   log_out=$(mktemp /tmp/perf.log.out.XXX)
+   log_err=$(mktemp /tmp/perf.log.err.XXX)

Re: [PATCH] tools/perf/tests: Add system wide check for perf bench workload in all metric test

2023-02-06 Thread Disha Goel


On 2/2/23 10:14 PM, Kajol Jain wrote:

Testcase stat_all_metrics.sh fails in powerpc:
 
92: perf all metrics test : FAILED!


Logs with verbose:

[command]# ./perf test 92 -vv
  92: perf all metrics test   :
--- start ---
test child forked, pid 13262
Testing BRU_STALL_CPI
Testing COMPLETION_STALL_CPI

Testing TOTAL_LOCAL_NODE_PUMPS_P23
Metric 'TOTAL_LOCAL_NODE_PUMPS_P23' not printed in:
Error:
Invalid event (hv_24x7/PM_PB_LNS_PUMP23,chip=3/) in per-thread mode, enable 
system wide with '-a'.
Testing TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01
Metric 'TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01' not printed in:
Error:
Invalid event (hv_24x7/PM_PB_RTY_LNS_PUMP01,chip=3/) in per-thread mode, enable 
system wide with '-a'.


Based on above logs, we could see some of the hv-24x7 metric events fails,
and logs suggest to run the metric event with -a option.
This change happened after the commit a4b8cfcabb1d ("perf stat: Delay metric
parsing"), which delayed the metric parsing phase and now before metric parsing
phase perf tool identifies, whether target is system-wide or not. With this
change, perf_event_open will fails with workload monitoring for uncore events
as expected.

The perf all metric test case fails as some of the hv-24x7 metric events
may need bigger workload to get the data. And the added perf bench
workload in 'perf all metric test case' will not run for hv-24x7 without
-a option.

Fix this issue by adding system wide check for perf bench workload.

Result with the patch changes in powerpc:

92: perf all metrics test : Ok

Signed-off-by: Kajol Jain


Tested the patch on powerpc machine, perf metrics test works fine.

 91: perf all metrics test   : Ok

Tested-by: Disha Goel


---
  tools/perf/tests/shell/stat_all_metrics.sh | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/tools/perf/tests/shell/stat_all_metrics.sh 
b/tools/perf/tests/shell/stat_all_metrics.sh
index 6e79349e42be..d49832a316d9 100755
--- a/tools/perf/tests/shell/stat_all_metrics.sh
+++ b/tools/perf/tests/shell/stat_all_metrics.sh
@@ -23,6 +23,13 @@ for m in $(perf list --raw-dump metrics); do
then
  continue
fi
+  # Failed again, possibly the event is uncore pmu event which will need
+  # system wide monitoring with workload, so retry with -a option
+  result=$(perf stat -M "$m" -a perf bench internals synthesize 2>&1)
+  if [[ "$result" =~ "${m:0:50}" ]]
+  then
+continue
+  fi
echo "Metric '$m' not printed in:"
echo "$result"
if [[ "$err" != "1" ]]

Re: [PATCH] tests/bpf: Fix the bpf test to check for libtraceevent support

2023-02-06 Thread Disha Goel

On 1/31/23 7:20 PM, Athira Rajeev wrote:


"bpf" tests fails in environment with missing libtraceevent
support as below:

  # ./perf test 36
  36: BPF filter  :
  36.1: Basic BPF filtering   : FAILED!
  36.2: BPF pinning   : FAILED!
  36.3: BPF prologue generation   : FAILED!

The environment has clang but missing the libtraceevent
devel. Hence perf is compiled without libtraceevent support.

Detailed logs:
./perf test -v "Basic BPF filtering"

Failed to add BPF event syscalls:sys_enter_epoll_pwait
bpf: tracepoint call back failed, stop iterate
Failed to add events selected by BPF

The bpf tests tris to add probe event which fails
at "parse_events_add_tracepoint" function due to missing
libtraceevent. Add check for "HAVE_LIBTRACEEVENT" in the
"tests/bpf.c" before proceeding with the test.

With the change,

# ./perf test 36
36: BPF filter  :
36.1: Basic BPF filtering   : 
Skip (not compiled in or missing libtraceevent support)
36.2: BPF pinning   : 
Skip (not compiled in or missing libtraceevent support)
36.3: BPF prologue generation   : 
Skip (not compiled in or missing libtraceevent support)

Signed-off-by: Athira Rajeev


Tested the patch on powerpc, perf bpf test skips when libtraceevent-devel 
package is not installed.

 36: BPF filter  :
 36.1: Basic BPF filtering   : Skip 
(not compiled in or missing libtraceevent support)
 36.2: BPF pinning   : Skip 
(not compiled in or missing libtraceevent support)
 36.3: BPF prologue generation   : Skip 
(not compiled in or missing libtraceevent support)

Tested-by: Disha Goel


---
  tools/perf/tests/bpf.c | 22 +++---
  1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 17c023823713..4af39528f611 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -23,7 +23,7 @@
  #define NR_ITERS   111
  #define PERF_TEST_BPF_PATH "/sys/fs/bpf/perf_test"
  
-#ifdef HAVE_LIBBPF_SUPPORT

+#if defined(HAVE_LIBBPF_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
  #include 
  #include 
  
@@ -330,10 +330,10 @@ static int test__bpf(int i)

  static int test__basic_bpf_test(struct test_suite *test __maybe_unused,
int subtest __maybe_unused)
  {
-#ifdef HAVE_LIBBPF_SUPPORT
+#if defined(HAVE_LIBBPF_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
return test__bpf(0);
  #else
-   pr_debug("Skip BPF test because BPF support is not compiled\n");
+   pr_debug("Skip BPF test because BPF or libtraceevent support is not 
compiled\n");
return TEST_SKIP;
  #endif
  }
@@ -341,10 +341,10 @@ static int test__basic_bpf_test(struct test_suite *test 
__maybe_unused,
  static int test__bpf_pinning(struct test_suite *test __maybe_unused,
 int subtest __maybe_unused)
  {
-#ifdef HAVE_LIBBPF_SUPPORT
+#if defined(HAVE_LIBBPF_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
return test__bpf(1);
  #else
-   pr_debug("Skip BPF test because BPF support is not compiled\n");
+   pr_debug("Skip BPF test because BPF or libtraceevent support is not 
compiled\n");
return TEST_SKIP;
  #endif
  }
@@ -352,17 +352,17 @@ static int test__bpf_pinning(struct test_suite *test 
__maybe_unused,
  static int test__bpf_prologue_test(struct test_suite *test __maybe_unused,
   int subtest __maybe_unused)
  {
-#if defined(HAVE_LIBBPF_SUPPORT) && defined(HAVE_BPF_PROLOGUE)
+#if defined(HAVE_LIBBPF_SUPPORT) && defined(HAVE_BPF_PROLOGUE) && 
defined(HAVE_LIBTRACEEVENT)
return test__bpf(2);
  #else
-   pr_debug("Skip BPF test because BPF support is not compiled\n");
+   pr_debug("Skip BPF test because BPF or libtraceevent support is not 
compiled\n");
return TEST_SKIP;
  #endif
  }
  
  
  static struct test_case bpf_tests[] = {

-#ifdef HAVE_LIBBPF_SUPPORT
+#if defined(HAVE_LIBBPF_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
TEST_CASE("Basic BPF filtering", basic_bpf_test),
TEST_CASE_REASON("BPF pinning", bpf_pinning,
"clang isn't installed or environment missing BPF 
support"),
@@ -373,9 +373,9 @@ static struct test_case bpf_tests[] = {
TEST_CASE_RE

Re: [PATCH 1/2] tools/perf: Fix printing os->prefix in CSV metrics output

2022-11-04 Thread Disha Goel


On 11/4/22 12:59 PM, Athira Rajeev wrote:



On 03-Nov-2022, at 9:45 PM, Ian Rogers  wrote:

On Wed, Nov 2, 2022 at 1:36 AM Athira Rajeev
  wrote:




On 18-Oct-2022, at 2:26 PM, Athira Rajeev  wrote:

Perf stat with CSV output option prints an extra empty
string as first field in metrics output line.
Sample output below:

  # ./perf stat -x, --per-socket -a -C 1 ls
  S0,1,1.78,msec,cpu-clock,1785146,100.00,0.973,CPUs utilized
  S0,1,26,,context-switches,1781750,100.00,0.015,M/sec
  S0,1,1,,cpu-migrations,1780526,100.00,0.561,K/sec
  S0,1,1,,page-faults,1779060,100.00,0.561,K/sec
  S0,1,875807,,cycles,1769826,100.00,0.491,GHz
  S0,1,85281,,stalled-cycles-frontend,1767512,100.00,9.74,frontend cycles 
idle
  S0,1,576839,,stalled-cycles-backend,1766260,100.00,65.86,backend cycles 
idle
  S0,1,288430,,instructions,1762246,100.00,0.33,insn per cycle
> ,S0,1,,,2.00,stalled cycles per insn

The above command line uses field separator as ","
via "-x," option and per-socket option displays
socket value as first field. But here the last line
for "stalled cycles per insn" has "," in the
beginning.

Sample output using interval mode:
  # ./perf stat -I 1000 -x, --per-socket -a -C 1 ls
  0.001813453,S0,1,1.87,msec,cpu-clock,1872052,100.00,0.002,CPUs utilized
  0.001813453,S0,1,2,,context-switches,1868028,100.00,1.070,K/sec
  --
  0.001813453,S0,1,85379,,instructions,1856754,100.00,0.32,insn per cycle
> 0.001813453,,S0,1,,,1.34,stalled cycles per insn

Above result also has an extra csv separator after
the timestamp. Patch addresses extra field separator
in the beginning of the metric output line.

The counter stats are displayed by function
"perf_stat__print_shadow_stats" in code
"util/stat-shadow.c". While printing the stats info
for "stalled cycles per insn", function "new_line_csv"
is used as new_line callback.

The new_line_csv function has check for "os->prefix"
and if prefix is not null, it will be printed along
with cvs separator.
Snippet from "new_line_csv":
  if (os->prefix)
  fprintf(os->fh, "%s%s", os->prefix, config->csv_sep);

Here os->prefix gets printed followed by ","
which is the cvs separator. The os->prefix is
used in interval mode option ( -I ), to print
time stamp on every new line. But prefix is
already set to contain csv separator when used
in interval mode for csv option.

Reference: Function "static void print_interval"
Snippet:
  sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep);

Also if prefix is not assigned (if not used with
-I option), it gets set to empty string.
Reference: function printout() in util/stat-display.c
Snippet:
  .prefix = prefix ? prefix : "",

Since prefix already set to contain cvs_sep in interval
option, patch removes printing config->csv_sep in
new_line_csv function to avoid printing extra field.

After the patch:

  # ./perf stat -x, --per-socket -a -C 1 ls
  S0,1,2.04,msec,cpu-clock,2045202,100.00,1.013,CPUs utilized
  S0,1,2,,context-switches,2041444,100.00,979.289,/sec
  S0,1,0,,cpu-migrations,2040820,100.00,0.000,/sec
  S0,1,2,,page-faults,2040288,100.00,979.289,/sec
  S0,1,254589,,cycles,2036066,100.00,0.125,GHz
  S0,1,82481,,stalled-cycles-frontend,2032420,100.00,32.40,frontend cycles 
idle
  S0,1,113170,,stalled-cycles-backend,2031722,100.00,44.45,backend cycles 
idle
  S0,1,88766,,instructions,2030942,100.00,0.35,insn per cycle
  S0,1,,,1.27,stalled cycles per insn

Fixes: 92a61f6412d3 ("perf stat: Implement CSV metrics output")
Reported-by: Disha Goel
Signed-off-by: Athira Rajeev


perf stat csv test passed after applying the patch
Tested-by: Disha Goel


Hi All,

Looking for review comments for this change.

Hi,

Thanks for addressing issues in this code. What is the status of the
CSV output test following these changes?

I think going forward we need to move away from CSV and columns, to
something with structure like json. We also need to refactor this
code, trying to add meaning to a newline character is a bad strategy
and creates some unnatural things. To some extent this overlaps with
Namhyung's aggregation cleanup. There are also weirdnesses in
jevents/pmu-events, like the same ScaleUnit applying to a metric and
an event - why are metrics even parts of events?

Given the current code is wac-a-mole, and this is another whack, if
the testing is okay I think we should move forward with this change.

Thanks,
Ian

Hi Ian,

Thanks for checking the patch.
Yes, CSV output test passes with the change.

perf stat CSV output linter : Ok
perf stat csv summary test  : Ok

Thanks
Athira






Thanks
Athira


-

Re: [PATCH 0/4] tools/perf: Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K

2022-04-06 Thread Disha Goel


-Original Message-
From: Athira Rajeev 
To: a...@kernel.org, jo...@kernel.org, disg...@linux.vnet.ibm.com
Cc: m...@ellerman.id.au, linux-perf-us...@vger.kernel.org, 
linuxppc-dev@lists.ozlabs.org, ma...@linux.vnet.ibm.com, 
rnsas...@linux.ibm.com, kj...@linux.ibm.com
Subject: [PATCH 0/4] tools/perf: Fix perf bench numa, futex and epoll
to work with machines having #CPUs > 1K
Date: Sat,  2 Apr 2022 00:28:49 +0530

The perf benchmark for collections: numa, futex and epollhits failure
in system configuration with CPU's more than 1024.These benchmarks uses
"sched_getaffinity" and "sched_setaffinity"in the code to work with
affinity.
Example snippet from numa benchmark:<<>>perf: bench/numa.c:302:
bind_to_node: Assertion `!(ret)' failed.Aborted (core dumped)<<>>
bind_to_node function uses "sched_getaffinity" to save the cpumask.This
fails with EINVAL because the default mask size in glibc is 1024.
Similarly in futex and epoll benchmark, uses sched_setaffinity
duringpthread_create with affinity. And since it returns EINVAL in such
systemconfiguration, benchmark doesn't run.
To overcome this 1024 CPUs mask size limitation of cpu_set_t,change the
mask size using the CPU_*_S macros ie, use CPU_ALLOC toallocate
cpumask, CPU_ALLOC_SIZE for size, CPU_SET_S to set mask bit.
Fix all the relevant places in the code to use mask size which is
largeenough to represent number of possible CPU's in the system.
Fix parse_setup_cpu_list function in numa bench to check if input CPUis
online before binding task to that CPU. This is to fix failures
where,though CPU number is within max CPU, it could happen that CPU is
offline.Here, sched_setaffinity will result in failure when using
cpumask havingthat cpu bit set in the mask.
Patch 1 and Patch 2 address fix for perf bench futex and perf
benchepoll benchmark. Patch 3 and Patch 4 address fix in perf bench
numabenchmark
Athira Rajeev (4):  tools/perf: Fix perf bench futex to correct usage
of affinity formachines with #CPUs > 1K  tools/perf: Fix perf bench
epoll to correct usage of affinity formachines with #CPUs >
1K  tools/perf: Fix perf numa bench to fix usage of affinity for
machineswith #CPUs > 1K  tools/perf: Fix perf bench numa testcase
to check if CPU used to bindtask is online
Tesed the patches on powerpc with CPU > 1K and other configurations as
well, verified the perf benchmark numa, futex and epoll for whole
series with the patch set.Tested-by: Disha Goel <
disg...@linux.vnet.ibm.com>
 tools/perf/bench/epoll-ctl.c   |  25 --
tools/perf/bench/epoll-wait.c  |  25 --
tools/perf/bench/futex-hash.c  |  26 --
tools/perf/bench/futex-lock-pi.c   |  21 +++--
tools/perf/bench/futex-requeue.c   |  21 +++--
tools/perf/bench/futex-wake-parallel.c |  21 +++--
tools/perf/bench/futex-wake.c  |  22 +++--
tools/perf/bench/numa.c| 117 ++---
tools/perf/util/header.c   |  43 +
tools/perf/util/header.h   |   1 + 10 files changed, 252
insertions(+), 70 deletions(-)



Re: [PATCH V3 0/2] Fix perf bench numa to work with machines having #CPUs > 1K

2022-04-13 Thread Disha Goel


-Original Message-
From: Athira Rajeev 
To: a...@kernel.org, jo...@kernel.org, disg...@linux.vnet.ibm.com
Cc: m...@ellerman.id.au, linux-perf-us...@vger.kernel.org, 
linuxppc-dev@lists.ozlabs.org, ma...@linux.vnet.ibm.com, 
rnsas...@linux.ibm.com, kj...@linux.ibm.com, 
linux-ker...@vger.kernel.org, sri...@linux.vnet.ibm.com, 
irog...@google.com
Subject: [PATCH V3 0/2] Fix perf bench numa to work with machines
having #CPUs > 1K
Date: Tue, 12 Apr 2022 22:10:57 +0530

The perf benchmark for collections: numa hits failure in
systemconfiguration with CPU's more than 1024. These benchmarks
uses"sched_getaffinity" and "sched_setaffinity" in the code towork with
affinity.
Example snippet from numa benchmark:<<>>perf: bench/numa.c:302:
bind_to_node: Assertion `!(ret)' failed.Aborted (core dumped)<<>>
bind_to_node function uses "sched_getaffinity" to save the cpumask.This
fails with EINVAL because the default mask size in glibc is 1024
To overcome this 1024 CPUs mask size limitation of cpu_set_t,change the
mask size using the CPU_*_S macros ie, use CPU_ALLOC toallocate
cpumask, CPU_ALLOC_SIZE for size, CPU_SET_S to set mask bit.
Fix all the relevant places in the code to use mask size which is
largeenough to represent number of possible CPU's in the system.
This patchset also address a fix for parse_setup_cpu_list function
innuma bench to check if input CPU is online before binding task tothat
CPU. This is to fix failures where, though CPU number is withinmax CPU,
it could happen that CPU is offline. Here, sched_setaffinitywill result
in failure when using cpumask having that cpu bit setin the mask.
Patch 1 address fix for parse_setup_cpu_list to check if CPU used to
bindtask is online. Patch 2 has fix for bench numa to work with
machineshaving #CPUs > 1K
Athira Rajeev (2):  tools/perf: Fix perf bench numa testcase to check
if CPU used to bindtask is online  perf bench: Fix numa bench to
fix usage of affinity for machines with#CPUs > 1K
Changelog:v2 -> v3Link to the v2 version :
https://lore.kernel.org/all/20220406175113.87881-1-atraj...@linux.vnet.ibm.com/
 - From the v2 version, patch 1 and patch 2 are now part of upstream. -
This v3 version separates patch 3 and patch 4 to address
review   comments from arnaldo which includes using sysfs__read_str for
reading   sysfs file and fixing the compilation issues observed in
debian
Tesed the patches on powerpc with CPU > 1K and other configurations as
well, verified the perf bench numa with the patch set.Tested-by: Disha
Goel 
 tools/perf/bench/numa.c  | 136 +--
tools/perf/util/header.c |  51 +++ tools/perf/util/header.h
|   1 + 3 files changed, 153 insertions(+), 35 deletions(-)



Re: [PATCH 0/2] Fix session topology test for powerpc and add utility function to get cpuinfo entries

2022-04-29 Thread Disha Goel


-Original Message-
From: Athira Rajeev 
To: a...@kernel.org, jo...@kernel.org, disg...@linux.vnet.ibm.com
Cc: m...@ellerman.id.au, linux-perf-us...@vger.kernel.org, 
linuxppc-dev@lists.ozlabs.org, ma...@linux.vnet.ibm.com, 
rnsas...@linux.ibm.com, kj...@linux.ibm.com, irog...@google.com
Subject: [PATCH 0/2] Fix session topology test for powerpc and add
utility function to get cpuinfo entries
Date: Thu, 28 Apr 2022 20:38:27 +0530

The session topology test fails in powerpc pSeries platform.Test
logs:<<>>Session topology : FAILED!<<>>
This test uses cpu topology information and in powerpc,some of the
topology info is restricted in environmentlike virtualized platform.
Hence this test needs to beskipped in pSeries platform for powerpc. The
informationabout platform is available in /proc/cpuinfo.
Patch 1 adds generic utility function in "util/header.c"to read
/proc/cpuinfo for any entry. Though the testcasefix needs value from
"platform" entry, making this as ageneric function to return value for
any entry from the/proc/cpuinfo file which can be used commonly in
futureusecases.
Patch 2 uses the newly added utility function to look forplatform and
skip the test in pSeries platform for powerpc.
Athira Rajeev (2):  tools/perf: Add utility function to read
/proc/cpuinfo for any field  tools/perf/tests: Fix session topology
test to skip the test in guestenvironment
Tested the patches on powerpc and powernv, verified perf test session
topology test with the patch set.Tested-by: Disha Goel <
disg...@linux.vnet.ibm.com>
 tools/perf/tests/topology.c | 17 
tools/perf/util/header.c| 54 +
tools/perf/util/header.h|  1 + 3 files changed, 72 insertions(+)



Re: [PATCH V2 0/2] Fix session topology test for powerpc and add utility function to get cpuinfo entries

2022-05-05 Thread Disha Goel


-Original Message-
From: Athira Rajeev 
To: a...@kernel.org, jo...@kernel.org, disg...@linux.vnet.ibm.com
Cc: m...@ellerman.id.au, linux-perf-us...@vger.kernel.org, 
linuxppc-dev@lists.ozlabs.org, ma...@linux.vnet.ibm.com, 
rnsas...@linux.ibm.com, kj...@linux.ibm.com, irog...@google.com
Subject: [PATCH V2 0/2]  Fix session topology test for powerpc and add
utility function to get cpuinfo entries
Date: Thu,  5 May 2022 15:09:58 +0530

The session topology test fails in powerpc pSeries platform.Test
logs:<<>>Session topology : FAILED!<<>>
This test uses cpu topology information and in powerpc,some of the
topology info is restricted in environmentlike virtualized platform.
Hence this test needs to beskipped in pSeries platform for powerpc. The
informationabout platform is available in /proc/cpuinfo.
Patch 1 adds generic utility function in "util/header.c"to read
/proc/cpuinfo for any entry. Though the testcasefix needs value from
"platform" entry, making this as ageneric function to return value for
any entry from the/proc/cpuinfo file which can be used commonly in
futureusecases.
Patch 2 uses the newly added utility function to look forplatform and
skip the test in pSeries platform for powerpc.
Athira Rajeev (2):  tools/perf: Add utility function to read
/proc/cpuinfo for any field  tools/perf/tests: Fix session topology
test to skip the test in guestenvironment
Changelog: V1 -> v2: Addressed review comments from Kajol. Use "strim"
to remove space from string. Also use "feof" to check for EOF instead
of using new variable "ret".
Tested the patches on powervm and powernv, verified perf test session
topology test with the patch set.Tested-by: Disha Goel <
disg...@linux.vnet.ibm.com>
 tools/perf/tests/topology.c | 17 
tools/perf/util/header.c| 53 +
tools/perf/util/header.h|  1 + 3 files changed, 71 insertions(+)


Re: [PATCH V2] tools/perf/tests: Skip perf BPF test if clang is not present

2022-05-13 Thread Disha Goel


-Original Message-
From: Athira Rajeev 
To: a...@kernel.org, jo...@kernel.org
Cc: m...@ellerman.id.au, linux-perf-us...@vger.kernel.org, 
linuxppc-dev@lists.ozlabs.org, ma...@linux.vnet.ibm.com, 
rnsas...@linux.ibm.com, kj...@linux.ibm.com, disg...@linux.vnet.ibm.com
, irog...@google.com
Subject: [PATCH V2] tools/perf/tests: Skip perf BPF test if clang is
not present
Date: Wed, 11 May 2022 17:24:38 +0530

Perf BPF filter test fails in environment where "clang"is not
installed.
Test failure logs:
<<>> 42: BPF filter: 42.1: Basic BPF
filtering : Skip 42.2: BPF pinning : FAILED!
42.3: BPF prologue generation : FAILED!<<>>
Enabling verbose option provided debug logs which saysclang/llvm needs
to be installed. Snippet of verbose logs:
<<>> 42.2: BPF pinning  : --- start ---test child
forked, pid 61423ERROR: unable to find clang.Hint:  Try to install
latest clang/llvm to support BPF.Check your $PATH
<>
Failed to compile test case: 'Basic BPF llvm compile'Unable to get BPF
object, fix kbuild firsttest child finished with -1  end BPF
filter subtest 2: FAILED!<<>>
Here subtests, "BPF pinning" and "BPF prologue generation"failed and
logs shows clang/llvm is needed. After installingclang, testcase
passes.
Reason on why subtest failure happens though logs has properdebug
information:Main function __test__bpf calls test_llvm__fetch_bpf_obj
bypassing 4th argument as true ( 4th arguments maps to parameter"force"
in test_llvm__fetch_bpf_obj ). But this will
causetest_llvm__fetch_bpf_obj to skip the check for clang/llvm.
Snippet of code part which checks for clang based onparameter "force"
in test_llvm__fetch_bpf_obj:
<<>>if (!force && (!llvm_param.user_set_param &&<<>>
Since force is set to "false", test won't get skipped andfails to
compile test case. The BPF code compilation needsclang, So pass the
fourth argument as "false" and also skipthe test if reason for return
is "TEST_SKIP"
After the patch:
<<>> 42: BPF filter: 42.1: Basic BPF
filtering : Skip 42.2: BPF pinning : Skip 42.3:
BPF prologue generation : Skip<<>>
Fixes: ba1fae431e74 ("perf test: Add 'perf test BPF'")Signed-off-by:
Athira Rajeev ---Changelog: Addressed
review comments from Arnaldo by adding reason for skipping of testcase.
 tools/perf/tests/bpf.c | 10 ++ 1 file changed, 6
insertions(+), 4 deletions(-)
Tested the patch on powerpc and powernv, perf BPF test works fine with
the patch.Tested-by: Disha Goel 
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.cindex
57b9591f7cbb..17c023823713 100644--- a/tools/perf/tests/bpf.c+++
b/tools/perf/tests/bpf.c@@ -222,11 +222,11 @@ static int
__test__bpf(int idx)
ret = test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz,   
   bpf_testcase_table[idx].prog_id,-
   true, NULL);+   
false, NULL);   if (ret != TEST_OK || !obj_buf || !obj_buf_sz) {
pr_debug("Unable to get BPF object, %s\n",  
 bpf_testcase_table[idx].msg_compile_fail);-if (idx == 0)+  
if ((idx == 0) || (ret == TEST_SKIP))   return
TEST_SKIP;  elsereturn TEST_FAIL;@@
-364,9 +364,11 @@ static int test__bpf_prologue_test(struct test_suite
*test __maybe_unused, static struct test_case bpf_tests[] = { #ifdef
HAVE_LIBBPF_SUPPORT TEST_CASE("Basic BPF filtering",
basic_bpf_test),-   TEST_CASE("BPF pinning", bpf_pinning),+ TEST_CA
SE_REASON("BPF pinning", bpf_pinning,+  "clang isn't
installed or environment missing BPF support"), #ifdef
HAVE_BPF_PROLOGUE-  TEST_CASE("BPF prologue generation",
bpf_prologue_test),+TEST_CASE_REASON("BPF prologue generation",
bpf_prologue_test,+ "clang isn't installed or
environment missing BPF support"), #elseTEST_CASE_REASON("BPF
prologue generation", bpf_prologue_test, "not compiled in"), #endif


Re: [PATCH V2] tools/perf/tests: Fix session topology test to skip the test in guest environment

2022-05-13 Thread Disha Goel


-Original Message-
From: Athira Rajeev 
To: a...@kernel.org, jo...@kernel.org
Cc: m...@ellerman.id.au, linux-perf-us...@vger.kernel.org, 
linuxppc-dev@lists.ozlabs.org, ma...@linux.vnet.ibm.com, 
rnsas...@linux.ibm.com, kj...@linux.ibm.com, disg...@linux.vnet.ibm.com
, irog...@google.com
Subject: [PATCH V2] tools/perf/tests: Fix session topology test to skip
the test in guest environment
Date: Wed, 11 May 2022 17:19:59 +0530

The session topology test fails in powerpc pSeries platform.Test
logs:<<>>Session topology : FAILED!<<>>
This testcases tests cpu topology by checking the core_id andsocket_id
stored in perf_env from perf session. The data fromperf session is
compared with the cpu topology informationfrom
"/sys/devices/system/cpu/cpuX/topology" like
core_id,physical_package_id. In case of virtual environment, detaillike
physical_package_id is restricted to be exposed.
Hencephysical_package_id is set to -1. The testcase fails on
suchplatforms since socket_id can't be fetched from topology info.
Skip the testcase in powerpc if physical_package_id returns -1
Signed-off-by: Athira Rajeev ---
Changelog:v1 -> v2: Addressed review comments from Arnaldo and Michael
Ellerman. skip the test in powerpc when physical_package_id is set to
-1. Dropped patch 1 from V1 since current change doesn't use info from
/proc/cpuinfo and rather uses physical_package_id value.
 tools/perf/tests/topology.c | 11 +++ 1 file changed, 11
insertions(+)
Tested the patch on powerpc and powernv, session topology test works
fine with the patch.Tested-by: Disha Goel 
diff --git a/tools/perf/tests/topology.c
b/tools/perf/tests/topology.cindex ee1e3dcbc0bd..d23a9e322ff5 100644---
a/tools/perf/tests/topology.c+++ b/tools/perf/tests/topology.c@@ -109,6
+109,17 @@ static int check_cpu_topology(char *path, struct
perf_cpu_map *map)  && strncmp(session-
>header.env.arch, "aarch64", 7))return TEST_SKIP;
+   /*+  * In powerpc pSeries platform, not all the topology
information+ * are exposed via sysfs. Due to restriction, detail
like+* physical_package_id will be set to -1. Hence skip this+  
 * test if physical_package_id returns -1 for cpu from perf_cpu_map.+   
 */+if (strncmp(session->header.env.arch, "powerpc", 7)) {+ 
if (cpu__get_socket_id(perf_cpu_map__cpu(map, 0)) == -1)+   
return TEST_SKIP;+  }+  TEST_ASSERT_VAL("Session header
CPU map not set", session->header.env.cpu);
for (i = 0; i < session->header.env.nr_cpus_avail; i++) {


Re: [PATCH 1/3] tools/perf: Fix the nrcpus in perf bench futex to enable the run when all CPU's are not online

2024-06-10 Thread Disha Goel

On 07/06/24 10:13 am, Athira Rajeev wrote:


Perf bench futex fails as below when attempted to run on
on a powerpc system:

  ./perf bench futex all
  Running futex/hash benchmark...
Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes 
for 10 secs.

perf: pthread_create: No such file or directory

In the setup where this perf bench was ran, difference was that
partition had 640 CPU's, but not all CPUs were online. 80 CPUs
were online. While blocking the threads with futex_wait, code
sets the affinity using cpumask. The cpumask size used is 80
which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the
benchmark reports fail while setting affinity for cpu number which
is greater than 80 or higher, because it attempts to set a bit
position which is not allocated on the cpumask. Fix this by changing
the size of cpumask to number of possible cpus and not the number
of online cpus.

Signed-off-by: Athira Rajeev 


Thanks for the fix patches, Athira.
I have tested all three patches on a power machine (both small and max config),
and the perf bench futex and epoll tests run fine.

For the series:
Tested-by: Disha Goel 


---
  tools/perf/bench/futex-hash.c  | 2 +-
  tools/perf/bench/futex-lock-pi.c   | 2 +-
  tools/perf/bench/futex-requeue.c   | 2 +-
  tools/perf/bench/futex-wake-parallel.c | 2 +-
  tools/perf/bench/futex-wake.c  | 2 +-
  5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 0c69d20efa32..b472eded521b 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -174,7 +174,7 @@ int bench_futex_hash(int argc, const char **argv)
pthread_attr_init(&thread_attr);
gettimeofday(&bench__start, NULL);
  
-	nrcpus = perf_cpu_map__nr(cpu);

+   nrcpus = cpu__max_cpu().cpu;
cpuset = CPU_ALLOC(nrcpus);
BUG_ON(!cpuset);
size = CPU_ALLOC_SIZE(nrcpus);
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
index 7a4973346180..0416120c091b 100644
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -122,7 +122,7 @@ static void create_threads(struct worker *w, struct 
perf_cpu_map *cpu)
  {
cpu_set_t *cpuset;
unsigned int i;
-   int nrcpus =  perf_cpu_map__nr(cpu);
+   int nrcpus =  cpu__max_cpu().cpu;
size_t size;
  
  	threads_starting = params.nthreads;

diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
index d9ad736c1a3e..aad5bfc4fe18 100644
--- a/tools/perf/bench/futex-requeue.c
+++ b/tools/perf/bench/futex-requeue.c
@@ -125,7 +125,7 @@ static void block_threads(pthread_t *w, struct perf_cpu_map 
*cpu)
  {
cpu_set_t *cpuset;
unsigned int i;
-   int nrcpus = perf_cpu_map__nr(cpu);
+   int nrcpus = cpu__max_cpu().cpu;
size_t size;
  
  	threads_starting = params.nthreads;

diff --git a/tools/perf/bench/futex-wake-parallel.c 
b/tools/perf/bench/futex-wake-parallel.c
index b66df553e561..90a5b91bf139 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -149,7 +149,7 @@ static void block_threads(pthread_t *w, struct perf_cpu_map 
*cpu)
  {
cpu_set_t *cpuset;
unsigned int i;
-   int nrcpus = perf_cpu_map__nr(cpu);
+   int nrcpus = cpu__max_cpu().cpu;
size_t size;
  
  	threads_starting = params.nthreads;

diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
index 690fd6d3da13..49b3c89b0b35 100644
--- a/tools/perf/bench/futex-wake.c
+++ b/tools/perf/bench/futex-wake.c
@@ -100,7 +100,7 @@ static void block_threads(pthread_t *w, struct perf_cpu_map 
*cpu)
cpu_set_t *cpuset;
unsigned int i;
size_t size;
-   int nrcpus = perf_cpu_map__nr(cpu);
+   int nrcpus = cpu__max_cpu().cpu;
threads_starting = params.nthreads;
  
  	cpuset = CPU_ALLOC(nrcpus);


Re: [PATCH] perf vendor events power10: Update JSON/events

2024-07-23 Thread Disha Goel

On 23/07/24 10:51 am, Kajol Jain wrote:


Update JSON/events for power10 platform with additional events.
Also move PM_VECTOR_LD_CMPL event from others.json to
frontend.json file.

Signed-off-by: Kajol Jain 


I have tested the patch on power10 machine. Looks good to me.

Tested-by: Disha Goel 


---
  .../arch/powerpc/power10/frontend.json|   5 +
  .../arch/powerpc/power10/others.json  | 100 +-
  2 files changed, 100 insertions(+), 5 deletions(-)

diff --git a/tools/perf/pmu-events/arch/powerpc/power10/frontend.json 
b/tools/perf/pmu-events/arch/powerpc/power10/frontend.json
index 5977f5e64212..53660c279286 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/frontend.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/frontend.json
@@ -74,6 +74,11 @@
  "EventName": "PM_ISSUE_KILL",
  "BriefDescription": "Cycles in which an instruction or group of instructions 
were cancelled after being issued. This event increments once per occurrence, regardless of how 
many instructions are included in the issue group."
},
+  {
+"EventCode": "0x44054",
+"EventName": "PM_VECTOR_LD_CMPL",
+"BriefDescription": "Vector load instruction completed."
+  },
{
  "EventCode": "0x44056",
  "EventName": "PM_VECTOR_ST_CMPL",
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/others.json 
b/tools/perf/pmu-events/arch/powerpc/power10/others.json
index fcf8a8ebe7bd..53ca610152fa 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/others.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/others.json
@@ -94,11 +94,6 @@
  "EventName": "PM_L1_ICACHE_RELOADED_ALL",
  "BriefDescription": "Counts all instruction cache reloads includes demand, 
prefetch, prefetch turned into demand and demand turned into prefetch."
},
-  {
-"EventCode": "0x44054",
-"EventName": "PM_VECTOR_LD_CMPL",
-"BriefDescription": "Vector load instruction completed."
-  },
{
  "EventCode": "0x4D05E",
  "EventName": "PM_BR_CMPL",
@@ -108,5 +103,100 @@
  "EventCode": "0x400F0",
  "EventName": "PM_LD_DEMAND_MISS_L1_FIN",
  "BriefDescription": "Load missed L1, counted at finish time."
+  },
+  {
+"EventCode": "0x0038BC",
+"EventName": "PM_ISYNC_CMPL",
+"BriefDescription": "Isync completion count per thread."
+  },
+  {
+"EventCode": "0x00C088",
+"EventName": "PM_LD0_32B_FIN",
+"BriefDescription": "256-bit load finished in the LD0 load execution unit."
+  },
+  {
+"EventCode": "0x00C888",
+"EventName": "PM_LD1_32B_FIN",
+"BriefDescription": "256-bit load finished in the LD1 load execution unit."
+  },
+  {
+"EventCode": "0x00C090",
+"EventName": "PM_LD0_UNALIGNED_FIN",
+"BriefDescription": "Load instructions in LD0 port that are either unaligned, 
or treated as unaligned and require an additional recycle through the pipeline using the load 
gather buffer. This typically adds about 10 cycles to the latency of the instruction. This includes 
loads that cross the 128 byte boundary, octword loads that are not aligned, and a special forward 
progress case of a load that does not hit in the L1 and crosses the 32 byte boundary and is 
launched NTC. Counted at finish time."
+  },
+  {
+"EventCode": "0x00C890",
+"EventName": "PM_LD1_UNALIGNED_FIN",
+"BriefDescription": "Load instructions in LD1 port that are either unaligned, 
or treated as unaligned and require an additional recycle through the pipeline using the load 
gather buffer. This typically adds about 10 cycles to the latency of the instruction. This includes 
loads that cross the 128 byte boundary, octword loads that are not aligned, and a special forward 
progress case of a load that does not hit in the L1 and crosses the 32 byte boundary and is 
launched NTC. Counted at finish time."
+  },
+  {
+"EventCode": "0x00C0A4",
+"EventName": "PM_ST0_UNALIGNED_FIN",
+"BriefDescription": "Store instructions in ST0 port that are either unaligned, 
or treated as unaligned and require an additional recycle through the pipeline. This typically adds 
about 10 cycles to the latency of the instruction. This only includes stores that cross the 128 
byte boundary. Counted at finish time."
+  },
+  {
+"EventCode

Re: [PATCH V2 1/2] tools/perf/pmu-events/powerpc: Add support for compat events in json

2024-10-17 Thread Disha Goel

On 10/10/24 8:21 pm, Athira Rajeev wrote:


perf list picks the events supported for specific platform
from pmu-events/arch/powerpc/. Example power10 events
are in pmu-events/arch/powerpc/power10, power9 events are part
of pmu-events/arch/powerpc/power9. The decision of which
platform to pick is determined based on PVR value in powerpc.
The PVR value is matched from pmu-events/arch/powerpc/mapfile.csv

Example:

Format:
 PVR,Version,JSON/file/pathname,Type

0x004[bcd][[:xdigit:]]{4},1,power8,core
0x0066[[:xdigit:]]{4},1,power8,core
0x004e[[:xdigit:]]{4},1,power9,core
0x0080[[:xdigit:]]{4},1,power10,core
0x0082[[:xdigit:]]{4},1,power10,core

The code gets the PVR from system using get_cpuid_str function
in arch/powerpc/util/headers.c ( from SPRN_PVR ) and compares
with value from mapfile.csv

In case of compat mode, say when partition is booted in a power9
mode when the system is a power10, add an entry to pick the
ISA architected events from "pmu-events/arch/powerpc/compat".
Add json file generic-events.json which will contain these
events which is supported in compat mode.

Suggested-by: Madhavan Srinivasan 
Signed-off-by: Athira Rajeev 


I have tested the patch on PowerPC by booting in different compat modes and
at base also, and it's working as expected.

For the series:
Tested-by: Disha Goel


---

  .../arch/powerpc/compat/generic-events.json   | 117 ++
  .../perf/pmu-events/arch/powerpc/mapfile.csv  |   1 +
  2 files changed, 118 insertions(+)
  create mode 100644 
tools/perf/pmu-events/arch/powerpc/compat/generic-events.json

diff --git a/tools/perf/pmu-events/arch/powerpc/compat/generic-events.json 
b/tools/perf/pmu-events/arch/powerpc/compat/generic-events.json
new file mode 100644
index ..6f5e8efcb098
--- /dev/null
+++ b/tools/perf/pmu-events/arch/powerpc/compat/generic-events.json
@@ -0,0 +1,117 @@
+[
+  {
+"EventCode": "0x600F4",
+"EventName": "PM_CYC",
+"BriefDescription": "Processor cycles."
+  },
+  {
+"EventCode": "0x100F2",
+"EventName": "PM_CYC_INST_CMPL",
+"BriefDescription": "1 or more ppc insts finished"
+  },
+  {
+"EventCode": "0x100f4",
+"EventName": "PM_FLOP_CMPL",
+"BriefDescription": "Floating Point Operations Finished."
+  },
+  {
+"EventCode": "0x100F6",
+"EventName": "PM_L1_ITLB_MISS",
+"BriefDescription": "Number of I-ERAT reloads."
+  },
+  {
+"EventCode": "0x100F8",
+"EventName": "PM_NO_INST_AVAIL",
+"BriefDescription": "Number of cycles the ICT has no itags assigned to this 
thread."
+  },
+  {
+"EventCode": "0x100fc",
+"EventName": "PM_LD_CMPL",
+"BriefDescription": "Load instruction completed."
+  },
+  {
+"EventCode": "0x200F0",
+"EventName": "PM_ST_CMPL",
+"BriefDescription": "Stores completed from S2Q (2nd-level store queue)."
+  },
+  {
+"EventCode": "0x200F2",
+"EventName": "PM_INST_DISP",
+"BriefDescription": "PowerPC instruction dispatched."
+  },
+  {
+"EventCode": "0x200F4",
+"EventName": "PM_RUN_CYC",
+"BriefDescription": "Processor cycles gated by the run latch."
+  },
+  {
+"EventCode": "0x200F6",
+"EventName": "PM_L1_DTLB_RELOAD",
+"BriefDescription": "DERAT Reloaded due to a DERAT miss."
+  },
+  {
+"EventCode": "0x200FA",
+"EventName": "PM_BR_TAKEN_CMPL",
+"BriefDescription": "Branch Taken instruction completed."
+  },
+  {
+"EventCode": "0x200FC",
+"EventName": "PM_L1_ICACHE_MISS",
+"BriefDescription": "Demand instruction cache miss."
+  },
+  {
+"EventCode": "0x200FE",
+"EventName": "PM_L1_RELOAD_FROM_MEM",
+"BriefDescription": "L1 Dcache reload from memory"
+  },
+  {
+"EventCode": "0x300F0",
+"EventName": "PM_ST_MISS_L1",
+"BriefDescription": "Store Missed L1"
+  },
+  {
+"EventCode": "0x300FC",
+"EventName": "PM_DTLB_MISS",
+"BriefDescription": "Data PTEG reload"
+  },
+  {
+"EventCode": "0x300FE",
+"EventName": "PM_DATA_FROM_L3MISS",
+"BriefDescription": "Demand LD - L3 Mi

Re: [PATCH V2 2/5] tools/testing/selftests/powerpc: Add check for power11 pvr for pmu selfests

2025-01-15 Thread Disha Goel

On 13/01/25 1:28 pm, Athira Rajeev wrote:

Some of the tests depends on pvr value to choose
the event. Example:
- event_alternatives_tests_p10: alternative event depends
  on registered PMU driver which is based on pvr
- generic_events_valid_test varies based on platform
- bhrb_filter_map_test: again its dependent on pmu to
  decide which bhrb filter to use
- reserved_bits_mmcra_sample_elig_mode: randome sampling
  mode reserved bits is also varies based on platform

Signed-off-by: Athira Rajeev 

I have tested the patches on PowerPC by compiling and running pmu selftest.

For the series:
Tested-by: Disha Goel

---
Changelog:
 v1 -> v2
 No code changes. Rebased to latest upstream

 .../pmu/event_code_tests/event_alternatives_tests_p10.c| 3 ++-
 .../pmu/event_code_tests/generic_events_valid_test.c   | 3 ++-
 .../reserved_bits_mmcra_sample_elig_mode_test.c| 3 ++-
 .../powerpc/pmu/sampling_tests/bhrb_filter_map_test.c  | 7 +--
 4 files changed, 11 insertions(+), 5 deletions(-)

diff --git 
a/tools/testing/selftests/powerpc/pmu/event_code_tests/event_alternatives_tests_p10.c
 
b/tools/testing/selftests/powerpc/pmu/event_code_tests/event_alternatives_tests_p10.c
index 8be7aada6523..355f8bbe06c3 100644
--- 
a/tools/testing/selftests/powerpc/pmu/event_code_tests/event_alternatives_tests_p10.c
+++ 
b/tools/testing/selftests/powerpc/pmu/event_code_tests/event_alternatives_tests_p10.c
@@ -26,6 +26,7 @@ static int event_alternatives_tests_p10(void)
 {
struct event *e, events[5];
int i;
+   int pvr = PVR_VER(mfspr(SPRN_PVR));

/* Check for platform support for the test */
SKIP_IF(platform_check_for_tests());
@@ -36,7 +37,7 @@ static int event_alternatives_tests_p10(void)
 * code and using PVR will work correctly for all cases
 * including generic compat mode.
 */
-   SKIP_IF(PVR_VER(mfspr(SPRN_PVR)) != POWER10);
+   SKIP_IF((pvr != POWER10) && (pvr != POWER11));

SKIP_IF(check_for_generic_compat_pmu());

diff --git 
a/tools/testing/selftests/powerpc/pmu/event_code_tests/generic_events_valid_test.c
 
b/tools/testing/selftests/powerpc/pmu/event_code_tests/generic_events_valid_test.c
index 0d237c15d3f2..a378fa9a5a7b 100644
--- 
a/tools/testing/selftests/powerpc/pmu/event_code_tests/generic_events_valid_test.c
+++ 
b/tools/testing/selftests/powerpc/pmu/event_code_tests/generic_events_valid_test.c
@@ -17,6 +17,7 @@
 static int generic_events_valid_test(void)
 {
struct event event;
+   int pvr = mfspr(SPRN_PVR);

/* Check for platform support for the test */
SKIP_IF(platform_check_for_tests());
@@ -31,7 +32,7 @@ static int generic_events_valid_test(void)
 * - PERF_COUNT_HW_STALLED_CYCLES_BACKEND
 * - PERF_COUNT_HW_REF_CPU_CYCLES
 */
-   if (PVR_VER(mfspr(SPRN_PVR)) == POWER10) {
+   if ((pvr == POWER10) || (pvr == POWER11)) {
event_init_opts(&event, PERF_COUNT_HW_CPU_CYCLES, PERF_TYPE_HARDWARE, 
"event");
FAIL_IF(event_open(&event));
event_close(&event);
diff --git 
a/tools/testing/selftests/powerpc/pmu/event_code_tests/reserved_bits_mmcra_sample_elig_mode_test.c
 
b/tools/testing/selftests/powerpc/pmu/event_code_tests/reserved_bits_mmcra_sample_elig_mode_test.c
index 4c119c821b99..7bb26a232fbe 100644
--- 
a/tools/testing/selftests/powerpc/pmu/event_code_tests/reserved_bits_mmcra_sample_elig_mode_test.c
+++ 
b/tools/testing/selftests/powerpc/pmu/event_code_tests/reserved_bits_mmcra_sample_elig_mode_test.c
@@ -21,6 +21,7 @@
 static int reserved_bits_mmcra_sample_elig_mode(void)
 {
struct event event;
+   int pvr = PVR_VER(mfspr(SPRN_PVR));

/* Check for platform support for the test */
SKIP_IF(platform_check_for_tests());
@@ -59,7 +60,7 @@ static int reserved_bits_mmcra_sample_elig_mode(void)
 * is reserved in power10 and 0xC is reserved in
 * power9.
 */
-   if (PVR_VER(mfspr(SPRN_PVR)) == POWER10) {
+   if ((pvr == POWER10) || (pvr == POWER11)) {
event_init(&event, 0x100401e0);
FAIL_IF(!event_open(&event));
} else if (PVR_VER(mfspr(SPRN_PVR)) == POWER9) {
diff --git 
a/tools/testing/selftests/powerpc/pmu/sampling_tests/bhrb_filter_map_test.c 
b/tools/testing/selftests/powerpc/pmu/sampling_tests/bhrb_filter_map_test.c
index 3f43c315c666..64ab9784f9b1 100644
--- a/tools/testing/selftests/powerpc/pmu/sampling_tests/bhrb_filter_map_test.c
+++ b/tools/testing/selftests/powerpc/pmu/sampling_tests/bhrb_filter_map_test.c
@@ -83,13 +83,16 @@ static int bhrb_filter_map_test(void)
 * using PVR will work correctly for all cases including generic
 * compat mode.
 */
-   if (PVR_VER(mfspr(SPRN_PVR)) == POWER10) {
+   switch (PVR_VER(mfspr(SPRN_PVR))) {
+   case POWER11:
+   case POWER10:
for (i = 0; i &