Hello Dhananjay,

On 6/24/2024 11:28 AM, Dhananjay Ugwekar wrote:
Currently the energy-cores event in the power PMU aggregates energy
consumption data at a package level. On the other hand the core energy
RAPL counter in AMD CPUs has a core scope (which means the energy
consumption is recorded separately for each core). Earlier efforts to add
the core event in the power PMU had failed [1], due to the difference in
the scope of these two events. Hence, there is a need for a new core scope
PMU.

This patchset adds a new "power_per_core" PMU alongside the existing
"power" PMU, which will be responsible for collecting the new
"energy-per-core" event.

Tested the package level and core level PMU counters with workloads
pinned to different CPUs.

Results with workload pinned to CPU 1 in Core 1 on an AMD Zen4 Genoa
machine:

$ perf stat -a --per-core -e power_per_core/energy-per-core/ -- sleep 1

  Performance counter stats for 'system wide':

S0-D0-C0         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C1         1          5.72 Joules power_per_core/energy-per-core/
S0-D0-C2         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C3         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C4         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C5         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C6         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C7         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C8         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C9         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C10        1          0.02 Joules power_per_core/energy-per-core/

Tested a bunch of scenarios on my 2P 3rd Generation EPYC server and this
time around I'm seeing the expected behavior. I'll leave some of the
scenarios I've tested below:

  $ for i in `seq 0 63`; do taskset -c $i loop & done
  $ sudo perf stat -a --per-core -e power_per_core/energy-per-core/ -- sleep 5

  S0-D0-C0              1              10.82 Joules power_per_core/energy-per-core/
  S0-D0-C1              1              10.87 Joules power_per_core/energy-per-core/
  S0-D0-C2              1              10.86 Joules power_per_core/energy-per-core/
  S0-D0-C3              1              10.89 Joules power_per_core/energy-per-core/
  S0-D0-C4              1              10.91 Joules power_per_core/energy-per-core/
  ...
  S0-D0-C63             1              11.03 Joules power_per_core/energy-per-core/
  S1-D1-C0              1               0.19 Joules power_per_core/energy-per-core/
  S1-D1-C1              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C2              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C3              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C4              1               0.00 Joules power_per_core/energy-per-core/
  ...
  S1-D1-C63             1               0.00 Joules power_per_core/energy-per-core/

  $ for i in `seq 64 127`; do taskset -c $i loop & done
  $ sudo perf stat -a --per-core -e power_per_core/energy-per-core/ -- sleep 5

  S0-D0-C0              1               0.17 Joules power_per_core/energy-per-core/
  S0-D0-C1              1               0.00 Joules power_per_core/energy-per-core/
  S0-D0-C2              1               0.00 Joules power_per_core/energy-per-core/
  S0-D0-C3              1               0.00 Joules power_per_core/energy-per-core/
  S0-D0-C4              1               0.00 Joules power_per_core/energy-per-core/
  ...
  S0-D0-C63             1               0.01 Joules power_per_core/energy-per-core/
  S1-D1-C0              1              10.51 Joules power_per_core/energy-per-core/
  S1-D1-C1              1              10.50 Joules power_per_core/energy-per-core/
  S1-D1-C2              1              10.52 Joules power_per_core/energy-per-core/
  S1-D1-C3              1              10.51 Joules power_per_core/energy-per-core/
  S1-D1-C4              1              10.51 Joules power_per_core/energy-per-core/
  ...
  S1-D1-C63             1              10.59 Joules power_per_core/energy-per-core/

  $ for i in `seq 0 15`; do taskset -c $i loop & done
  $ sudo perf stat -a --per-core -e power_per_core/energy-per-core/ -- sleep 5

  S0-D0-C0              1              11.16 Joules power_per_core/energy-per-core/
  S0-D0-C1              1              11.21 Joules power_per_core/energy-per-core/
  S0-D0-C2              1              11.20 Joules power_per_core/energy-per-core/
  S0-D0-C3              1              11.24 Joules power_per_core/energy-per-core/
  S0-D0-C4              1              11.25 Joules power_per_core/energy-per-core/
  S0-D0-C5              1              11.26 Joules power_per_core/energy-per-core/
  S0-D0-C6              1              11.25 Joules power_per_core/energy-per-core/
  S0-D0-C7              1              11.25 Joules power_per_core/energy-per-core/
  S0-D0-C8              1              11.42 Joules power_per_core/energy-per-core/
  S0-D0-C9              1              11.43 Joules power_per_core/energy-per-core/
  S0-D0-C10             1              11.47 Joules power_per_core/energy-per-core/
  S0-D0-C11             1              11.43 Joules power_per_core/energy-per-core/
  S0-D0-C12             1              11.44 Joules power_per_core/energy-per-core/
  S0-D0-C13             1              11.41 Joules power_per_core/energy-per-core/
  S0-D0-C14             1              11.40 Joules power_per_core/energy-per-core/
  S0-D0-C15             1              11.41 Joules power_per_core/energy-per-core/
  S0-D0-C16             1               0.33 Joules power_per_core/energy-per-core/
  ...
  S0-D0-C63             1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C0              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C1              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C2              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C3              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C4              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C5              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C6              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C7              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C8              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C9              1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C10             1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C11             1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C12             1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C13             1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C14             1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C15             1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C16             1               0.00 Joules power_per_core/energy-per-core/
  ...
  S1-D1-C63             1               0.01 Joules power_per_core/energy-per-core/

  $ for i in `seq 0 7` `seq 128 131` `seq 64 71` `seq 192 199`; do taskset -c $i loop & done
  $ sudo perf stat -a --per-core -e power_per_core/energy-per-core/ -- sleep 5

  S0-D0-C0              1              18.68 Joules power_per_core/energy-per-core/
  S0-D0-C1              1              18.20 Joules power_per_core/energy-per-core/
  S0-D0-C2              1              18.27 Joules power_per_core/energy-per-core/
  S0-D0-C3              1              18.41 Joules power_per_core/energy-per-core/
  S0-D0-C4              1              16.94 Joules power_per_core/energy-per-core/
  S0-D0-C5              1              16.95 Joules power_per_core/energy-per-core/
  S0-D0-C6              1              16.92 Joules power_per_core/energy-per-core/
  S0-D0-C7              1              16.94 Joules power_per_core/energy-per-core/
  S0-D0-C8              1               0.39 Joules power_per_core/energy-per-core/
  ...
  S0-D0-C63             1               0.00 Joules power_per_core/energy-per-core/
  S1-D1-C0              1              18.59 Joules power_per_core/energy-per-core/
  S1-D1-C1              1              18.39 Joules power_per_core/energy-per-core/
  S1-D1-C2              1              17.50 Joules power_per_core/energy-per-core/
  S1-D1-C3              1              18.29 Joules power_per_core/energy-per-core/
  S1-D1-C4              1              18.58 Joules power_per_core/energy-per-core/
  S1-D1-C5              1              17.62 Joules power_per_core/energy-per-core/
  S1-D1-C6              1              17.75 Joules power_per_core/energy-per-core/
  S1-D1-C7              1              17.53 Joules power_per_core/energy-per-core/
  S1-D1-C8              1               0.00 Joules power_per_core/energy-per-core/
  ...
  S1-D1-C63             1               0.00 Joules power_per_core/energy-per-core/

Unlike last time, each socket is reporting accurate values for all the
scenarios I've tried above.
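
For anyone who wants to cross-check runs like the ones above without
eyeballing 128 rows, here is a rough sketch that totals the per-core
readings by socket. The helper name sum_energy_per_socket is made up for
this mail and is not part of perf or of the patchset; it only assumes the
"S<socket>-D<die>-C<core> <cpus> <value> Joules" row layout shown above.

```shell
# Hypothetical helper (not part of perf): sum per-core energy by socket.
# Reads `perf stat --per-core` style rows on stdin, for example:
#   S0-D0-C1   1   5.72 Joules power_per_core/energy-per-core/
sum_energy_per_socket() {
    awk '
        # Only consider rows whose first field is a socket-die-core tag
        # and whose unit column is "Joules".
        $1 ~ /^S[0-9]+-D[0-9]+-C[0-9]+$/ && $4 == "Joules" {
            split($1, id, "-")       # id[1] is the socket tag, e.g. "S0"
            total[id[1]] += $3
        }
        END {
            for (s in total)
                printf "%s %.2f Joules\n", s, total[s]
        }
    ' | sort                         # awk iteration order is unspecified
}
```

perf stat prints its counters on stderr by default, so a run would look
something like:

  $ sudo perf stat -a --per-core -e power_per_core/energy-per-core/ -- sleep 5 2>&1 | sum_energy_per_socket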


[1]: https://lore.kernel.org/lkml/3e766f0e-37d4-0f82-3868-31b142288...@linux.intel.com/

This patchset applies cleanly on top of v6.10-rc4 as well as latest
tip/master.

P.S. I tested this on top of v6.10-rc4 this time around.

Tested-by: K Prateek Nayak <kprateek.na...@amd.com>


v3 changes:
* Patch 1 added to introduce the logical_core_id which is unique across
   the system (Prateek)
* Use the unique topology_logical_core_id() instead of
   topology_core_id() (which is only unique within a package on tested
   AMD and Intel systems) in Patch 10
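
To illustrate the uniqueness point in the v3 notes: if core_id values
repeat in every package, a system-wide identifier has to take the package
into account. The sketch below numbers each distinct (package_id, core_id)
pair in first-seen order. It is only an illustration of the idea; the
function name assign_logical_core_ids is hypothetical, and this is not
claimed to be how Patch 1 computes topology_logical_core_id().

```shell
# Hypothetical illustration: give every distinct (package_id, core_id)
# pair a system-wide index, in first-seen order.
# Input:  one "package_id core_id" pair per line (one line per CPU).
# Output: "package_id core_id logical_id", once per distinct pair.
assign_logical_core_ids() {
    awk '
        {
            key = $1 ":" $2
            if (!(key in seen)) {
                seen[key] = n++      # n starts at 0 when first used
                print $1, $2, seen[key]
            }
        }
    '
}
```

On a live system the input pairs could be gathered from the standard
sysfs topology files:

  $ for d in /sys/devices/system/cpu/cpu[0-9]*/topology; do echo "$(cat $d/physical_package_id) $(cat $d/core_id)"; done | assign_logical_core_ids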

[..snip..]


--
Thanks and Regards,
Prateek
