Nicholas Piggin <npig...@gmail.com> writes:
> Excerpts from Athira Rajeev's message of July 11, 2021 10:25 pm:
>> During Live Partition Migration (LPM), it is observed that perf
>> counter values reports zero post migration completion. However
>> 'perf stat' with workload continues to show counts post migration
>> since PMU gets disabled/enabled during sched switches. But incase
>> of system/cpu wide monitoring, zero counts were reported with 'perf
>> stat' after migration completion.
>>
>> Example:
>>  ./perf stat -e r1001e -I 1000
>>            time             counts unit events
>>      1.001010437         22,137,414      r1001e
>>      2.002495447         15,455,821      r1001e
>> <<>> As seen in next below logs, the counter values shows zero
>>         after migration is completed.
>> <<>>
>>     86.142535370    129,392,333,440      r1001e
>>     87.144714617                  0      r1001e
>>     88.146526636                  0      r1001e
>>     89.148085029                  0      r1001e
>>
>> Here PMU is enabled during start of perf session and counter
>> values are read at intervals. Counters are only disabled at the
>> end of session. The powerpc mobility code presently does not handle
>> disabling and enabling back of PMU counters during partition
>> migration. Also since the PMU register values are not saved/restored
>> during migration, PMU registers like Monitor Mode Control Register 0
>> (MMCR0), Monitor Mode Control Register 1 (MMCR1) will not contain
>> the value it was programmed with. Hence PMU counters will not be
>> enabled correctly post migration.
>>
>> Fix this in mobility code by handling disabling and enabling of
>> PMU in all cpu's before and after migration. Patch introduces two
>> functions 'mobility_pmu_disable' and 'mobility_pmu_enable'.
>> mobility_pmu_disable() is called before the processor threads goes
>> to suspend state so as to disable the PMU counters. And disable is
>> done only if there are any active events running on that cpu.
>> mobility_pmu_enable() is called after the processor threads are
>> back online to enable back the PMU counters.
>>
>> Since the performance Monitor counters ( PMCs) are not
>> saved/restored during LPM, results in PMC value being zero and the
>> 'event->hw.prev_count' being non-zero value. This causes problem
>
> Interesting. Are they defined to not be migrated, or may not be
> migrated?

PAPR may be silent on this... at least I haven't found anything yet. But
I'm not very familiar with perf counters.

How much assurance do we have that hardware events we've programmed on
the source can be reliably re-enabled on the destination, with the same
semantics? Aren't there some model-specific counters that don't make
sense to handle this way?

>> diff --git a/arch/powerpc/include/asm/rtas.h
>> b/arch/powerpc/include/asm/rtas.h
>> index 9dc97d2..cea72d7 100644
>> --- a/arch/powerpc/include/asm/rtas.h
>> +++ b/arch/powerpc/include/asm/rtas.h
>> @@ -380,5 +380,13 @@ static inline void rtas_initialize(void) { }
>>  static inline void read_24x7_sys_info(void) { }
>>  #endif
>>
>> +#ifdef CONFIG_PPC_PERF_CTRS
>> +void mobility_pmu_disable(void);
>> +void mobility_pmu_enable(void);
>> +#else
>> +static inline void mobility_pmu_disable(void) { }
>> +static inline void mobility_pmu_enable(void) { }
>> +#endif
>> +
>>  #endif /* __KERNEL__ */
>>  #endif /* _POWERPC_RTAS_H */
>
> It's not implemented in rtas, maybe consider putting this into a perf
> header?

+1
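
For anyone following along, here is a rough user-space sketch of why a stale 'event->hw.prev_count' next to a zeroed PMC is worrying. It is not the kernel's code: the struct, the sketch_* helpers and the 32-bit wrapping delta are all assumptions made up for illustration. It only shows that a delta computed against a stale previous value can be badly wrong once the counter has been reset underneath it, and that resyncing the previous value avoids that.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-event state; not the kernel's struct. */
struct sketch_event {
	uint64_t prev_count;	/* last raw counter value that was observed */
	uint64_t total;		/* accumulated event count */
};

/* Assumption: the counter is 32 bits wide and the delta wraps modulo 2^32. */
static uint64_t sketch_delta(uint64_t prev, uint64_t val)
{
	return (val - prev) & 0xffffffffULL;
}

/* Fold a fresh raw counter reading into the running total. */
static void sketch_read(struct sketch_event *ev, uint64_t pmc)
{
	ev->total += sketch_delta(ev->prev_count, pmc);
	ev->prev_count = pmc;
}

/*
 * Hypothetical resync step: if the counter was reset behind our back,
 * record its current value as the new baseline without touching total.
 */
static void sketch_resync(struct sketch_event *ev, uint64_t pmc)
{
	ev->prev_count = pmc;
}
```

With prev_count at 1500 and the counter suddenly reading 0, sketch_read() adds nearly 2^32 bogus events to the total. Calling sketch_resync() with the post-reset value first makes the next read count only what happened after the reset.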