Hi,
On Mon, Aug 28, 2017 at 1:41 PM, Andi Kleen <a...@firstfloor.org> wrote: >> So I think we are good to go. to capture multiplexing scaling factor >> when sampling simply use the S >> modifier. >> But to my surprise, newer kernels are not happy with the cmdline: >> $ perf record -e cycles:S noploop 1 >> Error: >> The sys_perf_event_open() syscall returned with 22 (Invalid argument) >> for event (cycles:Su). >> /bin/dmesg may provide additional information. >> No CONFIG_PERF_EVENTS=y kernel support configured? > > Likely due to > > ba5213ae6b88 perf/core: Correct event creation with PERF_FORMAT_GROUP > > It's not supported with inherited events. > Yes, and other things have changed as well. I did a bit of research to figure out how to make this work out-of the-box with the latest perf (v4.13). It turns out you need to combine multiple options and an event modifier. This is quite cumbersome but here it is: $ perf record --no-inherit --running-time -e cycles:S ........ You need: - no-inherit: the kernel does not know how to deal with multiplexing when events are inherited - running-time: this used to be automatic for PERF_SAMPLE_READ with perf record, now it is not This includes TIME_ENABLED/TIME_RUNNING in the sample_read format. - :S : to add a PERF_SAMPLE_READ to each sample, it encapsulates the event value + timings. We do not care about the value but are only interested in the timings. The kernel cannot record the timings without a PERF_SAMPLE_READ. I am also surprised to see that perf record keep inherit=1 in system-wide mode. I don't think this is relavant in this mode. But the kernel this fails in this case, which I think is a bug. In system-wide mode, the attr-.no_inherit should be ignored. We can fix perf record to avoid this in system-wide. The cmdline above works for both per-thread and system-wide modes. So I think we do not need my patch or variations thereof, everything is there, though a bit difficult to combine.