* John Stultz <john.stu...@linaro.org> wrote:

> [...]
>
> I'd much rather see perf export CLOCK_MONOTONIC_RAW timestamps,
> since that clockid is well defined. [...]

So the problem with that clock is that it does the following for
every timestamp:

        cycle_now = clock->read(clock);

... which is impossibly slow if something like the HPET is used,
which is rather common - so this is a non-starter for timestamping
perf events.

We use the scheduler clock as a reasonable compromise between
scalability and clock globality.

I can see two solutions:

1)

One approach is what I described in my other reply a few minutes
ago: track the flow of GTOD, timestamped with the fast perf
timestamps, so that GTOD can be correlated to the perf clock, if
user-space so wishes.

The correlation is simple, so this gets close to the ease of use of
being able to timestamp GTOD directly. (That would be useful for
other purposes as well, such as instrumenting GTOD updates.)

2)

An alternate, rather interesting approach would be to change the
scheduler clock offset to be influenced by the above events, so
that it quasi-approximates GTOD and emits natural time of day
timestamps.

This already happens partially in the sched-clock slow path:
kernel/sched/clock.c's sched_clock_local() uses scd->tick_gtod
timestamps to correlate to the monotonic clock. This could be
changed over to use not ktime_get() but getnstimeofday(), to get
true TOD timestamps.

The trickier bit is the x86 fast-path, in arch/x86/kernel/tsc.c's
native_sched_clock(). That relies on __cycles_2_ns() to transform a
CPU cycles timestamp into (boot time offset) nanoseconds. For that
it uses the cyc2ns_offset percpu variable. That variable could be
updated periodically so that it becomes a TOD offset.

My (strong!) preference would be #2, for the simple reason that it
would make perf timestamps instantly usable and tooling wouldn't
have to do anything to get true timestamps.

We could add a new PERF_SAMPLE_TIME_OF_DAY feature bit so that
user-space can consciously request GTOD timestamps.
This feature bit could even be arch influenced, so that
architectures could convert their perf clocks at the pace they
desire - which tooling can detect and handle safely.

Thoughts?

Thanks,

	Ingo
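
To make the two approaches above a bit more concrete, here is a
rough user-space sketch, not kernel code. Everything in it is an
assumption made for illustration: struct gtod_sync, find_sync(),
perf_to_gtod() and cycles_to_tod_ns() are hypothetical names, the
sync records are assumed to be sorted by perf timestamp, and the
shift of 10 is only an example scale factor. The first two functions
show the correlation from approach 1 (add the offset taken from the
nearest earlier GTOD record), the last one shows the idea behind
approach 2 (scale cycles to nanoseconds, then add an offset that is
kept at time of day instead of boot time).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sync record: one GTOD reading paired with the perf
 * (scheduler clock) timestamp taken at the same moment.
 */
struct gtod_sync {
	uint64_t perf_ns;	/* fast perf clock, nanoseconds */
	uint64_t gtod_ns;	/* time of day, nanoseconds since the epoch */
};

/*
 * Approach 1, step 1: pick the latest sync record whose perf timestamp
 * is not after perf_ns. Records are assumed sorted by perf_ns; if every
 * record is later than perf_ns, the first one is returned.
 */
const struct gtod_sync *find_sync(const struct gtod_sync *syncs, size_t n,
				  uint64_t perf_ns)
{
	size_t i, best = 0;

	assert(syncs != NULL && n > 0);
	for (i = 1; i < n; i++) {
		if (syncs[i].perf_ns <= perf_ns)
			best = i;
	}
	return &syncs[best];
}

/*
 * Approach 1, step 2: translate a perf timestamp to time of day.
 * Assumes both clocks tick at the same rate near the sync record, so
 * only the offset matters. Unsigned arithmetic also gives the right
 * answer when perf_ns is slightly before the record.
 */
uint64_t perf_to_gtod(const struct gtod_sync *sync, uint64_t perf_ns)
{
	assert(sync != NULL);
	return sync->gtod_ns + (perf_ns - sync->perf_ns);
}

#define EXAMPLE_CYC2NS_SHIFT 10	/* illustrative fixed-point shift */

/*
 * Approach 2: convert a cycle count to nanoseconds with a fixed-point
 * multiplier (mul = ns per cycle << EXAMPLE_CYC2NS_SHIFT) and add an
 * offset that is periodically updated to time of day. The cycle count
 * is split into high and low parts so that the multiplication does not
 * overflow for large counts.
 */
uint64_t cycles_to_tod_ns(uint64_t cyc, uint32_t mul, uint64_t tod_offset_ns)
{
	uint64_t hi = cyc >> EXAMPLE_CYC2NS_SHIFT;
	uint64_t lo = cyc & ((1ULL << EXAMPLE_CYC2NS_SHIFT) - 1);

	return tod_offset_ns + hi * mul + ((lo * mul) >> EXAMPLE_CYC2NS_SHIFT);
}
```

With approach 1, tooling would call find_sync() and perf_to_gtod()
for every sample it wants to display in wall-clock time. With
approach 2 that work disappears from tooling, because the offset is
already folded into the timestamp when the event is recorded.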