On 12/7/2017 6:18 AM, Robert Bragg wrote:
On Wed, Nov 15, 2017 at 12:13 PM, Sagar Arun Kamble
<sagar.a.kam...@intel.com> wrote:
We can compute the system time corresponding to a GPU timestamp by taking a
reference point (CPU monotonic time, GPU timestamp) and then adding the
delta time computed using the timecounter/cyclecounter support in the kernel.
We have to configure the cyclecounter with the GPU timestamp frequency.
The earlier approach based on cross-timestamps is not needed. It
was being used to approximate the frequency based on invalid assumptions
(possibly the drift seen in the time was due to a precision issue).
The precision of time from GPU clocks is already in ns, and timecounter
takes care of it, as verified over variable durations.
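The reference-point-plus-delta scheme described above can be sketched roughly as follows. This is a Python model rather than kernel C, and not the patch itself: the class name TimeCounter, the shift of 20 and the 12 MHz frequency are assumptions chosen for illustration. Only the mult/shift scaling and the masked cycle delta mirror what the kernel's timecounter/cyclecounter code does.

```python
NSEC_PER_SEC = 1_000_000_000
GPU_TS_FREQ_HZ = 12_000_000   # assumption: 12 MHz timestamp clock
COUNTER_BITS = 36             # assumption: 36-bit GPU timestamp counter


class TimeCounter:
    """Toy model of a timecounter: a reference point plus scaled cycle deltas."""

    def __init__(self, freq_hz, bits, start_cycles, start_ns, shift=20):
        self.mask = (1 << bits) - 1
        self.shift = shift
        # Fixed-point nanoseconds-per-cycle, rounded to nearest.
        self.mult = ((NSEC_PER_SEC << shift) + freq_hz // 2) // freq_hz
        self.cycle_last = start_cycles & self.mask
        self.nsec = start_ns      # system time at the reference point
        self.frac = 0             # sub-nanosecond remainder carried forward

    def read(self, cycle_now):
        """Advance the reference point to cycle_now and return the time in ns."""
        delta = (cycle_now - self.cycle_last) & self.mask   # handles wraparound
        scaled = delta * self.mult + self.frac
        self.frac = scaled & ((1 << self.shift) - 1)
        self.nsec += scaled >> self.shift
        self.cycle_last = cycle_now & self.mask
        return self.nsec


# Usage: reference point taken at (system time 5000 ns, GPU count 100).
tc = TimeCounter(GPU_TS_FREQ_HZ, COUNTER_BITS, start_cycles=100, start_ns=5_000)
one_second_later = tc.read(100 + GPU_TS_FREQ_HZ)
```

Note that the rounding of mult is itself a small, constant frequency error, so the choice of shift bounds how precise the conversion can be over long intervals.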
Hi Sagar,
I have some doubts about this analysis...
The intent behind Sourab's original approach was to be able to
determine the frequency at runtime empirically because the constants
we have aren't particularly accurate. Without a perfectly stable
frequency that's known very precisely then an interpolated correlation
will inevitably drift. I think the nature of HW implies we can't
expect to have either of those. Then the general idea had been to try
and use existing kernel infrastructure for a problem which isn't
unique to GPU clocks.
Hi Robert,
Testing on SKL shows the timestamps drift by only about 10us when sampling
in the kernel over about 30 minutes.
Verified with changes from
https://github.com/sakamble/i915-timestamp-support/commits/drm-tip
Note that since we are sampling the counter in debugfs, the read overhead
is likely adding to the drift, so an adjustment might be needed.
But with OA reports we just have to worry about the initial timecounter
setup, where we need an accurate pair of system time and GPU timestamp clock
counts.
I think the timestamp clock is highly stable and we don't need logic to
determine the frequency at runtime. Will try to get confirmation from the HW
team as well.
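To put the figure above in perspective, the drift can be expressed as a relative frequency error in parts per million. The sketch below is plain arithmetic on the 10us over 30 minutes measurement. The 1 ppm value used in the second calculation is a hypothetical error picked only for comparison, not a measured one.

```python
def drift_ppm(drift_s, interval_s):
    """Relative clock error implied by a drift of drift_s over interval_s."""
    return drift_s / interval_s * 1e6


def drift_after(ppm, interval_s):
    """Drift in seconds accumulated over interval_s at a constant ppm error."""
    return ppm * 1e-6 * interval_s


# Measured above: about 10us of drift over about 30 minutes.
observed_ppm = drift_ppm(10e-6, 30 * 60)

# Hypothetical comparison: what a constant 1 ppm error would give in 30 minutes.
hypothetical_drift_s = drift_after(1.0, 30 * 60)

print(f"observed error: {observed_ppm:.4f} ppm")
print(f"1 ppm over 30 min: {hypothetical_drift_s * 1e3:.1f} ms")
```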
If we do need to determine the frequency, Sourab's approach needs to be
refined because:
1. It can be implemented entirely in i915, because all we need is pairs
of system time and GPU clock counts over different durations.
2. The crosstimestamp framework usage in that approach is incorrect, as
ideally we should be sending the ART counter and GPU counter. Instead we were
hacking it to send the TSC clock.
Quoting Thomas from https://patchwork.freedesktop.org/patch/144298/
get_device_system_crosststamp() is for timestamps taken via a clock
which is directly correlated with the timekeeper clocksource.
ART and TSC are correlated via: TSC = (ART * scale) + offset
get_device_system_crosststamp() invokes the device function which reads
ART, which is converted to CLOCK_MONOTONIC_RAW by the conversion above,
and then uses interpolation to map the CLOCK_MONOTONIC_RAW value to
CLOCK_MONOTONIC.
The device function does not know anything about TSC. All it knows about
is ART.
I am not aware whether the GPU timestamp clock is correlated with TSC the way
ART is for ethernet drivers, or whether i915 can read ART like ethernet
drivers do.
3. I have seen precision issues in the calculations in
i915_perf_clock_sync_work, and in the usage of MONO_RAW, which might make
time jump.
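Point 1 above amounts to estimating the frequency from two (system time, GPU count) pairs taken some interval apart. A minimal sketch of that calculation follows. The function name, the 36-bit counter width and the sample values are assumptions for illustration, and the sketch ignores the read latency between sampling the two clocks, which a real implementation would have to bound.

```python
NSEC_PER_SEC = 1_000_000_000


def estimate_freq_hz(pair_a, pair_b, counter_bits=36):
    """Estimate the GPU timestamp frequency from two (system_ns, gpu_cycles) pairs.

    Assumes the counter wrapped at most once between the two samples.
    """
    (ns_a, cyc_a), (ns_b, cyc_b) = pair_a, pair_b
    mask = (1 << counter_bits) - 1
    delta_cycles = (cyc_b - cyc_a) & mask
    delta_ns = ns_b - ns_a
    if delta_ns <= 0:
        raise ValueError("second pair must be sampled after the first")
    return delta_cycles * NSEC_PER_SEC / delta_ns


# Hypothetical samples taken two seconds apart.
freq = estimate_freq_hz((0, 0), (2 * NSEC_PER_SEC, 24_000_048))
```

Longer intervals between the pairs reduce the relative impact of sampling jitter on the estimate, which is why pairs "over different durations" are useful.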
That's not to say that a more limited, simpler solution based on
frequent re-correlation wouldn't be more than welcome if tracking an
accurate frequency is too awkward for now, but I think some things need
to be considered in that case (listed below).
Adjusting the timecounter time can be another option if we confirm that the
GPU timestamp frequency is stable.
- It would be good to quantify the kind of drift seen in practice to
know how frequently it's necessary to re-synchronize. It sounds like
you've done this ("as verified over variable durations") so I'm
curious what kind of drift you saw. I'd imagine you would see a
significant drift over, say, one second and it might not take much
longer for the drift to even become clearly visible to the user when
plotted in a UI. For reference I once updated the arb_timer_query test
in piglit to give some insight into this drift
(https://lists.freedesktop.org/archives/piglit/2016-September/020673.html)
and at least from what I wrote back then it looks like I was seeing a
drift of a few milliseconds per second on SKL. I vaguely recall it
being much worse given the frequency constants we had for Haswell.
On SKL I have seen a very small drift of less than 10us over a period of
30 minutes.
Verified with changes from
https://github.com/sakamble/i915-timestamp-support/commits/drm-tip
The 36-bit counter will overflow in about 95min at 12 MHz, and the
timecounter framework treats a counter value whose delta from timecounter
init is more than half of the total time covered by the counter as a time
in the past, so the current approach works for less than 45min.
We will need to add overflow watchdog support like other drivers, which
just reinitializes the timecounter before 45min have elapsed.
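The overflow numbers above can be checked with a short calculation. The sketch below assumes the 36-bit width and 12 MHz frequency stated in the text; the helper signed_cycle_delta is a hypothetical name that models the half-range rule described above, where a delta larger than half the counter range is read as a timestamp in the past.

```python
COUNTER_BITS = 36
FREQ_HZ = 12_000_000

wrap_seconds = (1 << COUNTER_BITS) / FREQ_HZ   # time until the counter wraps
wrap_minutes = wrap_seconds / 60               # about 95 minutes
safe_minutes = wrap_minutes / 2                # about 47 minutes of usable range


def signed_cycle_delta(cycle_tstamp, cycle_last, bits=COUNTER_BITS):
    """Cycle delta relative to the last reference, negative if read as past."""
    mask = (1 << bits) - 1
    delta = (cycle_tstamp - cycle_last) & mask
    if delta > mask // 2:
        # More than half the range ahead: interpreted as behind instead.
        return -((cycle_last - cycle_tstamp) & mask)
    return delta


print(f"wrap after {wrap_minutes:.1f} min, usable window {safe_minutes:.1f} min")
```

A watchdog that refreshes the reference point comfortably inside the usable window keeps every later timestamp within the forward half of the range.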
- What guarantees will be promised about monotonicity of correlated
system timestamps? Will it be guaranteed that sequential reports must
have monotonically increasing timestamps? That might be fiddly if the
gpu + system clock are periodically re-correlated, so it might be good
to be clear in documentation that the correlation is best-effort only
for the sake of implementation simplicity. That would still be good
for a lot of UIs I think and there's freedom for the driver to start
simple and potentially improve later by measuring the gpu clock
frequency empirically.
If we rely on timecounter alone, without correlation to determine the
frequency, setting the init time to the MONOTONIC system time should take
care of the monotonicity of correlated times.
Regards,
Sagar
Currently only one correlated pair of timestamps is read when enabling
the stream and so a relatively long time is likely to pass before the
stream is disabled (seconds, minutes while a user is running a system
profiler). It seems very likely to me that these clocks are going to
drift significantly without introducing some form of periodic
re-synchronization based on some understanding of the drift that's seen.
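One way to picture periodic re-synchronization together with the monotonicity concern raised earlier is sketched below. It is only an illustration: the Correlator class, its method names and the clamping policy (never report a time earlier than one already reported) are assumptions made for this example, not something either patch series implements.

```python
NSEC_PER_SEC = 1_000_000_000


class Correlator:
    """Maps GPU cycles to system ns from a reference pair that can be refreshed."""

    def __init__(self, freq_hz, ref_sys_ns, ref_cycles, bits=36):
        self.freq_hz = freq_hz
        self.mask = (1 << bits) - 1
        self.ref_sys_ns = ref_sys_ns
        self.ref_cycles = ref_cycles & self.mask
        self.last_reported_ns = ref_sys_ns

    def resync(self, sys_ns, cycles):
        """Replace the reference pair with a freshly sampled one."""
        self.ref_sys_ns = sys_ns
        self.ref_cycles = cycles & self.mask

    def to_system_ns(self, cycles):
        """Convert a GPU count to system ns, clamped so results never go back."""
        delta = (cycles - self.ref_cycles) & self.mask
        ns = self.ref_sys_ns + delta * NSEC_PER_SEC // self.freq_hz
        self.last_reported_ns = max(self.last_reported_ns, ns)
        return self.last_reported_ns


# Usage with hypothetical values: the nominal frequency runs 500 ns fast
# over one second, so the resync pulls the reference slightly backwards.
c = Correlator(12_000_000, ref_sys_ns=1_000, ref_cycles=0)
t1 = c.to_system_ns(12_000_000)
c.resync(1_000_000_500, 12_000_000)
t2 = c.to_system_ns(12_000_000)   # clamped, not earlier than t1
t3 = c.to_system_ns(12_000_012)
```

The clamp trades accuracy for ordering: reports just after a resync may share a timestamp until the corrected clock catches up, which fits a best-effort guarantee.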
Br,
- Robert
This series adds base timecounter/cyclecounter changes and changes to
get GPU and CPU timestamps in OA samples.
Sagar Arun Kamble (1):
  drm/i915/perf: Add support to correlate GPU timestamp with system time

Sourab Gupta (3):
  drm/i915/perf: Add support for collecting 64 bit timestamps with OA reports
  drm/i915/perf: Extract raw GPU timestamps from OA reports
  drm/i915/perf: Send system clock monotonic time in perf samples

 drivers/gpu/drm/i915/i915_drv.h  |  11 ++++
 drivers/gpu/drm/i915/i915_perf.c | 124 ++++++++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_reg.h  |   6 ++
 include/uapi/drm/i915_drm.h      |  14 +++++
 4 files changed, 154 insertions(+), 1 deletion(-)
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx