On 01/22/2013 11:57 AM, Jason Gunthorpe wrote:
On Mon, Jan 21, 2013 at 10:46:31AM -0800, John Stultz wrote:
What I'd propose is that we break the read_persistent_clock()
functionality up. So we need two interfaces:
1) An interface to access a time value we used to initialize the
system's CLOCK_REALTIME time.
2) An interface to measure the length of suspend.
Interface #1 could be possibly just replaced with the RTCTOSYS
functionality. Although the downside there is that for some time at
bootup between the timekeeping_init() function running (prior to
interrupts being enabled) and the RTC driver being available (after
interrupts are enabled), where we'd have an incorrect system clock.
So we may want to preserve something like the existing
read_persistent_clock() interface, but as Jason suggested, we could
push that access into the RTC driver itself.
How big of an issue is this? Could the RTCTOSYS function be moved to
the moment the RTC driver is registered rather than using a
late_initcall?
It may not be huge. Most early boot items are going to be
CLOCK_MONOTONIC based, which would be unaffected. So that's a potential
solution, but I'm hesitant to claim there'd be no side effects.
Interface #2 could then be either RTC based, or countinuous counter
based. Since we still want to do this measurement with interrupts
off, we still would need that interrupt-free RTC method like
read_persistent_clock() where supported (falling back to the RTC
driver's suspend/resume handler to try to fix things up as best it
can if that's not available).
Could the counter version of this be bundled into the clocksource
framework? It already has generic APIs for handling cycle counters and
things. Isn't there a TSC driver for clocksource already? Is all that
is missing is a way to tell if the counter survived suspend?
So without *major* rework, I'd rather not do this. Again, the
clocksource code has quite a few assumptions built in that are optimized
for timekeeping (where we avoid overflows by expecting relatively
frequent updates), and very different approaches are needed for
something like suspend (where valid suspend times could be potentially
months to years).
That said, given ARM's use of clocksources for sched_clock, there may be
some reasonable claim to splitting the clocksource mult/shift selection
management away from the clocksource itself (instead having it managed
by the timekeeping core). That would allow other subsystems to use the
clocksource accessor function, managing their own mult/shift selection
trade-offs independently.
But such an effort would be of substantial size, given how invasive it
would be. So I'm not sure this is likely to happen in the near term
without a dedicated effort.
So I think the smaller effort of splitting up the
read_persistent_clock() and making it more reasonably handle counters
like this S3 non-stop TSC and the 32k counter is a more reasonable
mid-step (especially since there will still be the same logical division
from a timekeeping perspective between runtime clocksources and
suspend-measuring clocksources if we were to do the major overhaul
eventually).
clocksource already has suspend/resume callbacks stuff, so the counter
driver could sense if the sleep was too deep and mark itself as
invalid.
But at that point you've lost time. If this was all centrally
controlled, you have to know before hand what the bounds would be. With
the TSC, we know it won't wrap around our starting measurement for at
least 10 years. That's a reasonable range for suspend. We don't want to
resume and just get a "oh, bad call, you picked the wrong clocksource
for such a long suspend", and really without the clocksource checking
with the RTC I don't think it can even know if its been too long itself
(since maybe the counter wrapped, but maybe not).
This would help solve the problem on ARM with muxing persistent clock
on multi-platform.
A RTC device flag 'readable with interrupts off' still seems like a
good idea for the RTC case..
Yea, I think this point you're probably right, as I'm warming to the
idea of pushing the existing read_persistent_clock() into the rtc device
code. Just need to make sure not initializing CLOCK_REALTIME at
timekeeping_init isn't going to have negative effects.
thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/