On Sat, Apr 17, 2021 at 06:51:08PM +0200, Thomas Gleixner wrote: > On Sat, Apr 17 2021 at 18:24, Thomas Gleixner wrote: > > On Fri, Apr 16 2021 at 13:13, Peter Xu wrote: > >> On Fri, Apr 16, 2021 at 01:00:23PM -0300, Marcelo Tosatti wrote: > >>> > >>> +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \ > >>> + (1U << HRTIMER_BASE_REALTIME_SOFT) | \ > >>> + (1U << HRTIMER_BASE_TAI) | \ > >>> + (1U << HRTIMER_BASE_TAI_SOFT)) > >>> + > >>> +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base) > >>> +{ > >>> + if (cpu_base->softirq_activated) > >>> + return true; > >> > >> A pure question on whether this check is needed... > >> > >> Here even if softirq_activated==1 (as softirq is going to happen), as long > >> as > >> (cpu_base->active_bases & CLOCK_SET_BASES)==0, shouldn't it already mean > >> that > >> "yes indeed clock was set, but no need to kick this cpu as no relevant > >> timer"? > >> As that question seems to be orthogonal to whether a softirq is going to > >> trigger on that cpu. > > > > That's correct and it's not any different from firing the IPI because in > > both cases the update happens with the base lock of the CPU in question > > held. And if there are no active timers in any of the affected bases, > > then there is no need to reevaluate the next expiry because the offset > > update does not affect any armed timers. It just makes sure that the > > next enqueu of a timer on such a base will see the the correct offset. > > > > I'll just zap it. > > But the whole thing is still wrong in two aspects: > > 1) BOOTTIME can be one of the affected clocks when sleep time > (suspended time) is injected because that uses the same mechanism. > > Sorry for missing that earlier when I asked to remove it, but > that's trivial to fix by adding the BOOTTIME base back. > > 2) What's worse is that on resume this might break because that > mechanism is also used to enforce the reprogramming of the clock > event devices and there we cannot be selective on clock bases. > > I need to dig deeper into that because suspend/resume has changed > a lot over time, so this might be just a historical leftover. But > without proper analysis we might end up with subtle and hard to > debug wreckage. > > Thanks, > > tglx
Thomas, There is no gain in avoiding the IPIs for the suspend/resume case (since suspending is a large interruption anyway). To avoid the potential complexity (and associated bugs), one option would be to NOT skip IPIs for the resume case. Sending -v6 with that (and other suggestions/fixes).