On Sat, Apr 17, 2021 at 06:51:08PM +0200, Thomas Gleixner wrote:
> On Sat, Apr 17 2021 at 18:24, Thomas Gleixner wrote:
> > On Fri, Apr 16 2021 at 13:13, Peter Xu wrote:
> >> On Fri, Apr 16, 2021 at 01:00:23PM -0300, Marcelo Tosatti wrote:
> >>>  
> >>> +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) | \
> >>> +                  (1U << HRTIMER_BASE_REALTIME_SOFT) |   \
> >>> +                  (1U << HRTIMER_BASE_TAI) |             \
> >>> +                  (1U << HRTIMER_BASE_TAI_SOFT))
> >>> +
> >>> +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base)
> >>> +{
> >>> + if (cpu_base->softirq_activated)
> >>> +         return true;
> >>
> >> A pure question on whether this check is needed...
> >>
> >> Here even if softirq_activated==1 (as softirq is going to happen), as long 
> >> as
> >> (cpu_base->active_bases & CLOCK_SET_BASES)==0, shouldn't it already mean 
> >> that
> >> "yes indeed clock was set, but no need to kick this cpu as no relevant 
> >> timer"?
> >> As that question seems to be orthogonal to whether a softirq is going to
> >> trigger on that cpu.
> >
> > That's correct and it's not any different from firing the IPI because in
> > both cases the update happens with the base lock of the CPU in question
> > held. And if there are no active timers in any of the affected bases,
> > then there is no need to reevaluate the next expiry because the offset
> > update does not affect any armed timers. It just makes sure that the
> > next enqueu of a timer on such a base will see the the correct offset.
> >
> > I'll just zap it.
> 
> But the whole thing is still wrong in two aspects:
> 
>     1) BOOTTIME can be one of the affected clocks when sleep time
>        (suspended time) is injected because that uses the same mechanism.
> 
>        Sorry for missing that earlier when I asked to remove it, but
>        that's trivial to fix by adding the BOOTTIME base back.
> 
>     2) What's worse is that on resume this might break because that
>        mechanism is also used to enforce the reprogramming of the clock
>        event devices and there we cannot be selective on clock bases.
> 
>        I need to dig deeper into that because suspend/resume has changed
>        a lot over time, so this might be just a historical leftover. But
>        without proper analysis we might end up with subtle and hard to
>        debug wreckage.
> 
> Thanks,
> 
>         tglx

Thomas,

There is no gain in avoiding the IPIs for the suspend/resume case 
(since suspending is a large interruption anyway). To avoid 
the potential complexity (and associated bugs), one option would 
be to NOT skip IPIs for the resume case.

Sending -v6 with that (and other suggestions/fixes).

Reply via email to