On Tue, Apr 02, 2019 at 06:18:53AM -0700, Paul E. McKenney wrote:
> On Tue, Apr 02, 2019 at 09:09:53AM +0200, Peter Zijlstra wrote:
> > On Mon, Apr 01, 2019 at 10:22:57AM -0700, Paul E. McKenney wrote:
> > > Or am I missing something that gets the scheduler on the job faster?
> >
> > Oh urgh, yah. So normally we only twiddle with the need_resched state:
> >
> >  - while preempt_disable(), such that preempt_enable() will reschedule
> >  - from interrupt context, such that interrupt return will reschedule
> >
> > But the usage here 'violates' those rules and then there is an
> > unspecified latency between setting the state and it getting observed,
> > but no longer than 1 tick I would think.
>
> In general, yes, which is fine (famous last words) for normal grace
> periods but not so good for expedited grace periods.
>
> > I don't think we can go NOHZ with need_resched set, because the moment
> > we hit the idle loop with that set, we _will_ reschedule.
>
> Agreed, and I believe that transitioning to usermode execution also
> gives the scheduler a chance to take action.
>
> The one exception to this is when a nohz_full CPU running in nohz_full
> mode does a system call that decides to execute for a very long time.
> Last I checked, the scheduling-clock interrupt did -not- get retriggered
> in this case, and the delay could be indefinite, as in bad even for
> normal grace periods.

Right, there is that.

> > So in that respect the irq_work suggestion I made would fix things
> > properly.
>
> But wouldn't the current use of set_tsk_need_resched(current) followed by
> set_preempt_need_resched() work just as well in that case?  The scheduler
> would react to these at the next scheduler-clock interrupt on their
> own, right?  Or am I being scheduler-naive again?

Well, you have that unspecified delay. By forcing the (self) interrupt
you enforce a timely response.
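
To make the difference concrete, here is a toy userspace model of the
three cases discussed above. It is only a sketch, not kernel code:
struct cpu_model, resched_latency(), LATENCY_UNBOUNDED and the abstract
time units are all made up for illustration, and treating the
self-interrupt as zero-latency is a simplifying assumption (on real
hardware it is merely prompt).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel: the flag is never observed in this model. */
enum { LATENCY_UNBOUNDED = -1 };

/* Toy model of one CPU; all fields are illustrative, not kernel state. */
struct cpu_model {
	unsigned long tick_period;	/* time units between scheduling-clock ticks */
	bool tick_stopped;		/* models a nohz_full CPU in a long system call */
	bool self_irq;			/* models forcing a self-interrupt after setting the flag */
};

/*
 * Delay between setting need_resched at time 'now' and the next point
 * at which the model lets the scheduler observe it.
 */
long resched_latency(const struct cpu_model *cpu, unsigned long now)
{
	assert(cpu->tick_period > 0);

	/* Forced self-interrupt: interrupt return observes the flag right away. */
	if (cpu->self_irq)
		return 0;

	/* No tick and no interrupt: nothing in the model ever looks at the flag. */
	if (cpu->tick_stopped)
		return LATENCY_UNBOUNDED;

	/* Otherwise wait for the next tick, which is at most one period away. */
	return (long)(cpu->tick_period - now % cpu->tick_period);
}
```

With a tick period of 10, setting the flag at time 3 gives a latency of
7 in the plain case, LATENCY_UNBOUNDED once the tick is stopped, and 0
when the self-interrupt is forced, regardless of the tick state.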