On Fri, Jul 06, 2018 at 06:29:05PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 06, 2018 at 03:53:30PM +0100, David Woodhouse wrote:
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index e4d4e60..89f5814 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1616,7 +1616,8 @@ static inline int spin_needbreak(spinlock_t *lock)
> >
> >  static __always_inline bool need_resched(void)
> >  {
> > -	return unlikely(tif_need_resched());
> > +	return unlikely(tif_need_resched()) ||
> > +		rcu_urgent_qs_requested();
> >  }
>
> Instead of making need_resched() touch two cachelines, I think I would
> prefer adding resched_cpu() to rcu_request_urgent_qs_task().
I used to do something like this, but decided that whacking each holdout
CPU over the head ten times a second was a bit much.

> The preempt state is already a bit complicated and shadowed in the
> preempt_count (on some architectures); adding additional bits to it like
> this is just asking for trouble.

How about a separate need_resched_rcu() that includes the extra cache
miss?  Or open-coding the rcu_urgent_qs_requested()?

							Thanx, Paul
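
[Editor's note: for concreteness, here is a minimal sketch of the split
suggested above. It is not a posted patch; the name need_resched_rcu()
is only the one floated in this message, and the helper simply combines
tif_need_resched() with the rcu_urgent_qs_requested() call from the
quoted diff.]

	static __always_inline bool need_resched(void)
	{
		/* Fast path unchanged: one flag, one cacheline. */
		return unlikely(tif_need_resched());
	}

	/*
	 * Hypothetical variant for callers willing to take the extra
	 * cache miss in order to notice an urgent RCU quiescent-state
	 * request.
	 */
	static __always_inline bool need_resched_rcu(void)
	{
		return unlikely(tif_need_resched()) ||
		       rcu_urgent_qs_requested();
	}

[The point of the split is that only call sites known to hold up RCU
would opt in to need_resched_rcu(), leaving the common need_resched()
fast path on a single cacheline, which addresses the two-cacheline
objection above.]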