On Mon, Jan 16, 2017 at 06:11:30PM +0100, Peter Zijlstra wrote:
> On Sat, Jan 14, 2017 at 01:13:12AM -0800, Paul E. McKenney wrote:
> > There is some confusion as to which of cond_resched() or
> > cond_resched_rcu_qs() should be added to long in-kernel loops.
> > This commit therefore eliminates the decision by adding RCU
> > quiescent states to cond_resched().
>
> Which would make:
>
> 	rcu_read_lock();
> 	cond_resched();
> 	rcu_read_unlock();
>
> invalid under preemptible RCU. Is it already?

In theory, yes.  In practice, I just tested it with preemption and lockdep
enabled, and it didn't complain.  If further testing finds complaints,
we can either fix those uses (preferred) or revert this patch.

> > Warning: This is a prototype.  For example, it does not correctly
> > handle Tasks RCU.  Which is OK for the moment, given that no one
> > actually uses Tasks RCU yet.
> >
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4907,6 +4907,7 @@ int __sched _cond_resched(void)
> >  		preempt_schedule_common();
> >  		return 1;
> >  	}
> > +	rcu_all_qs();
> >  	return 0;
> >  }
>
> Still not a real fan of this, it does make cond_resched() touch a bunch
> more cachelines, also, I suppose that if we're going to do this, we
> should make __cond_resched_lock() and __cond_resched_softirq() act
> similarly.

Michal (now CCed) argues that having to distinguish between cond_resched()
and cond_resched_rcu_qs() is overly burdensome.  Michal?  Any thoughts on
how we might remove this burden without the additional cache misses?

I will take another look as well to see what could make it lower cost.
There are probably ways...

Would it make sense to have RCU maintain a need-rcu_all_qs() flag in the
same cacheline as the __preempt_count?  Perhaps throttling the writes to
this flag from the RCU grace-period kthreads to once per 100 milliseconds
or so?

							Thanx, Paul
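
To make the flag idea above a bit more concrete, here is a rough
single-threaded userspace sketch, not kernel code.  Everything in it is
hypothetical: struct hot_cpu_data, struct gp_throttle, gp_request_qs(),
cond_resched_sketch(), and report_qs() are made-up stand-ins, and the 100 ms
interval is just the "or so" figure from above.  A real version would need
per-CPU data, READ_ONCE()/WRITE_ONCE() or similar, and the actual
rescheduling check.

```c
#include <stdbool.h>
#include <stdint.h>

#define QS_FLAG_THROTTLE_MS 100	/* assumed interval, not a tuned value */

/* Hypothetical hot per-CPU data: flag shares a cacheline with the count. */
struct hot_cpu_data {
	_Alignas(64) int preempt_count;	/* stand-in for __preempt_count */
	bool need_rcu_qs;		/* hypothetical need-rcu_all_qs() flag */
};

/* Hypothetical grace-period-side state, kept off the hot cacheline. */
struct gp_throttle {
	uint64_t next_allowed_ms;	/* earliest time of the next flag write */
};

static int qs_reported;			/* counts simulated quiescent states */

/* Stand-in for rcu_all_qs(); here it only counts invocations. */
static void report_qs(void)
{
	qs_reported++;
}

/*
 * Grace-period side: request a quiescent state, writing the flag at most
 * once per QS_FLAG_THROTTLE_MS.  Returns true if the flag was written.
 */
static bool gp_request_qs(struct hot_cpu_data *hd, struct gp_throttle *gt,
			  uint64_t now_ms)
{
	if (hd->need_rcu_qs)
		return false;		/* request already pending */
	if (now_ms < gt->next_allowed_ms)
		return false;		/* throttled */
	hd->need_rcu_qs = true;
	gt->next_allowed_ms = now_ms + QS_FLAG_THROTTLE_MS;
	return true;
}

/*
 * cond_resched() side: the rescheduling path is omitted.  The common case
 * only reads a flag on a cacheline that is assumed to be hot already.
 */
static int cond_resched_sketch(struct hot_cpu_data *hd)
{
	if (hd->need_rcu_qs) {
		hd->need_rcu_qs = false;
		report_qs();
	}
	return 0;
}
```

The point of the layout is that the fast path touches no cacheline beyond
the one holding the preempt count, and the throttle bounds how often the
grace-period side dirties that line.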