On Thu, Sep 03, 2015 at 02:03:51AM +0200, Frederic Weisbecker wrote:
> On Thu, Sep 03, 2015 at 12:24:27AM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 02, 2015 at 11:50:22PM +0200, Frederic Weisbecker wrote:
> > > > > [ 875.703227] [<ffffffff810c2d74>] tick_nohz_full_kick_cpu+0x44/0x50
> > >
> > > It happens in nohz full, but I'm not sure the guilty is nohz full.
> > >
> > > The problem here is that wake_up_nohz_cpu() selects a CPU that is offline.
> >
> > wake_up_nohz_cpu() doesn't do any such thing. Where does the selection
> > logic live?
>
> Err, got confused with get_nohz_timer_target(). But yeah wake_up_nohz_cpu() is
> called with a CPU that is chosen by mod_timer() -> get_nohz_timer_target().
>
> >
> > > But this shouldn't happen. Either it selects a CPU that is in the domain
> > > tree, and I suspect offline CPUs aren't supposed to be there, or it
> > > selects the current CPU. And if the CPU is offlined, it shouldn't be
> > > running some kthread...
> >
> > Do not assume things like that.. always check with the active mask.
>
> Hmm, so perhaps we need something like this (makes me realize that
> the is_housekeeping_cpu() passes the wrong argument, no issue in practice
> since nohz full aren't in the domain tree but I still need to fix that along).
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0902e4d..2c10a69 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -628,7 +628,7 @@ int get_nohz_timer_target(void)
>  
>  	rcu_read_lock();
>  	for_each_domain(cpu, sd) {
> -		for_each_cpu(i, sched_domain_span(sd)) {
> +		for_each_cpu_and(i, sched_domain_span(sd), cpu_online_mask) {

cpu_active_mask, we clear that when we start killing the cpu. online
only gets cleared once the cpu is actually dead.

>  			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
>  				cpu = i;
>  				goto unlock;
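
To illustrate the point about the two masks, here is a minimal userspace sketch. It is not kernel code: the bitmask type, mask_test() and pick_timer_target() are hypothetical stand-ins for cpumasks and for the loop in get_nohz_timer_target(), and the housekeeping check is left out. It only models the window in which a CPU has already been cleared from the active mask but is still set in the online mask.

```c
/*
 * Userspace model only, NOT kernel code. CPU masks are plain bitmasks and
 * pick_timer_target() is a hypothetical stand-in for the selection loop in
 * get_nohz_timer_target(). The is_housekeeping_cpu() check is omitted.
 */
#include <stdbool.h>

#define NR_CPUS 8

typedef unsigned int cpumask_model_t;	/* bit n set => CPU n is in the mask */

bool mask_test(cpumask_model_t mask, int cpu)
{
	return (mask >> cpu) & 1u;
}

/*
 * Return the first non-idle CPU found in (span & filter), or this_cpu when
 * there is none. "filter" plays the role of the second mask passed to
 * for_each_cpu_and(), i.e. cpu_online_mask or cpu_active_mask.
 */
int pick_timer_target(int this_cpu, cpumask_model_t span,
		      cpumask_model_t filter, cpumask_model_t idle)
{
	for (int i = 0; i < NR_CPUS; i++) {
		if (!mask_test(span & filter, i))
			continue;	/* not in the domain, or filtered out */
		if (!mask_test(idle, i))
			return i;	/* busy CPU: use it as the target */
	}
	return this_cpu;		/* nothing suitable: stay local */
}
```

In this model, take a domain spanning CPUs 0-3 (span 0x0F) with CPUs 0 and 1 idle (idle 0x03), and CPU 2 in the middle of being taken down: still online (0x0F) but no longer active (0x0B). Filtering on the online mask returns CPU 2, the one going away; filtering on the active mask skips it and returns CPU 3.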