On 18 February 2013 16:40, Frederic Weisbecker <fweis...@gmail.com> wrote:
> 2013/2/18 Vincent Guittot <vincent.guit...@linaro.org>:
>> On 18 February 2013 15:38, Frederic Weisbecker <fweis...@gmail.com> wrote:
>>> I pasted the original at: http://pastebin.com/DMm5U8J8
>>
>> We can clear the idle flag only in the nohz_kick_needed which will not
>> be called if the sched_domain is NULL so the sequence will be
>>
>>            = CPU 0 =                         = CPU 1 =
>>
>> detach_and_destroy_domain {
>>       rcu_assign_pointer(cpu1_dom, NULL);
>> }
>>
>> dom = new_domain(...) {
>>       nr_cpus_busy = 0;
>>       set_idle(CPU 1);
>> }
>>                                             dom = rcu_dereference(cpu1_dom)
>>                                             //dom == NULL, return
>>
>> rcu_assign_pointer(cpu1_dom, dom);
>>
>>                                             dom = rcu_dereference(cpu1_dom)
>>                                             //dom != NULL,
>>                                             nohz_kick_needed {
>>
>>                                                 set_idle(CPU 1)
>>                                                 dom = rcu_dereference(cpu1_dom)
>>
>>                                                 //dec nr_cpus_busy,
>>                                             }
>>
>> Vincent
>
> Ok but CPU 0 can assign NULL to the domain of cpu1 while CPU 1 is
> already in the middle of nohz_kick_needed().

Yes, nothing prevents the sequence below from occurring:

           = CPU 0 =                         = CPU 1 =

                                            dom = rcu_dereference(cpu1_dom)
                                            //dom != NULL
detach_and_destroy_domain {
      rcu_assign_pointer(cpu1_dom, NULL);
}

dom = new_domain(...) {
      nr_cpus_busy = 0;
      //nr_cpus_busy in the new_dom
      set_idle(CPU 1);
}
                                            nohz_kick_needed {
                                                clear_idle(CPU 1)
                                                dom = rcu_dereference(cpu1_dom)
                                                //cpu1_dom == old_dom
                                                inc nr_cpus_busy,
                                                //nr_cpus_busy in the old_dom
                                            }

rcu_assign_pointer(cpu1_dom, dom);
//cpu1_dom == new_dom

I'm not sure that this can happen in practice, because CPU 1 is in an
interrupt handler, but we don't have any mechanism to prevent the
sequence.

The NULL sched_domain can be used to detect this situation, and the
set_cpu_sd_state_busy function can be modified as below:

 inline void set_cpu_sd_state_busy(void)
 {
        struct sched_domain *sd;
        int cpu = smp_processor_id();
+       int clear = 0;

        if (!test_bit(NOHZ_IDLE, nohz_flags(cpu)))
                return;
-       clear_bit(NOHZ_IDLE, nohz_flags(cpu));

        rcu_read_lock();
        for_each_domain(cpu, sd) {
                atomic_inc(&sd->groups->sgp->nr_busy_cpus);
+               clear = 1;
        }
        rcu_read_unlock();
+
+       if (likely(clear))
+               clear_bit(NOHZ_IDLE, nohz_flags(cpu));
 }

The NOHZ_IDLE flag will not be cleared if a NULL sched_domain is
attached to the CPU. With this implementation, we still don't need to
get the sched_domain in order to test the NOHZ_IDLE flag, which happens
each time the CPU becomes idle.

Patch 2 becomes unnecessary.

Vincent
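
For illustration, here is a minimal single-threaded C model of the flag/counter mismatch discussed above. It is only a sketch under stated assumptions: struct model_domain, struct model_cpu, set_busy_before, set_busy_after, model_consistent and model_demo are names made up for this example, not kernel code, and RCU, atomics and real concurrency are not modelled. It replays a single interleaving: the CPU goes busy while its sched_domain is NULL, and the new domain (with a zeroed counter) is attached afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched_domain; only the busy counter is modelled. */
struct model_domain {
    int nr_busy_cpus;
};

/* Hypothetical per-CPU state: the NOHZ_IDLE flag and the attached domain. */
struct model_cpu {
    bool nohz_idle;
    struct model_domain *dom;   /* NULL while the domains are being rebuilt */
};

/* Ordering before the change: the flag is cleared before the domain walk. */
static void set_busy_before(struct model_cpu *cpu)
{
    if (!cpu->nohz_idle)
        return;
    cpu->nohz_idle = false;
    if (cpu->dom)
        cpu->dom->nr_busy_cpus++;
}

/* Ordering after the change: the flag is cleared only if a domain was counted. */
static void set_busy_after(struct model_cpu *cpu)
{
    bool clear = false;

    if (!cpu->nohz_idle)
        return;
    if (cpu->dom) {
        cpu->dom->nr_busy_cpus++;
        clear = true;
    }
    if (clear)
        cpu->nohz_idle = false;
}

/* Flag and counter agree when a busy CPU is counted once and an idle CPU is not. */
static bool model_consistent(const struct model_cpu *cpu)
{
    int expected = cpu->nohz_idle ? 0 : 1;

    return cpu->dom != NULL && cpu->dom->nr_busy_cpus == expected;
}

/* Replays: CPU goes busy while its domain is NULL, then the new domain is attached. */
static int model_demo(void)
{
    struct model_domain new_dom_a = { 0 }, new_dom_b = { 0 };
    struct model_cpu before = { true, NULL }, after = { true, NULL };

    set_busy_before(&before);       /* flag cleared, nothing counted */
    before.dom = &new_dom_a;
    assert(!model_consistent(&before));

    set_busy_after(&after);         /* flag kept, nothing counted */
    after.dom = &new_dom_b;
    assert(model_consistent(&after));

    set_busy_after(&after);         /* next call counts the CPU in the new domain */
    assert(model_consistent(&after) && new_dom_b.nr_busy_cpus == 1);
    return 0;
}
```

Calling model_demo() from a small main() should return 0 with every assertion holding. In this model, the only point is that deferring the flag clear until a domain has actually been counted keeps the flag and the new domain's counter in step.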