On 01/09, Srivatsa Vaddagiri wrote:
>
> On Tue, Jan 09, 2007 at 06:07:55PM +0300, Oleg Nesterov wrote:
> > but at some point we should thaw processes, including cwq->thread which
> > should die.
>
> I am presuming we will thaw processes after all CPU_DEAD handlers have
> run.
>
> > So we are doing things like take_over_work() and this is the
> > source of races, because the dead CPU is not on cpu_online_map.
> >
> > flush_workqueue() doesn't use any locks now. If we use freezer to implement
> > cpu-hotplug nothing will change, we still have races.
>
> We have races -if- CPU_DEAD handling can run concurrently with an ongoing
> flush_workqueue. From my recent understanding of process freezer, this
> is not possible. In other words, flush_workqueue() can be its old
> implementation as below w/o any races:
>
> some_thread:
>
> 	for_each_online_cpu(i)
> 		flush_cpu_workqueue(i);
>
> As long as this loop is running, cpu_down/up will not proceed. This means,
> cpu_online_map is stable even if flush_cpu_workqueue blocks ..
>
> Once this loop is complete and all threads have called try_to_freeze,
> cpu_down will proceed to change the bit map and run CPU_DEAD handlers
> of everyone. I am presuming we will thaw processes only after all
> CPU_DEAD/ONLINE handlers have run.

We can't do this. We should thaw cwq->thread (which was bound to the dead
CPU) to complete CPU_DEAD event. So we still need some changes.

Oleg.
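
The ordering problem raised in the reply can be sketched as a small userspace model. This is not kernel code. Every name ending in _model is hypothetical and invented for illustration, and the model assumes (as the reply states) that a frozen cwq->thread cannot exit until it is thawed. It also shows why a walk over online CPUs never sees work left on the dead CPU until something like take_over_work() moves it.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4	/* hypothetical: size of the toy machine */

/* Hypothetical stand-in for the state of cwq->thread. */
enum thread_state_model { THREAD_RUNNING, THREAD_FROZEN, THREAD_EXITED };

struct cwq_model {
	int pending;			/* work items still queued on this CPU */
	enum thread_state_model state;	/* state of the per-CPU worker thread */
};

struct wq_model {
	unsigned int online_map;	/* bit i set => CPU i is online */
	struct cwq_model cwq[NR_CPUS_MODEL];
};

/* Start with every CPU online and every worker sitting in the freezer. */
static inline void wq_model_init(struct wq_model *wq)
{
	wq->online_map = (1u << NR_CPUS_MODEL) - 1u;
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		wq->cwq[cpu].pending = 0;
		wq->cwq[cpu].state = THREAD_FROZEN;
	}
}

static inline bool cpu_online_model(const struct wq_model *wq, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	return (wq->online_map >> cpu) & 1u;
}

/* Thaw one worker; a thread that already exited stays exited. */
static inline void thaw_thread_model(struct wq_model *wq, int cpu)
{
	if (wq->cwq[cpu].state == THREAD_FROZEN)
		wq->cwq[cpu].state = THREAD_RUNNING;
}

/*
 * Model of the CPU_DEAD step: drop the CPU from the map, then ask its
 * worker to exit. Returns true only if the worker could actually exit,
 * which (by assumption) requires that it is not frozen.
 */
static inline bool cpu_dead_model(struct wq_model *wq, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	wq->online_map &= ~(1u << cpu);
	if (wq->cwq[cpu].state == THREAD_FROZEN)
		return false;
	wq->cwq[cpu].state = THREAD_EXITED;
	return true;
}

/* Work that a for_each_online_cpu() style walk would never visit. */
static inline int pending_on_offline_cpus_model(const struct wq_model *wq)
{
	int missed = 0;

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if (!cpu_online_model(wq, cpu))
			missed += wq->cwq[cpu].pending;
	return missed;
}

/* Model of take_over_work(): move the dead CPU's queue to a live CPU. */
static inline void take_over_work_model(struct wq_model *wq, int dead, int target)
{
	wq->cwq[target].pending += wq->cwq[dead].pending;
	wq->cwq[dead].pending = 0;
}
```

In this model, calling cpu_dead_model() on a CPU whose worker is still frozen returns false, and any work queued there is counted by pending_on_offline_cpus_model(). Only after thaw_thread_model() does cpu_dead_model() succeed, which is the ordering constraint the reply points out.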