On 01/16, Srivatsa Vaddagiri wrote: > > On Mon, Jan 15, 2007 at 07:55:16PM +0300, Oleg Nesterov wrote: > > > > As I said already __create_workqueue() needs a fix, schedule_on_each_cpu() > > is already broken, and should be fixed as well. > > __create_workqueue() creates worker threads for all online CPUs > currently. Accessing the online_map could be racy unless we > serialize the access with hotplug event (thr' a mutex like workqueue > mutex held between LOCK_ACQ/LOCK_RELEASE messages or process freezer) > OR take special measures as was done in flush_workqueue. How were > you considering to deal with that raciness?
This is easy to fix in multiple ways. For example (this is not the best approach), we can make cpu_creation_map. Like cpu_populate_map, but CPU is cleared on CPU_DEAD/CPU_UP_CANCELED. I'll continue when I have a time. > > > What abt stopping that thread in CPU_DOWN_PREPARE (before freezing > > > processes)? I understand that it may add to the latency, but compared to > > > the overall latency of process freezer, I suspect it may not be much. > > I meant issuing kthread_stop() in DOWN_PREPARE so that worker > thread exits itself (much before CPU is actually brought down). Deadlock if work_struct re-queues itself. > Even if there are problems with it, how abt something like below: > > workqueue_cpu_callback() > { > > CPU_DEAD: > /* threads are still frozen at this point */ > take_over_work(); No, we can't move a currently executing work. This will break flush_workxxx(). > kthread_mark_stop(worker_thread); > break; > > CPU_CLEAN_THREADS: > /* all threads resumed by now */ > kthread_stop(worker_thread); /* task_struct ref required? */ yes, required, > kthread_mark_stop() will mark somewhere in task_struct that the thread > should exit when it comes out of refrigerator. > > worker_thread() > { > > while (!kthread_should_stop()) { > if (cwq->freezeable) > try_to_freeze(); > > if (kthread_marked_stop(current)) > break; Racy. Because of kthread_stop() above we should clear cwq->thread somehow. But we can't do this: this workqueue may be already destroyed. > The advantage I see above is that, when take_over_work() is running, we wont > race with functions like flush_workqueue() (threads still frozen at that > point) and hence we avoid hacks like migrate_sequence. See above. Please note that the code I posted does something like kthread_mark_stop(), but it operates on cwq, not on task_struct, this makes a big difference. And it doesn't need take_over_work() at all. And it doesn't need additional complications. Instead, it lessens both the source and compiled code. Note that I am not against using freezer for cpu-hotplug. My point is that this can't help much for the current workqueue.c, and it will completely transparent with the change I suggest. Oleg. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/