On Mon, Jan 08, 2007 at 08:06:35PM +0300, Oleg Nesterov wrote:
> Ah, missed you point, thanks. Yet another old problem which was not introduced
> by recent changes. And yet another indication we should avoid kthread_stop()
> on CPU_DEAD event :) I believe this is easy to fix, but need to think more.

I think the problem is not just with CPU_DEAD. Anyone who calls
cleanup_workqueue_thread (say destroy_workqueue?) will see this race.

Do you see any problems if cleanup_workqueue_thread is changed as:

	cleanup_workqueue_thread()
	{
		kthread_stop(p);
		spin_lock(&cwq->lock);
		cwq->thread = NULL;
		spin_unlock(&cwq->lock);
	}

> run_workqueue:
>
> 	while (!list_empty(&cwq->worklist)) {
> 		...
> 		// We hold lock_cpu_hotplug(), cpu event can't make
> 		// progress.
> 		...
> 	}

Ok, yes, a cpu_event_waits_for_lock() helper will help here.

> > I agree it minimizes the interactions. Maybe worth attempting. However I
> > suspect it may not be as simple as it appears :)
>
> Yes, that is why this patch only does the first step: flush_workqueue() checks
> the dead CPUs as well, this change is minimal.
>
> Do you see any problems this patch adds?

I don't see any as of now. I suspect we will know better when we implement the
patch to eliminate CPU_DEAD handling in workqueue.c.

-- 
Regards,
vatsa