On Mon, Jan 08, 2007 at 08:06:35PM +0300, Oleg Nesterov wrote:
> Ah, missed you point, thanks. Yet another old problem which was not introduced
> by recent changes. And yet another indication we should avoid kthread_stop()
> on CPU_DEAD event :) I believe this is easy to fix, but need to think more.

I think the problem is not just with CPU_DEAD. Anyone who calls
cleanup_workqueue_thread (say destroy_workqueue?) will see this race. 

Do you see any problems if cleanup_workqueue_thread is changed as:

cleanup_workqueue_thread()
{
        kthread_stop(p);
        spin_lock(cwq->lock);
        cwq->thread = NULL;
        spin_unlock(cwq->lock);
}


>       run_workqueue:
> 
>               while (!list_empty(&cwq->worklist)) {
>                       ...
>                       // We hold lock_cpu_hotplug(), cpu event can't make
>                       // progress.
>                       ...
>               }

Ok ..yes a cpu_event_waits_for_lock() helper will help here.

> > I agree it minimizes the interactions. Maybe worth attempting. However I
> > suspect it may not be as simple as it appears :)
> 
> Yes, that is why this patch only does the first step: flush_workqueue() checks
> the dead CPUs as well, this change is minimal.
> 
> Do you see any problems this patch adds?

I dont see as of now. I suspect we will know better when we implement
the patch to eliminate CPU_DEAD handling in workqueue.c

-- 
Regards,
vatsa
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to