>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Oleg Nesterov
>Sent: Monday, January 08, 2007 9:07 AM
>To: Srivatsa Vaddagiri
>Cc: Andrew Morton; David Howells; Christoph Hellwig; Ingo Molnar;
>    Linus Torvalds; linux-kernel@vger.kernel.org; Gautham shenoy
>Subject: Re: [PATCH] fix-flush_workqueue-vs-cpu_dead-race-update
>
>On 01/08, Srivatsa Vaddagiri wrote:
>>
>> On Mon, Jan 08, 2007 at 06:56:38PM +0300, Oleg Nesterov wrote:
>> > > 2.
>> > >
>> > > CPU_DEAD->cleanup_workqueue_thread->(cwq->thread = NULL)->kthread_stop() ..
>> > >                                     ^^^^^^^^^^^^^^^^^^^^
>> > >                                     |___ Problematic
>> >
>> > Hmm... This should not be possible? cwq->thread != NULL on CPU_DEAD event.
>>
>> sure, cwq->thread != NULL at CPU_DEAD event. However
>> cleanup_workqueue_thread() will set it to NULL and block in
>> kthread_stop(), waiting for the kthread to finish run_workqueue and
>> exit.
>
>Ah, missed you point, thanks. Yet another old problem which was not
>introduced by recent changes. And yet another indication we should avoid
>kthread_stop() on CPU_DEAD event :) I believe this is easy to fix, but
>need to think more.

The current code in the workqueue-hotplug path is full of races. I
stumbled upon at least a couple of different deadlock situations being
discussed here, with the ondemand governor using a workqueue and trying
to flush it during CPU hot remove. Specifically, a three-way deadlock
involving kthread_stop() being called with workqueue_mutex held, while
the work itself is blocked on some other mutex held by another task
that is trying to flush the workqueue.

One other approach I was thinking about was to do all the hard work in
the workqueue CPU_DOWN_PREPARE callback rather than in CPU_DEAD. We can
call cleanup_workqueue_thread and take_over_work in DOWN_PREPARE. With
that, I don't think we need to hold the workqueue_mutex across these
two callbacks, which would eliminate the deadlocks related to
flush_workqueue.

Do you think this approach would simplify things around here?

Thanks,
Venki
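
To make the DOWN_PREPARE idea a bit more concrete, here is a rough userspace toy model of it. This is not kernel code and not a patch: every model_* name is made up for illustration, the mutex is just a flag, and taking it briefly around the work migration is my own assumption. The only point it tries to show is that the thread cleanup and the takeover of pending work both happen in the DOWN_PREPARE step, that the "thread stop" runs without the mutex held, and that nothing is left to do (and no lock is held) by the time DEAD arrives.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical
 * stand-ins for the workqueue/hotplug machinery discussed above. */
#define MODEL_NR_CPUS  4
#define MODEL_MAX_WORK 16

enum model_event { MODEL_DOWN_PREPARE, MODEL_DEAD };

struct model_cwq {
    int thread_alive;              /* stands in for cwq->thread != NULL */
    int pending[MODEL_MAX_WORK];   /* queued work ids, in FIFO order */
    size_t nr_pending;
};

struct model_wq {
    struct model_cwq cpu[MODEL_NR_CPUS];
    int mutex_held;                /* stands in for workqueue_mutex */
};

void model_init(struct model_wq *wq)
{
    for (int i = 0; i < MODEL_NR_CPUS; i++) {
        wq->cpu[i].thread_alive = 1;
        wq->cpu[i].nr_pending = 0;
    }
    wq->mutex_held = 0;
}

/* Queue a work id on a CPU; returns 0 on success, -1 if it cannot queue. */
int model_queue_work(struct model_wq *wq, int cpu, int id)
{
    struct model_cwq *cwq = &wq->cpu[cpu];

    if (!cwq->thread_alive || cwq->nr_pending == MODEL_MAX_WORK)
        return -1;
    cwq->pending[cwq->nr_pending++] = id;
    return 0;
}

/* Models cleanup_workqueue_thread(): the blocking "stop" must not run
 * with the mutex held, which is what the assert checks. */
void model_cleanup_thread(struct model_wq *wq, int cpu)
{
    assert(!wq->mutex_held);
    wq->cpu[cpu].thread_alive = 0;
}

/* Models take_over_work(): move pending work to a live CPU, keeping order. */
void model_take_over_work(struct model_wq *wq, int dying)
{
    struct model_cwq *src = &wq->cpu[dying];
    struct model_cwq *dst = NULL;

    for (int i = 0; i < MODEL_NR_CPUS; i++) {
        if (i != dying && wq->cpu[i].thread_alive) {
            dst = &wq->cpu[i];
            break;
        }
    }
    if (dst == NULL)
        return;                    /* no live CPU: leave work in place */

    for (size_t i = 0; i < src->nr_pending; i++) {
        assert(dst->nr_pending < MODEL_MAX_WORK);
        dst->pending[dst->nr_pending++] = src->pending[i];
    }
    src->nr_pending = 0;
}

/* All the hard work is done at DOWN_PREPARE; DEAD has nothing left to do,
 * and the mutex is never held across the two events. */
void model_hotplug_callback(struct model_wq *wq, enum model_event ev, int cpu)
{
    switch (ev) {
    case MODEL_DOWN_PREPARE:
        model_cleanup_thread(wq, cpu);   /* no mutex held here */
        wq->mutex_held = 1;              /* assumption: short critical section */
        model_take_over_work(wq, cpu);
        wq->mutex_held = 0;
        break;
    case MODEL_DEAD:
        break;
    }
    assert(!wq->mutex_held);
}
```

Whether the real callbacks can be reordered this way depends on details the model ignores, for example work being queued on the CPU between DOWN_PREPARE and the CPU actually going offline.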