On Thu, 2008-01-31 at 00:42 +0100, Rafael J. Wysocki wrote:
> Update.
>
> On Wednesday, 30 of January 2008, Rafael J. Wysocki wrote:
> > Hi,
> >
> > Recently I've been observing problems with unmounting the /home fs on
> > reboot and/or shutdown on two test boxes.
> >
> > After some more investigation I've found that this is due to some KDE
> > processes stuck in D states after their owner has logged out.
> >
> > This happens 100% of the time if there's a suspend/resume cycle before
> > the user logs out (ie. the user logs into KDE, works for some time,
> > suspends the box to RAM and resumes one or more times and then logs
> > out). Still, I also observe the symptoms on a box that's never
> > suspended.
> >
> > I'm not sure how to debug this, so please advise.
>
> After reverting:
>
> commit 37bb6cb4097e29ffee970065b74499cbf10603a3
> Author: Peter Zijlstra <[EMAIL PROTECTED]>
> Date:   Fri Jan 25 21:08:32 2008 +0100
>
>     hrtimer: unlock hrtimer_wakeup
>
> I no longer get processes in the D state, but there still is a problem
> with artswrapper (this is an openSUSE 10.3 system, x86-64). Namely,
> after a suspend/resume cycle and logging out/logging in the user,
> artswrapper gets stuck somewhere, apparently in the running (R) state.
> For this reason it blocks any subsequent attempts to suspend.
>
> Here's the relevant trace (from show_state()):
>
> [  522.474919] artswrapper   R  running task        0  4805      1
> [  522.474922]  ffff810074cd1f70 0000000000000082 0000000000000296 ffff810074cd1ed8
> [  522.474926]  ffffffff80311769 ffff810074cd1f20 ffffffff80701240 ffffffff80701240
> [  522.474930]  ffffffff80701240 ffffffff80701240 ffffffff80701240 ffffffff80701240
> [  522.474933] Call Trace:
> [  522.474940]  [<ffffffff80311769>] ? __up_read+0x8f/0x97
> [  522.474963]  [<ffffffff8020c5cf>] retint_careful+0xd/0x21
>
> where, according to gdb,
>
> (gdb) l *__up_read+0x8f
> 0xffffffff80311769 is in __up_read (/home/rafael/src/linux-2.6/lib/rwsem-spinlock.c:273).
> 268
> 269		if (--sem->activity == 0 && !list_empty(&sem->wait_list))
> 270			sem = __rwsem_wake_one_writer(sem);
> 271
> 272		spin_unlock_irqrestore(&sem->wait_lock, flags);
> 273	}
> 274
> 275	/*
> 276	 * release a write lock on the semaphore
> 277	 */
>
> What gives?

Well, let me give you something that worked for Guillaume :-)

---
Subject: hrtimer: fix hrtimer_init_sleeper() users

commit 37bb6cb4097e29ffee970065b74499cbf10603a3
Author: Peter Zijlstra <[EMAIL PROTECTED]>
Date:   Fri Jan 25 21:08:32 2008 +0100

    hrtimer: unlock hrtimer_wakeup

Broke hrtimer_init_sleeper() users. It forgot to fix up the futex
caller of this function to detect the failed queueing and messed up
the do_nanosleep() caller in that it could leak a TASK_INTERRUPTIBLE
state.

Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
---
 kernel/futex.c   |    2 ++
 kernel/hrtimer.c |    2 ++
 2 files changed, 4 insertions(+)

Index: linux-2.6/kernel/hrtimer.c
===================================================================
--- linux-2.6.orig/kernel/hrtimer.c
+++ linux-2.6/kernel/hrtimer.c
@@ -1312,6 +1312,8 @@ static int __sched do_nanosleep(struct h
 
 	} while (t->task && !signal_pending(current));
 
+	__set_current_state(TASK_RUNNING);
+
 	return t->task == NULL;
 }
 
Index: linux-2.6/kernel/futex.c
===================================================================
--- linux-2.6.orig/kernel/futex.c
+++ linux-2.6/kernel/futex.c
@@ -1252,6 +1252,8 @@ static int futex_wait(u32 __user *uaddr,
 			t.timer.expires = *abs_time;
 
 			hrtimer_start(&t.timer, t.timer.expires, HRTIMER_MODE_ABS);
+			if (!hrtimer_active(&t.timer))
+				t.task = NULL;
 
 			/*
 			 * the timer could have already expired, in which