> > > If this solves the problem on your box then i'll do a proper fix and
> > > introduce a cpu_relax_memory_change(*addr) type of API to around
> > > monitor/mwait. This patch boots fine on my T60 - but i never saw
> > > your problem.
> >
> > Yes, the patch does make the pauses go away. In fact just the
> > smp_mb() seems to suffice.
>
> cool! Could you send me the smallest patch you tried that still made the
> hangs go away?

My previous attempt was just commenting out parts of your patch. But
maybe it's more logical to move the barrier to immediately after the
unlock.

With this patch I can't reproduce the problem, which may not mean very
much, since it was rather a "fragile" bug anyway. But at least the fix
looks pretty harmless.

Thanks,
Miklos

Index: linux-2.6.22-rc4/kernel/sched.c
===================================================================
--- linux-2.6.22-rc4.orig/kernel/sched.c	2007-06-18 08:59:17.000000000 +0200
+++ linux-2.6.22-rc4/kernel/sched.c	2007-06-18 09:04:13.000000000 +0200
@@ -1168,6 +1168,11 @@ repeat:
 	/* If it's preempted, we yield.  It could be a while. */
 	preempted = !task_running(rq, p);
 	task_rq_unlock(rq, &flags);
+	/*
+	 * Without this barrier, wait_task_inactive() can starve
+	 * waiters of rq->lock (observed on Core2Duo)
+	 */
+	smp_mb();
 	cpu_relax();
 	if (preempted)
 		yield();
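
For readers who want to see the shape of the loop outside the kernel, here is a minimal userspace sketch in C11 of the pattern the patch touches: take a lock, sample a flag, drop the lock, then issue a full fence before polling again. It is only an analogy. The names fake_rq, fake_rq_lock(), fake_rq_unlock() and wait_inactive_sketch() are hypothetical, atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb(), and the max_polls bound is added so the sketch always terminates. Whether the fence actually lets other lock waiters make progress depends on the hardware, so the sketch shows where the barrier sits and does not demonstrate the fix.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a runqueue: a spinlock plus a flag. */
struct fake_rq {
	atomic_flag lock;	/* stand-in for rq->lock */
	atomic_bool running;	/* stand-in for task_running(rq, p) */
};

static void fake_rq_init(struct fake_rq *rq, bool running)
{
	atomic_flag_clear(&rq->lock);
	atomic_init(&rq->running, running);
}

static void fake_rq_lock(struct fake_rq *rq)
{
	/* Spin until the previous value of the flag was clear. */
	while (atomic_flag_test_and_set_explicit(&rq->lock,
						 memory_order_acquire))
		;
}

static void fake_rq_unlock(struct fake_rq *rq)
{
	atomic_flag_clear_explicit(&rq->lock, memory_order_release);
}

/*
 * Poll until the task is seen as not running, or until max_polls
 * samples have been taken. Returns the number of samples taken.
 */
static unsigned long wait_inactive_sketch(struct fake_rq *rq,
					  unsigned long max_polls)
{
	unsigned long polls = 0;

	while (polls < max_polls) {
		bool running;

		fake_rq_lock(rq);
		running = atomic_load_explicit(&rq->running,
					       memory_order_relaxed);
		fake_rq_unlock(rq);
		polls++;

		if (!running)
			break;

		/*
		 * Full fence placed right after the unlock, mirroring
		 * where the patch puts smp_mb().
		 */
		atomic_thread_fence(memory_order_seq_cst);
	}
	return polls;
}
```

In a real test, a second thread would clear the running flag and a third would contend for the lock; the interesting measurement is how long that third thread waits with and without the fence.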