On Thu, Jun 21, 2007 at 10:39:31AM +0200, Ingo Molnar wrote:
> 
> * Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> 
> > BTW, I've looked a bit at these NMI watchdog traces, and now I'm not 
> > even sure it's necessarily the spinlock's problem (but I don't exclude 
> > this possibility yet). It seems both processors use task_rq_lock(), so 
> > there could also be a problem with that loop. The way the correctness 
> > of the taken lock is verified is racy: there is a small probability 
> > that if we have taken the wrong lock the check inside the loop is done 
> > just before the value is being changed elsewhere under the right 
> > lock. Another possible problem could be a result of some wrong 
> > optimization or wrong propagation of change of this task_rq(p) value.
> 
> ok, could you elaborate this in a bit more detail? You say it's racy - 
> any correctness bug in task_rq_lock() will cause the kernel to blow up 
> in spectacular ways. It's a fairly straightforward loop:
> 
> static inline struct rq *__task_rq_lock(struct task_struct *p)
> 	__acquires(rq->lock)
> {
> 	struct rq *rq;
> 
> repeat_lock_task:
> 	rq = task_rq(p);
> 	spin_lock(&rq->lock);
> 	if (unlikely(rq != task_rq(p))) {
> 		spin_unlock(&rq->lock);
> 		goto repeat_lock_task;
> 	}
> 	return rq;
> }
> 
> the result of task_rq() depends on p->thread_info->cpu which will only 
> change if a task has migrated over to another CPU. That is a 
> fundamentally 'slow' operation, but even if a task does it intentionally 
> in a high frequency way (for example via repeated calls to 
> sched_setaffinity) there's no way it could be faster than the spinlock 
> code here. So ... what problems can you see with it?
OK, you are right - I withdraw this "idea". Sorry!

Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/