On 08/08, Linus Torvalds wrote:
>
> On Thu, Aug 8, 2013 at 12:17 PM, Oleg Nesterov <o...@redhat.com> wrote:
> >
> >> and as far as I can tell we have proper barriers for those (the
> >> scheduler gets the rq lock
> >
> > Yes, but... ttwu() takes another lock, ->pi_lock to test ->state.
>
> The lock is different, but for task_state, the main thing we need to
> worry about is memory ordering, not locks.

Yes sure. However, afaics in this particular case the locking does matter.
Because:

> The case of signals is special, in that the "wakeup criteria" is
> inside the scheduler itself, but conceptually the rule is the same.

yes, and because the waiter lacks mb().

IOW. The code like

	__set_current_state(STATE);
	if (!CONDITION)
		schedule();

is obviously racy, it doesn't have mb(). But the code like

	__set_current_state(TASK_INTERRUPTIBLE);
	schedule();

was always considered as correct, it relies on try_to_wake_up/schedule
interaction. But after try_to_wake_up() was changed to use task->pi_lock
this becomes racy in theory. Afaics.

This __set_current_state(TASK_INTERRUPTIBLE) can leak into the critical
section protected by rq->lock, it can be reordered with the CONDITION
check, and in this case CONDITION == signal_pending(). No?

> > we don't
> > have mb() on the other side and schedule() can miss SIGPENDING?
>
> But we do have the mb, at least on x86. The "set_tsk_thread_flag()" is
> a memory barrier there.

Sorry for confusion, I meant "other side", see above.

> But that's why I suggested adding a
> smp_mb__after_clear_bit() to after setting the bit,

Agreed. Or, once again, we can change try_to_wake_up() to do mb() rather
than wmb(). And compared to the theoretical race above this looks more
likely to me (although still unlikely).

But probably we should start with another debugging patch, I'll send it
in a minute.

Oleg.
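
To make the ordering requirement concrete, here is a user-space sketch of the
same pattern using C11 atomics. It is only an analogy, not kernel code: the
names task_state, sigpending, waiter_should_sleep() and waker_signal() are
made up for illustration, and atomic_thread_fence(memory_order_seq_cst)
stands in for mb(). The point it tries to show is that each side stores its
own variable, does a full barrier, and only then loads the other side's
variable, so at least one of the two must observe the other's store.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for task->state and the SIGPENDING flag. */
enum { RUNNING = 0, INTERRUPTIBLE = 1 };

static atomic_int task_state = RUNNING;
static atomic_bool sigpending = false;

/*
 * Waiter side: publish the sleeping state, full barrier, then check the
 * condition. Returns true if the caller should go to sleep.
 */
static bool waiter_should_sleep(void)
{
	atomic_store_explicit(&task_state, INTERRUPTIBLE, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* plays the role of mb() */

	if (atomic_load_explicit(&sigpending, memory_order_relaxed)) {
		/* Condition already true: do not sleep. */
		atomic_store_explicit(&task_state, RUNNING, memory_order_relaxed);
		return false;
	}
	return true;
}

/*
 * Waker side: publish the condition, full barrier, then check the state.
 * Returns true if it found a sleeper and woke it.
 */
static bool waker_signal(void)
{
	atomic_store_explicit(&sigpending, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* mb(), not just wmb() */

	if (atomic_load_explicit(&task_state, memory_order_relaxed) != RUNNING) {
		atomic_store_explicit(&task_state, RUNNING, memory_order_relaxed);
		return true;
	}
	return false;
}
```

If either fence is weakened to a store-only barrier, the load that follows
it may be satisfied before the preceding store is visible, and both sides
can miss each other. That is the lost-wakeup case discussed above.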