On 07/15, Peter Zijlstra wrote:
> @@ -2211,13 +2211,15 @@ static void finish_task_switch(struct rq *rq, struct task_struct *prev)
>
>  	/*
>  	 * A task struct has one reference for the use as "current".
> +	 *
>  	 * If a task dies, then it sets TASK_DEAD in tsk->state and calls
> -	 * schedule one last time. The schedule call will never return, and
> -	 * the scheduled task must drop that reference.
> -	 * The test for TASK_DEAD must occur while the runqueue locks are
> -	 * still held, otherwise prev could be scheduled on another cpu, die
> -	 * there before we look at prev->state, and then the reference would
> -	 * be dropped twice.
> +	 * schedule one last time. The schedule call will never return, and the
> +	 * scheduled task must drop that reference.
> +	 *
> +	 * The test for TASK_DEAD must occur while the runqueue locks are still
> +	 * held, otherwise we can race with RUNNING -> DEAD transitions, and
> +	 * then the reference would be dropped twice.
> +	 *
>  	 * Manfred Spraul <manf...@colorfullife.com>
>  	 */
Agreed, this looks much more understandable!

And probably I missed something again, but it seems that this logic is broken with __ARCH_WANT_UNLOCKED_CTXSW. Of course, even if I am right this is purely theoretical, but smp_wmb() before "->on_cpu = 0" is not enough and we need a full barrier?

Oleg.