On Thu, Oct 18, 2007 at 08:26:45PM -0700, Linus Torvalds wrote:
>
> On Thu, 18 Oct 2007, Linus Torvalds wrote:
> >
> > I *think* it should work with something like
> >
> >         for (;;) {
> >                 smp_rmb();
> >                 if (!spin_is_locked(&desc->lock)) {
> >                         smp_rmb();
> >                         if (!(desc->status & IRQ_INPROGRESS))
> >                                 break;
> >                 }
> >                 cpu_relax();
> >         }
>
> I'm starting to doubt this.
>
> One of the issues is that we still need the smp_mb() in front of the loop
> (because we want to serialize the loop with any writes in the caller).

Right.  I think if we accept the definition of a spin lock/unlock that
Nick outlined earlier, then we can see that on the IRQ path there simply
isn't a memory barrier between the changing of the status field and the
execution of the action:

        spin_lock
        set IRQ_INPROGRESS
        spin_unlock
        action
        spin_lock
        clear IRQ_INPROGRESS
        spin_unlock

What may happen is that action can either float upwards to give

        spin_lock
        action
        set IRQ_INPROGRESS
        spin_unlock
        spin_lock
        clear IRQ_INPROGRESS
        spin_unlock

or it can float downwards to give

        spin_lock
        set IRQ_INPROGRESS
        spin_unlock
        spin_lock
        clear IRQ_INPROGRESS
        action
        spin_unlock

As a result we can add as many barriers as we want on the slow
(synchronize_irq) path and it just won't do any good (not unless we add
some barriers on the IRQ path == fast path).

What we do have, though, is the fact that in some ways action behaves
as if it's inside the spin lock even though it's not.  In particular,
action cannot float up past the first spin lock, nor can it float down
past the last spin unlock.
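
To make that concrete, here is a rough userspace analogue of the two
paths, using pthreads and C11 atomics.  Every name in it (irq_path,
waiter, in_progress, work, and the mutex standing in for desc->lock)
is invented for illustration; it is only a sketch of the reordering
being described, not kernel code.

        #include <pthread.h>
        #include <stdatomic.h>
        #include <stdio.h>

        /* Made-up names throughout: in_progress stands in for
         * IRQ_INPROGRESS, work for the data touched by action(),
         * and the mutex for desc->lock. */
        static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
        static atomic_uint in_progress;
        static atomic_int work;

        /* The IRQ path: flag updates under the lock, "action" outside it. */
        static void *irq_path(void *unused)
        {
                (void)unused;

                pthread_mutex_lock(&lock);
                atomic_store_explicit(&in_progress, 1, memory_order_relaxed);
                pthread_mutex_unlock(&lock);

                /* "action": free to float down into the next critical
                 * section and be reordered after the store clearing
                 * in_progress.  It cannot, however, float past the
                 * final unlock, which is a release operation. */
                atomic_store_explicit(&work, 42, memory_order_relaxed);

                pthread_mutex_lock(&lock);
                atomic_store_explicit(&in_progress, 0, memory_order_relaxed);
                pthread_mutex_unlock(&lock);
                return NULL;
        }

        /* The synchronize_irq-style waiter, spinning without the lock. */
        static void *waiter(void *unused)
        {
                (void)unused;

                /* Barriers here only order this thread's own loads; they
                 * cannot restore the missing ordering on the writer's
                 * side, so the loop may exit while "action" is pending. */
                while (atomic_load_explicit(&in_progress, memory_order_acquire))
                        ;       /* cpu_relax() equivalent */

                /* Not guaranteed to see 42. */
                printf("work = %d\n",
                       atomic_load_explicit(&work, memory_order_relaxed));
                return NULL;
        }

        int main(void)
        {
                pthread_t a, b;

                pthread_create(&a, NULL, irq_path, NULL);
                pthread_create(&b, NULL, waiter, NULL);
                pthread_join(a, NULL);
                pthread_join(b, NULL);
                return 0;
        }

In practice this usually prints 42, but nothing forbids a stale value on
a weakly ordered machine: the fix has to involve the writer's side or
the lock itself, not more barriers in the waiter.
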
> See commit fa490cfd15d7ce0900097cc4e60cfd7a76381138 and ponder.  Maybe we
> should take the same approach here, and do something like
>
> repeat:
>         /* Optimistic, no-locking loop */
>         while (desc->status & IRQ_INPROGRESS)
>                 cpu_relax();
>
>         /* Ok, that indicated we're done: double-check carefully */
>         spin_lock_irqsave(&desc->lock, flags);
>         status = desc->status;
>         spin_unlock_irqrestore(&desc->lock, flags);
>
>         /* Oops, that failed? */
>         if (status & IRQ_INPROGRESS)
>                 goto repeat;

That's why I think this patch is in fact the only one that solves all
the races in this thread.  The case that it solves, which the lock/unlock
patch does not, is the one where action flows downwards past the clearing
of IRQ_INPROGRESS.  I missed this case earlier.

In fact this bug exists elsewhere too.  For example, the network stack
does this in net/sched/sch_generic.c:

        /* Wait for outstanding qdisc_run calls. */
        while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
                yield();

This has the same problem as the current synchronize_irq code.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
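
For reference, here is a compilable userspace rendering of the
wait-then-recheck loop quoted above.  struct fake_irq_desc, the pthread
mutex, and the flag value are stand-ins invented for this sketch; it is
not the code that was actually merged.

        #include <pthread.h>
        #include <stdatomic.h>

        #define IRQ_INPROGRESS 0x100    /* flag value chosen for the sketch */

        /* Minimal stand-in for struct irq_desc: just the lock and status. */
        struct fake_irq_desc {
                pthread_mutex_t lock;
                atomic_uint status;
        };

        /* "Spin without the lock, then double-check under the lock". */
        static void fake_synchronize_irq(struct fake_irq_desc *desc)
        {
                unsigned int status;

                do {
                        /* Optimistic, lock-free wait for the flag to clear. */
                        while (atomic_load(&desc->status) & IRQ_INPROGRESS)
                                ;       /* cpu_relax() in the kernel */

                        /* Double-check under the lock.  If the flag is seen
                         * clear while holding desc->lock, the handler's final
                         * unlock has already happened, and action cannot have
                         * floated past that unlock, so it must be finished. */
                        pthread_mutex_lock(&desc->lock);
                        status = atomic_load(&desc->status);
                        pthread_mutex_unlock(&desc->lock);
                } while (status & IRQ_INPROGRESS);
        }

The lock/unlock pair is what supplies the ordering that no amount of
read barriers in the waiter alone could provide.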