On Tue, Mar 17, 2015 at 01:45:25AM +0000, Mathieu Desnoyers wrote:
> Let's go through a memory ordering scenario to highlight my reasoning
> there.
>
> Let's consider the following memory barrier scenario performed in
> user-space on an architecture with very relaxed ordering. PowerPC comes
> to mind.
>
> https://lwn.net/Articles/573436/
> scenario 12:
>
> CPU 0                   CPU 1
> CAO(x) = 1;             r3 = CAO(y);
> cmm_smp_wmb();          cmm_smp_rmb();
> CAO(y) = 1;             r4 = CAO(x);
>
> BUG_ON(r3 == 1 && r4 == 0)

WTF is CAO()? And what is with that ridiculous cmm_ prefix on the barriers?

> We tweak it to use sys_membarrier on CPU 1, and a simple compiler
> barrier() on CPU 0:
>
> CPU 0                   CPU 1
> CAO(x) = 1;             r3 = CAO(y);
> barrier();              sys_membarrier();
> CAO(y) = 1;             r4 = CAO(x);
>
> BUG_ON(r3 == 1 && r4 == 0)

That hardly seems like a valid substitution; barrier() is not a valid
replacement for a memory barrier, is it? Especially not on PPC.

> Now if CPU 1 executes sys_membarrier while CPU 0 is preempted after both
> stores, we have:
>
> CPU 0                           CPU 1
> CAO(x) = 1;
>   [1st store is slow to
>    reach other cores]
> CAO(y) = 1;
>   [2nd store reaches other
>    cores more quickly]
> [preempted]
>                                 r3 = CAO(y)
>                                   (may see y = 1)
>                                 sys_membarrier()
> Scheduler changes rq->curr.
>                                 skips CPU 0, because rq->curr has
>                                 been updated.
>                                 [return to userspace]
>                                 r4 = CAO(x)
>                                   (may see x = 0)
>                                 BUG_ON(r3 == 1 && r4 == 0) -> fails.
> load_cr3, with implied
> memory barrier, comes
> after CPU 1 has read "x".
>
> The only way to make this scenario work is if a memory barrier is added
> before updating rq->curr. (we could also do a similar scenario for the
> needed barrier after store to rq->curr).

Hmmm.. like that. Light begins to dawn.

So I think in this case we're good with the smp_mb__before_spinlock() we
have; but do note it's not a full MB even though the name says so.

It's basically WMB + ACQUIRE, which theoretically can leak a read in,
but nobody sane _delays_ reads; you want to speculate reads, not
postpone them.

Also, it lacks the transitive property.

> Would you see it as acceptable if we start by implementing
> only the non-expedited sys_membarrier() ?

Sure.

> Then we can add
> the expedited-private implementation after rq->curr becomes
> available through RCU.

Yeah, or not at all; I'm still trying to get Paul to remove the expedited
nonsense from the kernel RCU bits; and now you want it in userspace too :/
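
The ordering argument in the quoted scenario can be checked mechanically
with a toy model. What follows is a minimal sketch, not kernel or liburcu
code: it assumes a barrier simply forces program order on its own CPU, and
that a missing barrier lets that CPU's two accesses take effect in either
order. It then enumerates every interleaving of the four accesses and
reports whether r3 == 1 && r4 == 0 is reachable. All names in it
(enum op, interleaving_hits_bug, bug_reachable) are hypothetical and exist
only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* The four memory accesses of the litmus test (toy model only). */
enum op { STORE_X, STORE_Y, LOAD_Y, LOAD_X };

/*
 * Replay one interleaving. Bit i of mask set means slot i runs the next
 * CPU 0 access, otherwise the next CPU 1 access. Returns true when the
 * outcome r3 == 1 && r4 == 0 is observed.
 */
static bool interleaving_hits_bug(const enum op cpu0[2],
				  const enum op cpu1[2], unsigned mask)
{
	int x = 0, y = 0, r3 = 0, r4 = 0;
	int i0 = 0, i1 = 0;

	for (int slot = 0; slot < 4; slot++) {
		enum op o = ((mask >> slot) & 1u) ? cpu0[i0++] : cpu1[i1++];

		switch (o) {
		case STORE_X: x = 1; break;
		case STORE_Y: y = 1; break;
		case LOAD_Y:  r3 = y; break;
		case LOAD_X:  r4 = x; break;
		}
	}
	assert(i0 == 2 && i1 == 2);	/* each CPU ran both accesses */
	return r3 == 1 && r4 == 0;
}

/* Try all 6 ways to interleave two CPU 0 accesses with two CPU 1 accesses. */
static bool any_interleaving_hits_bug(const enum op cpu0[2],
				      const enum op cpu1[2])
{
	for (unsigned mask = 0; mask < 16; mask++) {
		int bits = (mask & 1) + ((mask >> 1) & 1) +
			   ((mask >> 2) & 1) + ((mask >> 3) & 1);

		if (bits == 2 && interleaving_hits_bug(cpu0, cpu1, mask))
			return true;
	}
	return false;
}

/*
 * cpuN_ordered == true models an effective barrier between that CPU's two
 * accesses; false additionally allows them to take effect in reverse order.
 */
bool bug_reachable(bool cpu0_ordered, bool cpu1_ordered)
{
	static const enum op st_po[2]  = { STORE_X, STORE_Y };
	static const enum op st_rev[2] = { STORE_Y, STORE_X };
	static const enum op ld_po[2]  = { LOAD_Y, LOAD_X };
	static const enum op ld_rev[2] = { LOAD_X, LOAD_Y };

	if (any_interleaving_hits_bug(st_po, ld_po))
		return true;
	if (!cpu0_ordered && any_interleaving_hits_bug(st_rev, ld_po))
		return true;
	if (!cpu1_ordered && any_interleaving_hits_bug(st_po, ld_rev))
		return true;
	if (!cpu0_ordered && !cpu1_ordered &&
	    any_interleaving_hits_bug(st_rev, ld_rev))
		return true;
	return false;
}
```

In this model bug_reachable(true, true) corresponds to the original
wmb/rmb pairing, and bug_reachable(false, true) to the case where
sys_membarrier() skips CPU 0 and nothing else orders its two stores. The
model deliberately ignores everything else about real hardware (store
buffers, transitivity, speculation), so it only illustrates why both sides
of the pairing are needed.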