On Wed, Nov 15, 2017 at 09:03:07PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 15, 2017 at 02:15:19PM -0500, Alan Stern wrote:
> > On Wed, 15 Nov 2017, Will Deacon wrote:
> > 
> > > On Thu, Nov 02, 2017 at 04:21:56PM -0400, Alan Stern wrote:
> > > > I was trying to think of something completely different.  If you have a
> > > > release/acquire to the same address, it creates a happens-before
> > > > ordering:
> > > > 
> > > >         Access x
> > > >         Release a
> > > >         Acquire a
> > > >         Access y
> > > > 
> > > > Here the access to x happens-before the access to y.  This is true
> > > > even on x86, even in the presence of forwarding -- the CPU still has to
> > > > execute the instructions in order.  But if the release and acquire are
> > > > to different addresses:
> > > > 
> > > >         Access x
> > > >         Release a
> > > >         Acquire b
> > > >         Access y
> > > > 
> > > > then there is no happens-before ordering for x and y -- the CPU can
> > > > execute the last two instructions before the first two.  x86 and
> > > > PowerPC won't do this, but I believe ARMv8 can.  (Please correct me if
> > > > it can't.)
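
Rewriting Alan's same-address case with the kernel's primitives -- a
minimal sketch only, with x, y, a ordinary ints and r0 a local:

	int x, y, a, r0;

	WRITE_ONCE(x, 1);		/* Access x */
	smp_store_release(&a, 1);	/* Release a */
	r0 = smp_load_acquire(&a);	/* Acquire a */
	WRITE_ONCE(y, 1);		/* Access y */

Even if the acquire is satisfied by forwarding from the release, the
claim above is that the accesses to x and y stay ordered.
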
> > > 
> > > Release/Acquire are RCsc on ARMv8, so they are ordered irrespective of
> > > address.
> > 
> > Ah, okay, thanks.
> > 
> > In any case, we have considered removing this ordering constraint
> > (store-release followed by load-acquire for the same location) from the
> > Linux-kernel memory model.
> 
> Why? It's a perfectly sensible construct.
> 
> > I'm not aware of any code in the kernel that depends on it.  Do any of
> > you happen to know of any examples?
> 
> All locks? Something like:
> 
>       spin_lock(&x);
>       /* foo */
>       spin_unlock(&x);
>       spin_lock(&x);
>       /* bar */
>       spin_unlock(&x);
> 
> Comes with a fairly strong expectation that foo happens-before bar.
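
In Alan's notation, and as a sketch only (the exact lock-word accesses
depend on the spinlock implementation), the unlock/lock pair is the
same-address release/acquire shape:

	/* foo */		/* Access x */
	spin_unlock(&x);	/* Release a: release of the lock word */
	spin_lock(&x);		/* Acquire a: acquire of the same lock word */
	/* bar */		/* Access y */
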
> 
> And in specific things like:
> 
>   135e8c9250dd5
>   ecf7d01c229d1
> 
> which use the release of rq->lock paired with the next acquire of the
> same rq->lock to match with an smp_rmb().
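
A heavily simplified sketch of that shape -- hypothetical variables a
and b, not the actual code from those commits:

	static DEFINE_SPINLOCK(l);	/* stands in for rq->lock */
	static int a, b;

	/* CPU 0 */
	spin_lock(&l);
	WRITE_ONCE(a, 1);
	spin_unlock(&l);		/* release of l */
	spin_lock(&l);			/* next acquire of the same l */
	WRITE_ONCE(b, 1);
	spin_unlock(&l);

	/* CPU 1 */
	r0 = READ_ONCE(b);
	smp_rmb();
	r1 = READ_ONCE(a);

Here the UNLOCK+LOCK of the same lock is relied upon to order the two
stores, and the smp_rmb() orders the two loads, so the expectation is
that r0 == 1 && r1 == 0 is forbidden.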

Those cycles are currently forbidden by LKMM _when_ you consider the
smp_mb__after_spinlock() from schedule().  See rfi-rel-acq-is-not-mb
from my previous email and Alan's remarks about cumul-fence.
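
For reference, the acquisition in schedule() looks roughly like this
(a sketch, not the literal scheduler code):

	raw_spin_lock(&rq->lock);
	smp_mb__after_spinlock();	/* upgrades the acquire to a full barrier */

and it is this full barrier, not the release/acquire pair by itself,
that forbids the cycle.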

  Andrea
