On Wed, Oct 31, 2018 at 04:55:26PM -0500, Richard Henderson wrote: > On 10/31/18 5:51 PM, Will Deacon wrote: > > Aha, maybe this is the problem. An acquire fence on AArch64 is implemented > > using a DMB LD instruction, which orders prior reads against subsequent > > reads and writes. However, the architecture says: > > > > | The ST<OP> instructions, and LD<OP> instructions where the destination > > | register is WZR or XZR, are not regarded as doing a read for the purpose > > | of a DMB LD barrier. > > > > and so therefore an ST atomic is not affected by a subsequent acquire fence, > > whereas an LD atomic is. > > > > Does that help at all? > > Yes it does, thanks. Lest this come up again, let's document this.
Good idea. OK. James > >From 804f690fc8ebaa436b97ea4c9fef830f9cd2b873 Mon Sep 17 00:00:00 2001 > From: Richard Henderson <richard.hender...@linaro.org> > Date: Wed, 19 Sep 2018 22:18:09 +0000 > Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch > > * config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): > The scratch register need not be early-clobber. Document the reason > why we cannot use ST<OP>. > --- > gcc/config/aarch64/atomics.md | 14 +++++++++++++- > 1 file changed, 13 insertions(+), 1 deletion(-) > > diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md > index 2198649b1be..8944b5690b5 100644 > --- a/gcc/config/aarch64/atomics.md > +++ b/gcc/config/aarch64/atomics.md > @@ -263,6 +263,18 @@ > } > ) > > +;; It is tempting to want to use ST<OP> here for relaxed and release > +;; memory models here. However, that is incompatible with the C++ > +;; memory model for the following case: > +;; > +;; atomic_fetch_add(ptr, 1, memory_order_relaxed); > +;; atomic_thread_fence(memory_order_acquire); > +;; > +;; The problem is that the architecture says that ST<OP> (and LD<OP> > +;; insns where the destination is XZR) are not regarded as a read. > +;; However we also implement the acquire memory barrier with DMB LD, > +;; and so the ST<OP> is not blocked by the barrier. > + > (define_insn "aarch64_atomic_<atomic_ldoptab><mode>_lse" > [(set (match_operand:ALLI 0 "aarch64_sync_memory_operand" "+Q") > (unspec_volatile:ALLI > @@ -270,7 +282,7 @@ > (match_operand:ALLI 1 "register_operand" "r") > (match_operand:SI 2 "const_int_operand")] > ATOMIC_LDOP)) > - (clobber (match_scratch:ALLI 3 "=&r"))] > + (clobber (match_scratch:ALLI 3 "=r"))] > "TARGET_LSE" > { > enum memmodel model = memmodel_from_int (INTVAL (operands[2])); > -- > 2.17.2 >