On Wed, Oct 31, 2018 at 04:55:26PM -0500, Richard Henderson wrote:
> On 10/31/18 5:51 PM, Will Deacon wrote:
> > Aha, maybe this is the problem. An acquire fence on AArch64 is implemented
> > using a DMB LD instruction, which orders prior reads against subsequent
> > reads and writes. However, the architecture says:
> > 
> >   | The ST<OP> instructions, and LD<OP> instructions where the destination
> >   | register is WZR or XZR, are not regarded as doing a read for the purpose
> >   | of a DMB LD barrier.
> > 
> > and therefore an ST atomic is not affected by a subsequent acquire fence,
> > whereas an LD atomic is.
> > 
> > Does that help at all?
> 
> Yes it does, thanks.  Lest this come up again, let's document this.

Good idea. OK.
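
For the archive, here is a minimal C11 sketch of the kind of program the
new comment is guarding against (variable and function names are mine,
not from the patch).  The forbidden outcome depends on the acquire fence
ordering the read half of the discarded-result fetch_add, which a DMB LD
would not do if we had emitted ST<OP>:

  #include <stdatomic.h>

  atomic_int sync_flag;   /* both threads operate on this object */
  atomic_int payload;

  void
  writer (void)
  {
    atomic_store_explicit (&payload, 1, memory_order_relaxed);
    atomic_thread_fence (memory_order_release);
    atomic_store_explicit (&sync_flag, 1, memory_order_relaxed);
  }

  void
  reader (void)
  {
    /* Result unused, so ST<OP> (or LD<OP> into XZR) looks attractive...  */
    atomic_fetch_add_explicit (&sync_flag, 1, memory_order_relaxed);
    /* ...but this fence becomes DMB ISHLD, which does not treat an
       ST<OP> as a prior read to order against the load below.  */
    atomic_thread_fence (memory_order_acquire);
    int r = atomic_load_explicit (&payload, memory_order_relaxed);
    (void) r;
  }

If the fetch_add reads the value stored by writer() (observable afterwards
as sync_flag == 2), the release fence synchronizes with the acquire fence
and C11 requires r == 1.  Had we emitted STADD followed by DMB ISHLD, the
load of payload would not be ordered after the STADD's read, so the
forbidden result (sync_flag == 2 with r == 0) would be architecturally
allowed.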

James

> From 804f690fc8ebaa436b97ea4c9fef830f9cd2b873 Mon Sep 17 00:00:00 2001
> From: Richard Henderson <richard.hender...@linaro.org>
> Date: Wed, 19 Sep 2018 22:18:09 +0000
> Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch
> 
>       * config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
>       The scratch register need not be early-clobber.  Document the reason
>       why we cannot use ST<OP>.
> ---
>  gcc/config/aarch64/atomics.md | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
> index 2198649b1be..8944b5690b5 100644
> --- a/gcc/config/aarch64/atomics.md
> +++ b/gcc/config/aarch64/atomics.md
> @@ -263,6 +263,18 @@
>    }
>  )
>  
> +;; It is tempting to want to use ST<OP> for relaxed and release
> +;; memory models here.  However, that is incompatible with the C++
> +;; memory model for the following case:
> +;;
> +;;   atomic_fetch_add(ptr, 1, memory_order_relaxed);
> +;;   atomic_thread_fence(memory_order_acquire);
> +;;
> +;; The problem is that the architecture says that ST<OP> (and LD<OP>
> +;; insns where the destination is XZR) are not regarded as a read.
> +;; However we also implement the acquire memory barrier with DMB LD,
> +;; and so the ST<OP> is not blocked by the barrier.
> +
>  (define_insn "aarch64_atomic_<atomic_ldoptab><mode>_lse"
>    [(set (match_operand:ALLI 0 "aarch64_sync_memory_operand" "+Q")
>       (unspec_volatile:ALLI
> @@ -270,7 +282,7 @@
>          (match_operand:ALLI 1 "register_operand" "r")
>          (match_operand:SI 2 "const_int_operand")]
>        ATOMIC_LDOP))
> -   (clobber (match_scratch:ALLI 3 "=&r"))]
> +   (clobber (match_scratch:ALLI 3 "=r"))]
>    "TARGET_LSE"
>    {
>     enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
> -- 
> 2.17.2
> 
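
As a follow-up for anyone testing this, a hypothetical reduced case (not
part of the patch, name and flags are my own) that exercises the pattern:
built with -O2 and LSE enabled (e.g. -march=armv8.1-a), the RMW below
should still come out as LDADD with a real scratch destination rather
than STADD or LDADD into WZR, even though its result is unused:

  #include <stdatomic.h>

  void
  bump (atomic_int *p)
  {
    /* Relaxed, result discarded: the tempting-but-wrong lowering is
       STADD; the pattern keeps a scratch register (now without the
       early clobber) so LDADD is used and a later DMB LD still
       orders the read half of the operation.  */
    atomic_fetch_add_explicit (p, 1, memory_order_relaxed);
  }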
