On 10/31/18 5:51 PM, Will Deacon wrote:
> Aha, maybe this is the problem. An acquire fence on AArch64 is implemented
> using a DMB LD instruction, which orders prior reads against subsequent
> reads and writes. However, the architecture says:
> 
>   | The ST<OP> instructions, and LD<OP> instructions where the destination
>   | register is WZR or XZR, are not regarded as doing a read for the purpose
>   | of a DMB LD barrier.
> 
> and so therefore an ST atomic is not affected by a subsequent acquire fence,
> whereas an LD atomic is.
> 
> Does that help at all?

Yes it does, thanks.  Lest this come up again, let's document this.


r~
>From 804f690fc8ebaa436b97ea4c9fef830f9cd2b873 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.hender...@linaro.org>
Date: Wed, 19 Sep 2018 22:18:09 +0000
Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch

	* config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
	The scratch register need not be early-clobber.  Document the reason
	why we cannot use ST<OP>.
---
 gcc/config/aarch64/atomics.md | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 2198649b1be..8944b5690b5 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -263,6 +263,18 @@
   }
 )
 
+;; It is tempting to want to use ST<OP> here for relaxed and release
+;; memory models here.  However, that is incompatible with the C++
+;; memory model for the following case:
+;;
+;;	atomic_fetch_add(ptr, 1, memory_order_relaxed);
+;;	atomic_thread_fence(memory_order_acquire);
+;;
+;; The problem is that the architecture says that ST<OP> (and LD<OP>
+;; insns where the destination is XZR) are not regarded as a read.
+;; However we also implement the acquire memory barrier with DMB LD,
+;; and so the ST<OP> is not blocked by the barrier.
+
 (define_insn "aarch64_atomic_<atomic_ldoptab><mode>_lse"
   [(set (match_operand:ALLI 0 "aarch64_sync_memory_operand" "+Q")
 	(unspec_volatile:ALLI
@@ -270,7 +282,7 @@
 	   (match_operand:ALLI 1 "register_operand" "r")
 	   (match_operand:SI 2 "const_int_operand")]
       ATOMIC_LDOP))
-   (clobber (match_scratch:ALLI 3 "=&r"))]
+   (clobber (match_scratch:ALLI 3 "=r"))]
   "TARGET_LSE"
   {
    enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
-- 
2.17.2

Reply via email to