+ linaro-toolchain, as I don't understand the CI issues on patchwork.

On Wed, Sep 27, 2023 at 8:40 PM Wilco Dijkstra <wilco.dijks...@arm.com> wrote:
>
> Hi Ramana,
>
> > Hope this helps.
>
> Yes definitely!
>
> >> Passes regress/bootstrap, OK for commit?
> >
> > Target? armhf? --with-arch, --with-fpu, --with-float parameters?
> > Please be specific.
>
> I used --target=arm-none-linux-gnueabihf --host=arm-none-linux-gnueabihf
> --build=arm-none-linux-gnueabihf --with-float=hard. However it seems that
> the default armhf settings are incorrect. I shouldn't need --with-float=hard
> since that is obviously implied by armhf, and it should also imply armv7-a
> with vfpv3 according to the documentation. It seems to get confused and
> skips some tests. I tried using --with-fpu=auto, but that doesn't work at
> all, so in the end I forced it like: --with-arch=armv8-a
> --with-fpu=neon-fp-armv8. With this it runs a few more tests.

Yeah, that's a wart that I don't like. armhf just implies the hard-float ABI
and came into being to help distinguish from the Base PCS for some of the
distros at the time (the 2010s). However, we didn't want to set a baseline
architecture at that time, given the imminent arrival of v8-a, and thus
specifying --with-arch, --with-fpu and --with-float became second nature to
many of us working on it. I can see how it can be confusing.

With the advent of --with-fpu=auto, I thought this could be made better, but
that is a matter for another patch and another discussion. Until then, do
remember to baseline with --with-arch, --with-fpu and --with-float.
Unfortunate but needed. If there is a bug with --with-fpu=auto, it would be
good to articulate it, as I understand that to be the state of the art.
Thank you for working through it.
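For the record, a fully baselined armhf configure along those lines would
look something like this (a sketch assembled purely from the flags already
quoted in this thread; any equivalent arch/fpu pair should do):

    .../configure --target=arm-none-linux-gnueabihf \
        --host=arm-none-linux-gnueabihf --build=arm-none-linux-gnueabihf \
        --with-arch=armv8-a --with-fpu=neon-fp-armv8 --with-float=hard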
> > Since these patterns touch armv8m.baseline, can you find all the
> > testcases in the testsuite and ensure no change in code for
> > armv8m.baseline, as that's unpredicated already and this patch brings
> > this in line with the same? Does the testsuite already cover these
> > arch variants, and are you satisfied that the tests in the testsuite
> > can catch / don't make any additional code changes to the other
> > architectures affected by this?
>
> There are various v8-m(.base/.main) tests and they all pass. The generated
> code is generally unchanged if there was no conditional execution. I made
> the new UNSPEC_LDR/STR patterns support offsets so there is no difference
> in generated code for relaxed loads/stores (since they used to use a plain
> load/store which has an immediate offset).
>
> >>     * onfig/arm/sync.md (arm_atomic_load<mode>): Add new pattern.
> >
> > Nit: s/onfig/config
>
> Fixed. Thanks.
>
> >>     (atomic_load<mode>): Always expand atomic loads explicitly.
> >>     (atomic_store<mode>): Always expand atomic stores explicitly.
> >
> > Nit: Change message to:
> > Switch patterns to define_expand.
>
> Fixed. Thanks.
>
> > Largely looks ok, though I cannot work out tonight if we need more v8-a
> > or v8m-baseline specific tests for scan-assembler patterns.
> >
> > Clearly our testsuite doesn't catch it, so perhaps the OP could help
> > validate this patch with their formal models to see if this fixes this
> > set of issues and creates no new regressions? Is that feasible to do?
>
> Disabling conditional execution avoids the issue. It's trivial to verify
> that atomics can no longer be conditionally executed (no "%?"). When this
> is committed, we can run the random testing again to confirm the issue is
> no longer present.

OK, thanks for promising to do so - I trust you to get it done. Please try
out various combinations of -march v7ve, v7-a and v8-a with the tool, as
each of them has slightly different rules. For instance, v7ve allows LDREXD
and STREXD to be single-copy atomic for 64-bit loads, whereas v7-a did not.
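As a quick cross-check for the 64-bit case (a sketch, not part of the patch;
the function name is mine), compiling something like the following at -O2
with each of those -march settings should make the differences visible, e.g.
an LDREXD on v7-a versus a plain LDRD on LPAE-capable configurations:

    #include <stdatomic.h>

    /* Illustrative only: a relaxed 64-bit atomic load.  On armv7-a this
       needs LDREXD to be single-copy atomic, while with LPAE an aligned
       LDRD suffices, so the generated code differs per -march.  */
    long long
    load64 (_Atomic long long *p)
    {
      return atomic_load_explicit (p, memory_order_relaxed);
    }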
> > -(define_insn "atomic_load<mode>"
> > -  [(set (match_operand:QHSI 0 "register_operand" "=r,r,l")
> > +(define_insn "arm_atomic_load<mode>"
> > +  [(set (match_operand:QHSI 0 "register_operand" "=r,l")
> >      (unspec_volatile:QHSI
> > -      [(match_operand:QHSI 1 "arm_sync_memory_operand" "Q,Q,Q")
> > -       (match_operand:SI 2 "const_int_operand" "n,Pf,n")]  ;; model
> > +      [(match_operand:QHSI 1 "memory_operand" "m,m")]
> >
> > Remind me again why is it safe to go from the Q constraint to the m
> > constraint here and everywhere else you've done this?
>
> That's because the relaxed loads/stores use LDR/STR wrapped in an UNSPEC.
> To avoid regressions we have to use 'm' so that an immediate offset can
> be merged into the memory access.

This is because the instructions used here are LDR and STR, so the use of
the 'm' constraint is considered safe.
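To make the regression concern concrete (a sketch; the struct and function
are hypothetical, not from the patch), with 'm' a relaxed load of a member
at a non-zero offset can keep the offset in the addressing mode, e.g.
ldr r0, [r0, #4], rather than needing a separate address computation as the
offsetless 'Q' constraint would force:

    #include <stdatomic.h>

    struct s { int a; _Atomic int field; };

    /* Illustrative only: with the 'm' constraint the UNSPEC-wrapped LDR
       can still use an immediate offset for p->field, so no extra add
       instruction is needed to form the address.  */
    int
    load_field (struct s *p)
    {
      return atomic_load_explicit (&p->field, memory_order_relaxed);
    }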
> >> -  VUNSPEC_LDA       ; Represent a store-register-acquire.
> >> +  VUNSPEC_LDR       ; Represent a load-register-relaxed.
> >> +  VUNSPEC_LDA       ; Represent a load-register-acquire.
> >
> > Nit: LDA before LDR? Though I suspect this list can be alphabetically
> > ordered at another point of time.
>
> Swapped. Thanks.
>
> > There are new tests added for v7-a; what happens with the output for
> > v8-a and the changes for ldacq and other such instructions?
>
> v7-a and v8-a generate the same instructions for relaxed load/store.
> The acquire/release versions are identical except they are no longer
> predicated. Basically the new patterns are not only significantly simpler,
> they are now the same between the many ARM/Thumb-2/v7-a/v8-m/v8-a
> combinations, so test coverage is much higher now. This is how these
> patterns should have been designed all along.
>
> v2 follows below.
>
> Cheers,
> Wilco
>
>
> [PATCH v2] ARM: Block predication on atomics [PR111235]
>
> The v7 memory ordering model allows reordering of conditional atomic
> instructions. To avoid this, make all atomic patterns unconditional.
> Expand atomic loads and stores for all architectures so the memory access
> can be wrapped into an UNSPEC.
>
> gcc/ChangeLog:
>     PR target/111235
>     * config/arm/constraints.md: Remove Pf constraint.
>     * config/arm/sync.md (arm_atomic_load<mode>): Add new pattern.
>     (arm_atomic_load_acquire<mode>): Likewise.
>     (arm_atomic_store<mode>): Likewise.
>     (arm_atomic_store_release<mode>): Likewise.
>     (atomic_load<mode>): Switch patterns to define_expand.
>     (atomic_store<mode>): Likewise.
>     (arm_atomic_loaddi2_ldrd): Remove predication.
>     (arm_load_exclusive<mode>): Likewise.
>     (arm_load_acquire_exclusive<mode>): Likewise.
>     (arm_load_exclusivesi): Likewise.
>     (arm_load_acquire_exclusivesi): Likewise.
>     (arm_load_exclusivedi): Likewise.
>     (arm_load_acquire_exclusivedi): Likewise.
>     (arm_store_exclusive<mode>): Likewise.
>     (arm_store_release_exclusivedi): Likewise.
>     (arm_store_release_exclusive<mode>): Likewise.
>     * config/arm/unspecs.md: Add VUNSPEC_LDR and VUNSPEC_STR.
>
> gcc/testsuite/ChangeLog:
>     PR target/111235
>     * gcc.dg/rtl/arm/stl-cond.c: Remove test.
>     * gcc.target/arm/atomic_loaddi_7.c: Fix dmb count.
>     * gcc.target/arm/atomic_loaddi_8.c: Likewise.
>     * gcc.target/arm/pr111235.c: Add new test.
>
> ---
>
> diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
> index 05a4ebbdd67601d7b92aa44a619d17634cc69f17..d7c4a1b0cd785f276862048005e6cfa57cdcb20d 100644
> --- a/gcc/config/arm/constraints.md
> +++ b/gcc/config/arm/constraints.md
> @@ -36,7 +36,7 @@
>  ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
>  ;; in Thumb-2 state: Ha, Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py, Pz, Rd, Rf, Rb, Ra,
>  ;;                   Rg, Ri
> -;; in all states: Pf, Pg
> +;; in all states: Pg
>
>  ;; The following memory constraints have been used:
>  ;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up, Uf, Ux, Ul
> @@ -239,13 +239,6 @@ (define_constraint "Pe"
>    (and (match_code "const_int")
>         (match_test "TARGET_THUMB1 && ival >= 256 && ival <= 510")))
>
> -(define_constraint "Pf"
> -  "Memory models except relaxed, consume or release ones."
> -  (and (match_code "const_int")
> -       (match_test "!is_mm_relaxed (memmodel_from_int (ival))
> -                    && !is_mm_consume (memmodel_from_int (ival))
> -                    && !is_mm_release (memmodel_from_int (ival))")))
> -
>  (define_constraint "Pg"
>    "@internal In Thumb-2 state a constant in range 1 to 32"
>    (and (match_code "const_int")
> diff --git a/gcc/config/arm/sync.md b/gcc/config/arm/sync.md
> index 7626bf3c443285dc63b4c4367b11a879a99c93c6..2210810f67f37ce043b8fdc73b4f21b54c5b1912 100644
> --- a/gcc/config/arm/sync.md
> +++ b/gcc/config/arm/sync.md
> @@ -62,68 +62,110 @@ (define_insn "*memory_barrier"
>     (set_attr "conds" "unconditional")
>     (set_attr "predicable" "no")])
>
> -(define_insn "atomic_load<mode>"
> -  [(set (match_operand:QHSI 0 "register_operand" "=r,r,l")
> +(define_insn "arm_atomic_load<mode>"
> +  [(set (match_operand:QHSI 0 "register_operand" "=r,l")
>      (unspec_volatile:QHSI
> -      [(match_operand:QHSI 1 "arm_sync_memory_operand" "Q,Q,Q")
> -       (match_operand:SI 2 "const_int_operand" "n,Pf,n")]  ;; model
> +      [(match_operand:QHSI 1 "memory_operand" "m,m")]
> +      VUNSPEC_LDR))]
> +  ""
> +  "ldr<sync_sfx>\t%0, %1"
> +  [(set_attr "arch" "32,any")])
> +
> +(define_insn "arm_atomic_load_acquire<mode>"
> +  [(set (match_operand:QHSI 0 "register_operand" "=r")
> +    (unspec_volatile:QHSI
> +      [(match_operand:QHSI 1 "arm_sync_memory_operand" "Q")]
>        VUNSPEC_LDA))]
>    "TARGET_HAVE_LDACQ"
> -  {
> -    if (aarch_mm_needs_acquire (operands[2]))
> -      {
> -       if (TARGET_THUMB1)
> -         return "lda<sync_sfx>\t%0, %1";
> -       else
> -         return "lda<sync_sfx>%?\t%0, %1";
> -      }
> -    else
> -      {
> -       if (TARGET_THUMB1)
> -         return "ldr<sync_sfx>\t%0, %1";
> -       else
> -         return "ldr<sync_sfx>%?\t%0, %1";
> -      }
> -  }
> -  [(set_attr "arch" "32,v8mb,any")
> -   (set_attr "predicable" "yes")])
> +  "lda<sync_sfx>\t%0, %C1"
> +)
>
> -(define_insn "atomic_store<mode>"
> -  [(set (match_operand:QHSI 0 "memory_operand" "=Q,Q,Q")
> +(define_insn "arm_atomic_store<mode>"
> +  [(set (match_operand:QHSI 0 "memory_operand" "=m,m")
> +    (unspec_volatile:QHSI
> +      [(match_operand:QHSI 1 "register_operand" "r,l")]
> +      VUNSPEC_STR))]
> +  ""
> +  "str<sync_sfx>\t%1, %0"
> +  [(set_attr "arch" "32,any")])
> +
> +(define_insn "arm_atomic_store_release<mode>"
> +  [(set (match_operand:QHSI 0 "arm_sync_memory_operand" "=Q")
>      (unspec_volatile:QHSI
> -      [(match_operand:QHSI 1 "general_operand" "r,r,l")
> -       (match_operand:SI 2 "const_int_operand" "n,Pf,n")]  ;; model
> +      [(match_operand:QHSI 1 "register_operand" "r")]
>        VUNSPEC_STL))]
>    "TARGET_HAVE_LDACQ"
> -  {
> -    if (aarch_mm_needs_release (operands[2]))
> -      {
> -       if (TARGET_THUMB1)
> -         return "stl<sync_sfx>\t%1, %0";
> -       else
> -         return "stl<sync_sfx>%?\t%1, %0";
> -      }
> -    else
> -      {
> -       if (TARGET_THUMB1)
> -         return "str<sync_sfx>\t%1, %0";
> -       else
> -         return "str<sync_sfx>%?\t%1, %0";
> -      }
> -  }
> -  [(set_attr "arch" "32,v8mb,any")
> -   (set_attr "predicable" "yes")])
> +  "stl<sync_sfx>\t%1, %C0")
> +
> +
> +(define_expand "atomic_load<mode>"
> +  [(match_operand:QHSI 0 "register_operand")          ;; val out
> +   (match_operand:QHSI 1 "arm_sync_memory_operand")   ;; memory
> +   (match_operand:SI 2 "const_int_operand")]          ;; model
> +  ""
> +{
> +  memmodel model = memmodel_from_int (INTVAL (operands[2]));
> +
> +  if (TARGET_HAVE_LDACQ && !is_mm_relaxed (model))
> +    {
> +      emit_insn (gen_arm_atomic_load_acquire<mode> (operands[0], operands[1]));
> +      DONE;
> +    }
> +
> +  /* The seq_cst model needs a barrier before the load to block reordering
> +     with earlier accesses.  */
> +  if (is_mm_seq_cst (model))
> +    expand_mem_thread_fence (model);
> +
> +  emit_insn (gen_arm_atomic_load<mode> (operands[0], operands[1]));
> +
> +  /* All non-relaxed models need a barrier after the load when load-acquire
> +     instructions are not available.  */
> +  if (!is_mm_relaxed (model))
> +    expand_mem_thread_fence (model);
> +
> +  DONE;
> +})
> +
> +(define_expand "atomic_store<mode>"
> +  [(match_operand:QHSI 0 "arm_sync_memory_operand")   ;; memory
> +   (match_operand:QHSI 1 "register_operand")          ;; store value
> +   (match_operand:SI 2 "const_int_operand")]          ;; model
> +  ""
> +{
> +  memmodel model = memmodel_from_int (INTVAL (operands[2]));
> +
> +  if (TARGET_HAVE_LDACQ && !is_mm_relaxed (model))
> +    {
> +      emit_insn (gen_arm_atomic_store_release<mode> (operands[0], operands[1]));
> +      DONE;
> +    }
> +
> +  /* All non-relaxed models need a barrier before the store when
> +     store-release instructions are not available.  */
> +  if (!is_mm_relaxed (model))
> +    expand_mem_thread_fence (model);
> +
> +  emit_insn (gen_arm_atomic_store<mode> (operands[0], operands[1]));
> +
> +  /* The seq_cst model needs a barrier after the store to block reordering
> +     with later accesses.  */
> +  if (is_mm_seq_cst (model))
> +    expand_mem_thread_fence (model);
> +
> +  DONE;
> +})
>
>  ;; An LDRD instruction usable by the atomic_loaddi expander on LPAE targets
>
>  (define_insn "arm_atomic_loaddi2_ldrd"
>    [(set (match_operand:DI 0 "register_operand" "=r")
>      (unspec_volatile:DI
> -      [(match_operand:DI 1 "arm_sync_memory_operand" "Q")]
> +      [(match_operand:DI 1 "memory_operand" "m")]
>        VUNSPEC_LDRD_ATOMIC))]
>    "ARM_DOUBLEWORD_ALIGN && TARGET_HAVE_LPAE"
> -  "ldrd%?\t%0, %H0, %C1"
> -  [(set_attr "predicable" "yes")])
> +  "ldrd\t%0, %H0, %1"
> +)
>
>  ;; There are three ways to expand this depending on the architecture
>  ;; features available.  As for the barriers, a load needs a barrier
> @@ -152,6 +194,11 @@ (define_expand "atomic_loaddi"
>        DONE;
>      }
>
> +  /* The seq_cst model needs a barrier before the load to block reordering
> +     with earlier accesses.  */
> +  if (is_mm_seq_cst (model))
> +    expand_mem_thread_fence (model);
> +
>    /* On LPAE targets LDRD and STRD accesses to 64-bit aligned
>       locations are 64-bit single-copy atomic.  We still need barriers in the
>       appropriate places to implement the ordering constraints.  */
> @@ -160,7 +207,6 @@ (define_expand "atomic_loaddi"
>    else
>      emit_insn (gen_arm_load_exclusivedi (operands[0], operands[1]));
>
> -
>    /* All non-relaxed models need a barrier after the load when load-acquire
>       instructions are not available.  */
>    if (!is_mm_relaxed (model))
> @@ -446,54 +492,42 @@ (define_insn_and_split "atomic_nand_fetch<mode>"
>    [(set_attr "arch" "32,v8mb")])
>
>  (define_insn "arm_load_exclusive<mode>"
> -  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
> +  [(set (match_operand:SI 0 "s_register_operand" "=r")
>      (zero_extend:SI
>        (unspec_volatile:NARROW
> -        [(match_operand:NARROW 1 "mem_noofs_operand" "Ua,Ua")]
> +        [(match_operand:NARROW 1 "mem_noofs_operand" "Ua")]
>          VUNSPEC_LL)))]
>    "TARGET_HAVE_LDREXBH"
> -  "@
> -   ldrex<sync_sfx>%?\t%0, %C1
> -   ldrex<sync_sfx>\t%0, %C1"
> -  [(set_attr "arch" "32,v8mb")
> -   (set_attr "predicable" "yes")])
> +  "ldrex<sync_sfx>\t%0, %C1"
> +)
>
>  (define_insn "arm_load_acquire_exclusive<mode>"
> -  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
> +  [(set (match_operand:SI 0 "s_register_operand" "=r")
>      (zero_extend:SI
>        (unspec_volatile:NARROW
> -        [(match_operand:NARROW 1 "mem_noofs_operand" "Ua,Ua")]
> +        [(match_operand:NARROW 1 "mem_noofs_operand" "Ua")]
>          VUNSPEC_LAX)))]
>    "TARGET_HAVE_LDACQ"
> -  "@
> -   ldaex<sync_sfx>%?\\t%0, %C1
> -   ldaex<sync_sfx>\\t%0, %C1"
> -  [(set_attr "arch" "32,v8mb")
> -   (set_attr "predicable" "yes")])
> +  "ldaex<sync_sfx>\\t%0, %C1"
> +)
>
>  (define_insn "arm_load_exclusivesi"
> -  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
> +  [(set (match_operand:SI 0 "s_register_operand" "=r")
>      (unspec_volatile:SI
> -      [(match_operand:SI 1 "mem_noofs_operand" "Ua,Ua")]
> +      [(match_operand:SI 1 "mem_noofs_operand" "Ua")]
>        VUNSPEC_LL))]
>    "TARGET_HAVE_LDREX"
> -  "@
> -   ldrex%?\t%0, %C1
> -   ldrex\t%0, %C1"
> -  [(set_attr "arch" "32,v8mb")
> -   (set_attr "predicable" "yes")])
> +  "ldrex\t%0, %C1"
> +)
>
>  (define_insn "arm_load_acquire_exclusivesi"
> -  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
> +  [(set (match_operand:SI 0 "s_register_operand" "=r")
>      (unspec_volatile:SI
> -      [(match_operand:SI 1 "mem_noofs_operand" "Ua,Ua")]
> +      [(match_operand:SI 1 "mem_noofs_operand" "Ua")]
>        VUNSPEC_LAX))]
>    "TARGET_HAVE_LDACQ"
> -  "@
> -   ldaex%?\t%0, %C1
> -   ldaex\t%0, %C1"
> -  [(set_attr "arch" "32,v8mb")
> -   (set_attr "predicable" "yes")])
> +  "ldaex\t%0, %C1"
> +)
>
>  (define_insn "arm_load_exclusivedi"
>    [(set (match_operand:DI 0 "s_register_operand" "=r")
> @@ -501,8 +535,8 @@ (define_insn "arm_load_exclusivedi"
>        [(match_operand:DI 1 "mem_noofs_operand" "Ua")]
>        VUNSPEC_LL))]
>    "TARGET_HAVE_LDREXD"
> -  "ldrexd%?\t%0, %H0, %C1"
> -  [(set_attr "predicable" "yes")])
> +  "ldrexd\t%0, %H0, %C1"
> +)
>
>  (define_insn "arm_load_acquire_exclusivedi"
>    [(set (match_operand:DI 0 "s_register_operand" "=r")
> @@ -510,8 +544,8 @@ (define_insn "arm_load_acquire_exclusivedi"
>        [(match_operand:DI 1 "mem_noofs_operand" "Ua")]
>        VUNSPEC_LAX))]
>    "TARGET_HAVE_LDACQEXD && ARM_DOUBLEWORD_ALIGN"
> -  "ldaexd%?\t%0, %H0, %C1"
> -  [(set_attr "predicable" "yes")])
> +  "ldaexd\t%0, %H0, %C1"
> +)
>
>  (define_insn "arm_store_exclusive<mode>"
>    [(set (match_operand:SI 0 "s_register_operand" "=&r")
> @@ -530,14 +564,11 @@ (define_insn "arm_store_exclusive<mode>"
>          Note that the 1st register always gets the
>          lowest word in memory.  */
>        gcc_assert ((REGNO (operands[2]) & 1) == 0 || TARGET_THUMB2);
> -      return "strexd%?\t%0, %2, %H2, %C1";
> +      return "strexd\t%0, %2, %H2, %C1";
>      }
> -  if (TARGET_THUMB1)
> -    return "strex<sync_sfx>\t%0, %2, %C1";
> -  else
> -    return "strex<sync_sfx>%?\t%0, %2, %C1";
> +  return "strex<sync_sfx>\t%0, %2, %C1";
>  }
> -  [(set_attr "predicable" "yes")])
> +)
>
>  (define_insn "arm_store_release_exclusivedi"
>    [(set (match_operand:SI 0 "s_register_operand" "=&r")
> @@ -550,20 +581,16 @@ (define_insn "arm_store_release_exclusivedi"
>    {
>      /* See comment in arm_store_exclusive<mode> above.  */
>      gcc_assert ((REGNO (operands[2]) & 1) == 0 || TARGET_THUMB2);
> -    return "stlexd%?\t%0, %2, %H2, %C1";
> +    return "stlexd\t%0, %2, %H2, %C1";
>    }
> -  [(set_attr "predicable" "yes")])
> +)
>
>  (define_insn "arm_store_release_exclusive<mode>"
> -  [(set (match_operand:SI 0 "s_register_operand" "=&r,&r")
> +  [(set (match_operand:SI 0 "s_register_operand" "=&r")
>      (unspec_volatile:SI [(const_int 0)] VUNSPEC_SLX))
> -   (set (match_operand:QHSI 1 "mem_noofs_operand" "=Ua,Ua")
> +   (set (match_operand:QHSI 1 "mem_noofs_operand" "=Ua")
>      (unspec_volatile:QHSI
> -      [(match_operand:QHSI 2 "s_register_operand" "r,r")]
> +      [(match_operand:QHSI 2 "s_register_operand" "r")]
>        VUNSPEC_SLX))]
>    "TARGET_HAVE_LDACQ"
> -  "@
> -   stlex<sync_sfx>%?\t%0, %2, %C1
> -   stlex<sync_sfx>\t%0, %2, %C1"
> -  [(set_attr "arch" "32,v8mb")
> -   (set_attr "predicable" "yes")])
> +  "stlex<sync_sfx>\t%0, %2, %C1")
> diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
> index 6a5b1f8f623b00dc014c9f829c591477f8faa9a1..4713ec840abae48ca70f418dbc0d4028ad4ad527 100644
> --- a/gcc/config/arm/unspecs.md
> +++ b/gcc/config/arm/unspecs.md
> @@ -221,8 +221,10 @@ (define_c_enum "unspecv" [
>    VUNSPEC_SC           ; Represent a store-register-exclusive.
>    VUNSPEC_LAX          ; Represent a load-register-acquire-exclusive.
>    VUNSPEC_SLX          ; Represent a store-register-release-exclusive.
> -  VUNSPEC_LDA          ; Represent a store-register-acquire.
> +  VUNSPEC_LDA          ; Represent a load-register-acquire.
> +  VUNSPEC_LDR          ; Represent a load-register-relaxed.
>    VUNSPEC_STL          ; Represent a store-register-release.
> +  VUNSPEC_STR          ; Represent a store-register-relaxed.
>    VUNSPEC_GET_FPSCR    ; Represent fetch of FPSCR content.
>    VUNSPEC_SET_FPSCR    ; Represent assign of FPSCR content.
>    VUNSPEC_SET_FPSCR_NZCVQC ; Represent assign of FPSCR_nzcvqc content.
> diff --git a/gcc/testsuite/gcc.dg/rtl/arm/stl-cond.c b/gcc/testsuite/gcc.dg/rtl/arm/stl-cond.c
> deleted file mode 100644
> index e47ca6bf36dac500b90d80e52e150a19c95c7219..0000000000000000000000000000000000000000
> --- a/gcc/testsuite/gcc.dg/rtl/arm/stl-cond.c
> +++ /dev/null
> @@ -1,61 +0,0 @@
> -/* { dg-do compile { target arm*-*-* } } */
> -/* { dg-require-effective-target arm_arm_ok } */
> -/* { dg-require-effective-target arm_arch_v8a_ok } */
> -/* { dg-options "-O2 -marm" } */
> -/* { dg-add-options arm_arch_v8a } */
> -
> -/* We want to test that the STL instruction gets the conditional
> -   suffix when under a COND_EXEC.  However, COND_EXEC is very hard to
> -   generate from C code because the atomic_store expansion adds a compiler
> -   barrier before the insn, preventing if-conversion.  So test the output
> -   here with a hand-crafted COND_EXEC wrapped around an STL.  */
> -
> -void __RTL (startwith ("final")) foo (int *a, int b)
> -{
> -(function "foo"
> -  (param "a"
> -    (DECL_RTL (reg/v:SI r0))
> -    (DECL_RTL_INCOMING (reg:SI r0))
> -  )
> -  (param "b"
> -    (DECL_RTL (reg/v:SI r1))
> -    (DECL_RTL_INCOMING (reg:SI r1))
> -  )
> -  (insn-chain
> -    (block 2
> -      (edge-from entry (flags "FALLTHRU"))
> -      (cnote 5 [bb 2] NOTE_INSN_BASIC_BLOCK)
> -
> -      (insn:TI 7 (parallel [
> -            (set (reg:CC cc)
> -                (compare:CC (reg:SI r1)
> -                    (const_int 0)))
> -            (set (reg/v:SI r1)
> -                (reg:SI r1))
> -        ]) ;; {*movsi_compare0}
> -        (nil))
> -
> -      ;; A conditional atomic store-release: STLNE for Armv8-A.
> -      (insn 10 (cond_exec (ne (reg:CC cc)
> -                (const_int 0))
> -            (set (mem/v:SI (reg/v/f:SI r0) [-1 S4 A32])
> -                (unspec_volatile:SI [
> -                        (reg/v:SI r1)
> -                        (const_int 3)
> -                    ] VUNSPEC_STL))) ;; {*p atomic_storesi}
> -        (expr_list:REG_DEAD (reg:CC cc)
> -            (expr_list:REG_DEAD (reg/v:SI r1)
> -                (expr_list:REG_DEAD (reg/v/f:SI r0)
> -                    (nil)))))
> -      (edge-to exit (flags "FALLTHRU"))
> -    ) ;; block 2
> -  ) ;; insn-chain
> -  (crtl
> -    (return_rtx
> -      (reg/i:SI r0)
> -    ) ;; return_rtx
> -  ) ;; crtl
> -) ;; function
> -}
> -
> -/* { dg-final { scan-assembler "stlne" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c b/gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
> index 6743663f1e6831d3fc441e80fcfe3d761987970b..79e36edbb8fa86712b8d283b9a81b794610f4813 100644
> --- a/gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
> +++ b/gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
> @@ -6,4 +6,4 @@
>  #include "atomic_loaddi_seq_cst.x"
>
>  /* { dg-final { scan-assembler-times "ldrexd\tr\[0-9\]+, r\[0-9\]+, \\\[r\[0-9\]+\\\]" 1 } } */
> -/* { dg-final { scan-assembler-times "dmb\tish" 1 } } */
> +/* { dg-final { scan-assembler-times "dmb\tish" 2 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c b/gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
> index f7bd3e5a2b515d91418f5f4a7a8db00336a8b594..7241d361313997b0133056c1a3a7c1b546708e43 100644
> --- a/gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
> +++ b/gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
> @@ -6,4 +6,4 @@
>  #include "atomic_loaddi_seq_cst.x"
>
>  /* { dg-final { scan-assembler-times "ldrd\tr\[0-9\]+, r\[0-9\]+, \\\[r\[0-9\]+\\\]" 1 } } */
> -/* { dg-final { scan-assembler-times "dmb\tish" 1 } } */
> +/* { dg-final { scan-assembler-times "dmb\tish" 2 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/pr111235.c b/gcc/testsuite/gcc.target/arm/pr111235.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..b06a5bfb8e294edf04ec9b7b6d8fe5d1ac73b1e7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr111235.c
> @@ -0,0 +1,39 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-require-effective-target arm_arch_v7a_ok } */
> +/* { dg-add-options arm_arch_v7a } */
> +
> +#include <stdatomic.h>
> +
> +int t0 (int *p, int x)
> +{
> +  if (x > 100)
> +    x = atomic_load_explicit (p, memory_order_relaxed);
> +  return x + 1;
> +}
> +
> +long long t1 (long long *p, int x)
> +{
> +  if (x > 100)
> +    x = atomic_load_explicit (p, memory_order_relaxed);
> +  return x + 1;
> +}
> +
> +void t2 (int *p, int x)
> +{
> +  if (x > 100)
> +    atomic_store_explicit (p, x, memory_order_relaxed);
> +}
> +
> +void t3 (long long *p, int x)
> +{
> +  if (x > 100)
> +    atomic_store_explicit (p, x, memory_order_relaxed);
> +}
> +
> +/* { dg-final { scan-assembler-times "ldrexd\tr\[0-9\]+, r\[0-9\]+, \\\[r\[0-9\]+\\\]" 2 } } */
> +/* { dg-final { scan-assembler-not "ldrgt" } } */
> +/* { dg-final { scan-assembler-not "ldrdgt" } } */
> +/* { dg-final { scan-assembler-not "ldrexdgt" } } */
> +/* { dg-final { scan-assembler-not "strgt" } } */
> +/* { dg-final { scan-assembler-not "strdgt" } } */
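The dmb count changes above drop straight out of the new expanders: a
seq_cst 64-bit load now gets a fence both before and after the access. A
sketch of the mapping I'd expect (the comment describes my reading of the
patch, not verified compiler output):

    #include <stdatomic.h>

    /* Illustrative only: on armv7-a with LPAE this should now expand to
       dmb ish; ldrd; dmb ish - one barrier before the load (seq_cst must
       block reordering with earlier accesses) and one after (all
       non-relaxed models need a trailing barrier when lda is absent).  */
    long long
    load64_seq_cst (_Atomic long long *p)
    {
      return atomic_load_explicit (p, memory_order_seq_cst);
    }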
OK if no regressions, but you might get nagged by the post-commit CI ...

While it is not yet policy to look at these bots, given the enthusiasm at
Cauldron for patchwork and pre-commit CI, and because all my armhf boxes are
boxed up, I decided to do something a bit novel! I tried reviewing this via
patchwork:

https://patchwork.sourceware.org/project/gcc/patch/pawpr08mb8982a6aa40749b74cad14c5783...@pawpr08mb8982.eurprd08.prod.outlook.com/

and noticed that

https://ci.linaro.org/job/tcwg_gcc_build--master-arm-precommit/2393/artifact/artifacts/artifacts.precommit/notify/mail-body.txt

says nothing could be built. It is possibly worth double-checking whether
that is a false negative, and why the build failed. The patch showed green
on patchwork, but remembering that green is not green for CI in patchwork,
I clicked on the aforementioned ci.linaro.org link and saw that the build
is actually broken.

regards
Ramana