On 2025/11/10 at 6:34 PM, Xi Ruoyao wrote:
On Mon, 2025-11-10 at 17:49 +0800, mengqinggang wrote:
-(define_insn "atomic_fetch_<amop><mode>"
+(define_expand "atomic_fetch_<amop><mode>"
+  [(match_operand:GPR 0 "register_operand")                  ;; old value at mem
+   (any_atomic:GPR (match_operand:GPR 1 "memory_operand")    ;; mem location
+                   (match_operand:GPR 2 "reg_or_0_operand")) ;; value for op
+   (match_operand:SI 3 "const_int_operand")]                 ;; model
+  ""
+  {
+    if (TARGET_64BIT)
+      emit_insn (gen_la64_atomic_fetch_<amop><mode> (operands[0], operands[1],
+                                                     operands[2], operands[3]));
+    else
+      emit_insn (gen_la32_atomic_fetch_<amop>si (operands[0], operands[1],
+                                                 operands[2], operands[3]));
+    DONE;
+  })
The content in {...} should (or at least, can) be removed. Its effect
is just to rewrite the content in [...] with the content in [...] of
la{64,32}_atomic_fetch_* (constraints ignored), but the contents are
already the same, so it does nothing.
And IMO la32_atomic_fetch_<amop>si should be gated with !TARGET_64BIT to
avoid confusion.
(define_insn "atomic_fetch_nand_mask_inverted<mode>"
[(set (match_operand:GPR 0 "register_operand" "=&r")
(match_operand:GPR 1 "memory_operand" "+ZC"))
@@ -320,7 +395,7 @@ (define_insn
"atomic_fetch_nand_mask_inverted<mode>"
(match_operand:GPR 2 "register_operand" "r"))]
UNSPEC_SYNC_OLD_OP))
(clobber (match_scratch:GPR 3 "=&r"))]
- ""
+ "TARGET_64BIT || TARGET_32BIT_S"
{
  return "1:\\n\\t"
         "ll.<d>\\t%0,%1\\n\\t"
@@ -354,7 +429,23 @@ (define_expand "atomic_fetch_nand<mode>"
  DONE;
})
Then atomic_fetch_nand<mode> needs to be adjusted for LA32R too, or using
__atomic_fetch_nand on LA32R will cause an ICE.
We can just gate it with TARGET_64BIT || TARGET_32BIT_S, but then
__atomic_fetch_nand will be expanded to a CAS loop (which nests an LL-SC
loop), so it would be better to write a custom LL-SC loop for
atomic_fetch_nand on LA32R.
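
(For reference, a minimal C testcase, not from the patch, that exercises
the builtin in question; without a usable expander GCC falls back to the
CAS loop described above:)

/* Minimal illustration, not patch code: the builtin behind
   atomic_fetch_nand<mode>.  Without a matching pattern, GCC expands
   it via a compare-and-swap loop, which on an LL-SC target nests one
   retry loop inside another.  */
unsigned int
fetch_nand (unsigned int *p, unsigned int v)
{
  /* Atomically: old = *p; *p = ~(old & v); return old.  */
  return __atomic_fetch_nand (p, v, __ATOMIC_SEQ_CST);
}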
This patch has the following modifications:
[PATCH v1 29/34] LoongArch: Add support for la32r ilp32s abi
   (clobber (match_scratch:GPR 3 "=&r"))]
-  "TARGET_64BIT || TARGET_32BIT_S"
+  ""
 {
-  return "1:\\n\\t"
-         "ll.<d>\\t%0,%1\\n\\t"
-         "orn\\t%3,%2,%0\\n\\t"
-         "sc.<d>\\t%3,%1\\n\\t"
-         "beqz\\t%3,1b";
+  if (TARGET_32BIT_R)
+    return "1:\\n\\t"
+           "ll.<d>\\t%0,%1\\n\\t"
+           "nor\\t%3,%0,$zero\\n\\t"
+           "or\\t%3,%2,%3\\n\\t"
+           "sc.<d>\\t%3,%1\\n\\t"
+           "beq\\t%3,$zero,1b";
+  else
+    return "1:\\n\\t"
+           "ll.<d>\\t%0,%1\\n\\t"
+           "orn\\t%3,%2,%0\\n\\t"
+           "sc.<d>\\t%3,%1\\n\\t"
+           "beq\\t%3,$zero,1b";
 }
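
(A plain-C model of why the substitution is sound; illustration only,
not patch code. LA32R lacks the orn instruction, so it is synthesized
from nor and or:)

/* Both functions compute %2 | ~%0, i.e. the new value ~(old & val),
   given that operand 2 already holds the inverted mask ~val in this
   insn.  Register names mirror the asm above.  */
unsigned int
orn_seq (unsigned int r2, unsigned int r0)
{
  return r2 | ~r0;                    /* orn  %3,%2,%0 */
}

unsigned int
la32r_seq (unsigned int r2, unsigned int r0)
{
  unsigned int r3 = ~(r0 | 0);        /* nor  %3,%0,$zero => ~%0 */
  return r2 | r3;                     /* or   %3,%2,%3 */
}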
@@ -531,7 +643,7 @@ (define_expand "atomic_fetch_<amop><mode>"
   (any_bitwise (match_operand:SHORT 1 "memory_operand" "+ZB")   ;; memory
                (match_operand:SHORT 2 "reg_or_0_operand" "rJ")) ;; val
   (match_operand:SI 3 "const_int_operand" "")]                  ;; model
-  ""
+  "TARGET_64BIT || TARGET_32BIT_S"
 {
  /* We have no QI/HImode bitwise atomics, so use the address LSBs to form
     a mask, then use an aligned SImode atomic.  */
Hmm, I cannot see why this sub-word atomic implementation won't work
with LA32R. Without it, sub-word atomic operations will again be
expanded to a CAS loop, whereas an LL-SC loop (using
la32_atomic_fetch_<amop>si) with a mask is better.
IIUC atomic_fetch_<amop><mode> should either work out of the box on
LA32R or need only a small adjustment.
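
(To make that concrete, a C sketch of the mask technique the quoted
comment describes; fetch_or_u8 is a hypothetical helper, and the shift
computation assumes little-endian layout, as on LoongArch:)

#include <stdint.h>

/* Sketch only, not the expander's code: a QImode fetch_or done
   through the aligned SImode word containing the byte.  The
   full-word atomic below is what la32_atomic_fetch_orsi's LL-SC
   loop would provide on LA32R.  */
static uint8_t
fetch_or_u8 (uint8_t *p, uint8_t val)
{
  uintptr_t a = (uintptr_t) p;
  uint32_t *wp = (uint32_t *) (a & ~(uintptr_t) 3); /* aligned word */
  unsigned int shift = (a & 3) * 8;                 /* address LSBs */
  uint32_t wval = (uint32_t) val << shift;          /* value in place */
  uint32_t old = __atomic_fetch_or (wp, wval, __ATOMIC_SEQ_CST);
  return (uint8_t) (old >> shift);
}

For or/xor the untouched bytes combine with zero bits and need no mask;
fetch_and would first force the bits outside the field to 1, which is
where the mask formed from the address LSBs comes in.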