On Tue, 2025-01-21 at 16:41 +0800, Lulu Cheng wrote:
>
> On 2025/1/21 12:59 PM, Xi Ruoyao wrote:
> > On Tue, 2025-01-21 at 11:46 +0800, Lulu Cheng wrote:
> > > On 2025/1/18 7:33 PM, Xi Ruoyao wrote:
> > >
> > > /* snip */
> > >
> > > >  ;; This code iterator allows unsigned and signed division to be generated
> > > >  ;; from the same template.
> > > > @@ -3083,39 +3084,6 @@ (define_expand "rotl<mode>3"
> > > >      }
> > > >    });
> > > >
> > > > -;; The following templates were added to generate "bstrpick.d + alsl.d"
> > > > -;; instruction pairs.
> > > > -;; It is required that the values of const_immalsl_operand and
> > > > -;; immediate_operand must have the following correspondence:
> > > > -;;
> > > > -;; (immediate_operand >> const_immalsl_operand) == 0xffffffff
> > > > -
> > > > -(define_insn "zero_extend_ashift"
> > > > -  [(set (match_operand:DI 0 "register_operand" "=r")
> > > > -        (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
> > > > -                           (match_operand 2 "const_immalsl_operand" ""))
> > > > -                (match_operand 3 "immediate_operand" "")))]
> > > > -  "TARGET_64BIT
> > > > -   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> > > > -  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
> > > > -  [(set_attr "type" "arith")
> > > > -   (set_attr "mode" "DI")
> > > > -   (set_attr "insn_count" "2")])
> > > > -
> > > > -(define_insn "bstrpick_alsl_paired"
> > > > -  [(set (match_operand:DI 0 "register_operand" "=&r")
> > > > -        (plus:DI
> > > > -          (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
> > > > -                             (match_operand 2 "const_immalsl_operand" ""))
> > > > -                  (match_operand 3 "immediate_operand" ""))
> > > > -          (match_operand:DI 4 "register_operand" "r")))]
> > > > -  "TARGET_64BIT
> > > > -   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> > > > -  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,%4,%2"
> > > > -  [(set_attr "type" "arith")
> > > > -   (set_attr "mode" "DI")
> > > > -   (set_attr "insn_count" "2")])
> > > > -
> > > Hi,
> > >
> > > In LoongArch, the microarchitecture has performed instruction fusion on
> > > bstrpick.d+alsl.d.
> > >
> > > This modification may cause the two instructions to not be close together.
> > >
> > > So I think these two templates cannot be deleted. I will test the impact
> > > of this patch on the spec today.
> > Oops.  I guess we can salvage it with TARGET_SCHED_MACRO_FUSION_P and
> > TARGET_SCHED_MACRO_FUSION_PAIR_P.  And I'd like to know more details:
> >
> > 1. Is the fusion applying to all bstrpick.d + alsl.d, or only bstrpick.d
> > rd, rs, 31, 0?
> > 2. Is the fusion also applying to bstrpick.d + slli.d, or we really have
> > to write the strange "alsl.d rd, rs, r0, shamt" instruction?
> >
> Currently, instruction fusion can only be done in the following situation:
>
> bstrpick.d rd, rs, 31, 0 + alsl.d rd1,rj,rk,shamt and "rd = rj"

So the easiest solution seems to be just adding the two patterns back.
I'm bootstrapping and regtesting the attached patches.

--
Xi Ruoyao <xry...@xry111.site>
School of Aerospace Science and Technology, Xidian University

From 88dc215ee55c0e9da05812310964d41c69117416 Mon Sep 17 00:00:00 2001
From: Xi Ruoyao <xry...@xry111.site>
Date: Tue, 21 Jan 2025 17:34:36 +0800
Subject: [PATCH 1/2] LoongArch: Add back zero_extend_ashift and
 bstrpick_alsl_paired

This partially reverts r15-7062-g10e98638998.

These two define_insn's are needed for utilizing the macro-fusion of
bstrpick.d rd,rs,31,0 and alsl.d rd,rd,rk,shamt.

Per the GCC Internals section "When the Order of Patterns Matters,"
having zero_extend_ashift and bstrpick_alsl_paired before
and_shift_reversedi is enough to make the compiler prefer
zero_extend_ashift or bstrpick_alsl_paired, so we don't need to
explicitly reject the case in and_shift_reversedi.

The test change is also reverted, and now the test properly demonstrates
that bstrpick_alsl_paired should be used.

gcc/ChangeLog:

	* config/loongarch/loongarch.md (zero_extend_ashift): New
	define_insn.
	(bstrpick_alsl_paired): New define_insn.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/bstrpick_alsl_paired.c: Revert r15-7062
	change.
---
 gcc/config/loongarch/loongarch.md             | 33 +++++++++++++++++++
 .../loongarch/bstrpick_alsl_paired.c          |  2 +-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 9cde5c58a20..01145fa0f70 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3080,6 +3080,39 @@ (define_expand "rotl<mode>3"
     }
   });
 
+;; The following templates were added to generate "bstrpick.d + alsl.d"
+;; instruction pairs.
+;; It is required that the values of const_immalsl_operand and
+;; immediate_operand must have the following correspondence:
+;;
+;; (immediate_operand >> const_immalsl_operand) == 0xffffffff
+
+(define_insn "zero_extend_ashift"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
+                           (match_operand 2 "const_immalsl_operand" ""))
+                (match_operand 3 "immediate_operand" "")))]
+  "TARGET_64BIT
+   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
+  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "2")])
+
+(define_insn "bstrpick_alsl_paired"
+  [(set (match_operand:DI 0 "register_operand" "=&r")
+        (plus:DI
+          (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
+                             (match_operand 2 "const_immalsl_operand" ""))
+                  (match_operand 3 "immediate_operand" ""))
+          (match_operand:DI 4 "register_operand" "r")))]
+  "TARGET_64BIT
+   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
+  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,%4,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "2")])
+
 (define_insn "alsl<mode>3"
   [(set (match_operand:GPR 0 "register_operand" "=r")
         (plus:GPR (ashift:GPR (match_operand:GPR 1 "register_operand" "r")
diff --git a/gcc/testsuite/gcc.target/loongarch/bstrpick_alsl_paired.c b/gcc/testsuite/gcc.target/loongarch/bstrpick_alsl_paired.c
index 900e8c9e19f..0bca3886c32 100644
--- a/gcc/testsuite/gcc.target/loongarch/bstrpick_alsl_paired.c
+++ b/gcc/testsuite/gcc.target/loongarch/bstrpick_alsl_paired.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mabi=lp64d -O2 -fdump-rtl-combine" } */
-/* { dg-final { scan-rtl-dump "{and_shift_reversedi}" "combine" } } */
+/* { dg-final { scan-rtl-dump "{bstrpick_alsl_paired}" "combine" } } */
 /* { dg-final { scan-assembler-not "alsl.d\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,\\\$r0" } } */
 
 struct SA
--
2.48.1

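As a sanity check on the insn condition used by both patterns above, here is
a minimal C sketch (not part of the patch) that models the two instructions on
plain 64-bit integers and compares them with what the RTL computes. The helper
names are hypothetical, and the sketch assumes const_immalsl_operand accepts
shift amounts 1 to 4. It suggests that any mask satisfying
(mask >> shamt) == 0xffffffff gives the same result as the
bstrpick.d + alsl.d pair, whatever the low shamt bits of the mask are.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "bstrpick.d rd,rj,31,0": keep bits 31..0 of rj,
   zero-extended to 64 bits.  */
static uint64_t
model_bstrpick_d_31_0 (uint64_t rj)
{
  return rj & UINT64_C (0xffffffff);
}

/* Hypothetical model of "alsl.d rd,rj,rk,sa": (rj << sa) + rk.  */
static uint64_t
model_alsl_d (uint64_t rj, uint64_t rk, unsigned sa)
{
  return (rj << sa) + rk;
}

/* The insn condition from the two patterns, on plain integers.  */
static int
pattern_condition (int64_t mask, int64_t shamt)
{
  return (mask >> shamt) == INT64_C (0xffffffff);
}

/* What the RTL of bstrpick_alsl_paired computes: ((x << sa) & mask) + y.  */
static uint64_t
rtl_value (uint64_t x, unsigned sa, uint64_t mask, uint64_t addend)
{
  return ((x << sa) & mask) + addend;
}

/* Return 1 if the instruction pair agrees with the RTL for every tested
   input, 0 otherwise.  Shift amounts 1..4 are an assumption about
   const_immalsl_operand.  */
static int
pair_matches_rtl (void)
{
  static const uint64_t xs[] = {
    0, 1, UINT64_C (0x80000000), UINT64_C (0xffffffff),
    UINT64_C (0x123456789abcdef0), UINT64_MAX
  };
  const uint64_t addend = UINT64_C (0x1000);

  for (unsigned sa = 1; sa <= 4; sa++)
    for (uint64_t low = 0; low < (UINT64_C (1) << sa); low++)
      {
        /* Bits sa+31..sa are ones; the low sa bits are arbitrary.  */
        uint64_t mask = (UINT64_C (0xffffffff) << sa) | low;
        assert (pattern_condition ((int64_t) mask, sa));

        for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; i++)
          if (rtl_value (xs[i], sa, mask, addend)
              != model_alsl_d (model_bstrpick_d_31_0 (xs[i]), addend, sa))
            return 0;
      }
  return 1;
}
```

Passing 0 as the addend gives the zero_extend_ashift case, where the second
instruction uses $r0 as rk.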
From 2a6e53ad2e1e103cb79f2405a5d76a3e31ce4acb Mon Sep 17 00:00:00 2001
From: Xi Ruoyao <xry...@xry111.site>
Date: Tue, 21 Jan 2025 17:51:52 +0800
Subject: [PATCH 2/2] LoongArch: (NFC) Update the comment for
 zero_extend_ashift and bstrpick_alsl_paired

gcc/ChangeLog:

	* config/loongarch/loongarch.md (zero_extend_ashift,
	bstrpick_alsl_paired): Update the comment to make it explicit
	we want to fuse the pairs.
---
 gcc/config/loongarch/loongarch.md | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 01145fa0f70..588015eb91b 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3080,12 +3080,10 @@ (define_expand "rotl<mode>3"
     }
   });
 
-;; The following templates were added to generate "bstrpick.d + alsl.d"
-;; instruction pairs.
-;; It is required that the values of const_immalsl_operand and
-;; immediate_operand must have the following correspondence:
-;;
-;; (immediate_operand >> const_immalsl_operand) == 0xffffffff
+;; The following templates were added to generate
+;; "bstrpick.d rd,rs1,31,0 + alsl.d rd,rd,rs2,shamt" instruction pairs.
+;; These pairs are fused so we shouldn't split them, and even if rs2 is
+;; r0 we shouldn't change the second instruction to slli.d.
 
 (define_insn "zero_extend_ashift"
   [(set (match_operand:DI 0 "register_operand" "=r")
--
2.48.1
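
To illustrate the last sentence of the updated comment: alsl.d with r0 as the
addend and slli.d should compute the same value, so the preference for alsl.d
is only about keeping the fusible pair, not about the result. A minimal C
sketch, with hypothetical helper names and shift amounts 1 to 4 assumed:

```c
#include <stdint.h>

/* Hypothetical model of "alsl.d rd,rj,rk,sa": (rj << sa) + rk.  */
static uint64_t
sketch_alsl_d (uint64_t rj, uint64_t rk, unsigned sa)
{
  return (rj << sa) + rk;
}

/* Hypothetical model of "slli.d rd,rj,sa": rj << sa.  */
static uint64_t
sketch_slli_d (uint64_t rj, unsigned sa)
{
  return rj << sa;
}

/* Return 1 if alsl.d with rk = r0 (always zero) matches slli.d for the
   shift amounts 1..4, 0 otherwise.  */
static int
alsl_r0_equals_slli (uint64_t rj)
{
  for (unsigned sa = 1; sa <= 4; sa++)
    if (sketch_alsl_d (rj, 0, sa) != sketch_slli_d (rj, sa))
      return 0;
  return 1;
}
```

Since the values agree, a later simplification to slli.d would be valid as far
as the result goes, which is why the comment has to state the fusion reason
explicitly.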