On 8/13/24 8:23 PM, Li, Pan2 wrote:
This patch may require a rebase; I will send v3 to resolve the conflicts.
Pan
-----Original Message-----
From: Li, Pan2 <pan2...@intel.com>
Sent: Sunday, August 4, 2024 7:48 PM
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com;
rdapp....@gmail.com; Li, Pan2 <pan2...@intel.com>
Subject: [PATCH v2] RISC-V: Support IMM for operand 0 of ussub pattern
From: Pan Li <pan2...@intel.com>
This patch allows an immediate (IMM) as operand 0 of the ussub pattern,
i.e. .SAT_SUB (1023, y), as in the example below.
Form 1:
#define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
T __attribute__((noinline)) \
sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \
{ \
return (T)IMM >= y ? (T)IMM - y : 0; \
}
DEF_SAT_U_SUB_IMM_FMT_1(uint64_t, 1023)
Before this patch:
sat_u_sub_imm82_uint64_t_fmt_1:
  li a5,82
  bgtu a0,a5,.L3
  sub a0,a5,a0
  ret
.L3:
  li a0,0
  ret
After this patch:
sat_u_sub_imm82_uint64_t_fmt_1:
  li a5,82
  sltu a4,a5,a0
  addi a4,a4,-1
  sub a0,a5,a0
  and a0,a4,a0
  ret
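As a rough illustration only, the branch-free sequence above corresponds to
the C below. The function names are hypothetical and are not part of the
patch or the testsuite; the sketch assumes a 64-bit unsigned type as in the
uint64_t example.

```c
#include <stdint.h>

/* Branchy form, equivalent to the body of DEF_SAT_U_SUB_IMM_FMT_1.
   Hypothetical name, for illustration only.  */
static uint64_t
sat_u_sub_imm_branchy (uint64_t imm, uint64_t y)
{
  return imm >= y ? imm - y : 0;
}

/* Branchless form mirroring the sltu/addi/sub/and sequence.
   Hypothetical name, for illustration only.  */
static uint64_t
sat_u_sub_imm_branchless (uint64_t imm, uint64_t y)
{
  uint64_t lt = imm < y;    /* sltu: 1 if imm < y, else 0.  */
  uint64_t mask = lt - 1;   /* addi -1: 0 if imm < y, else all ones.  */
  uint64_t diff = imm - y;  /* sub: wraps around when imm < y.  */
  return diff & mask;       /* and: forces 0 in the saturated case.  */
}
```

Both functions should agree for every input; the second simply trades the
branch for a mask.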
The following test suite passed for this patch:
1. The rv64gcv full regression test.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_gen_unsigned_xmode_reg): Add new
func impl to gen xmode rtx reg from operand rtx.
(riscv_expand_ussub): Gen xmode reg for operand 1.
* config/riscv/riscv.md: Allow const_int for operand 1.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macro.
* gcc.target/riscv/sat_u_sub_imm-1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-1_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-1_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-4.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-4.c: New test.
Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv.cc | 51 ++++++++++++++++-
gcc/config/riscv/riscv.md | 2 +-
gcc/testsuite/gcc.target/riscv/sat_arith.h | 10 ++++
.../gcc.target/riscv/sat_u_sub_imm-1.c | 20 +++++++
.../gcc.target/riscv/sat_u_sub_imm-1_1.c | 20 +++++++
.../gcc.target/riscv/sat_u_sub_imm-1_2.c | 20 +++++++
.../gcc.target/riscv/sat_u_sub_imm-2.c | 21 +++++++
.../gcc.target/riscv/sat_u_sub_imm-2_1.c | 21 +++++++
.../gcc.target/riscv/sat_u_sub_imm-2_2.c | 22 ++++++++
.../gcc.target/riscv/sat_u_sub_imm-3.c | 20 +++++++
.../gcc.target/riscv/sat_u_sub_imm-3_1.c | 21 +++++++
.../gcc.target/riscv/sat_u_sub_imm-3_2.c | 22 ++++++++
.../gcc.target/riscv/sat_u_sub_imm-4.c | 19 +++++++
.../gcc.target/riscv/sat_u_sub_imm-run-1.c | 56 +++++++++++++++++++
.../gcc.target/riscv/sat_u_sub_imm-run-2.c | 56 +++++++++++++++++++
.../gcc.target/riscv/sat_u_sub_imm-run-3.c | 55 ++++++++++++++++++
.../gcc.target/riscv/sat_u_sub_imm-run-4.c | 48 ++++++++++++++++
17 files changed, 482 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-1_1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-1_2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-2_1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-2_2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-3_1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-3_2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-run-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-run-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-run-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_sub_imm-run-4.c
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index b19d56149e7..5e4e9722729 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -11612,6 +11612,55 @@ riscv_expand_usadd (rtx dest, rtx x, rtx y)
emit_move_insn (dest, gen_lowpart (mode, xmode_dest));
}
+/* Generate a REG rtx of Xmode from the given rtx and mode.
+ The rtx x can be REG (QI/HI/SI/DI) or const_int.
+ The machine_mode mode is the original mode from define pattern.
+
+ If rtx is REG, the gen_lowpart of Xmode will be returned.
+
+ If rtx is const_int, a new REG rtx will be created to hold the value of
+ const_int and then returned.
+
+ According to the gccint doc, the constants generated for modes with fewer
+ bits than in HOST_WIDE_INT must be sign extended to full width. Thus there
+ will be two cases here, take QImode as example.
+
+ For .SAT_SUB (127, y) in QImode, we have (const_int 127) and one simple
+ mov from const_int to the new REG rtx is good enough here.
+
+ For .SAT_SUB (254, y) in QImode, we have (const_int -2) after define_expand.
+ Aka 0xfffffffffffffffe in Xmode of RV64 but we actually need 0xfe in Xmode
+ of RV64. So we need to cleanup the highest 56 bits of the new REG rtx moved
+ from the (const_int -2).
+
+ Then the underlying expanding can perform the code generation based on
+ the REG rtx of Xmode, instead of taking care of these in expand func. */
+
+static rtx
+riscv_gen_unsigned_xmode_reg (rtx x, machine_mode mode)
+{
+  if (!CONST_INT_P (x))
+    return gen_lowpart (Xmode, x);
+
+  rtx xmode_x = gen_reg_rtx (Xmode);
+  HOST_WIDE_INT cst = INTVAL (x);
+
+  emit_move_insn (xmode_x, x);
+
+  int xmode_bits = GET_MODE_BITSIZE (Xmode);
+  int mode_bits = GET_MODE_BITSIZE (mode).to_constant ();
+
+  if (cst < 0 && mode_bits < xmode_bits)
+    {
+      int shift_bits = xmode_bits - mode_bits;
+
+      riscv_emit_binary (ASHIFT, xmode_x, xmode_x, GEN_INT (shift_bits));
+      riscv_emit_binary (LSHIFTRT, xmode_x, xmode_x, GEN_INT (shift_bits));
+    }
Isn't this a zero_extension?
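To make the question concrete, here is a small host-side C model, not GCC
code. The helper names are made up for illustration, and it assumes a 64-bit
Xmode (RV64) and a narrower mode of fewer than 64 bits. Under those
assumptions the shift pair appears to compute the same value as a plain zero
extension from the narrow mode:

```c
#include <stdint.h>

/* Hypothetical model of the ASHIFT + LSHIFTRT pair applied to a
   sign-extended const_int.  Assumes 0 < mode_bits < 64.  */
static uint64_t
clear_high_bits_by_shifts (int64_t cst, unsigned mode_bits)
{
  unsigned shift_bits = 64 - mode_bits;
  uint64_t x = (uint64_t) cst;

  x <<= shift_bits;  /* ASHIFT: drop the sign-extended high bits.  */
  x >>= shift_bits;  /* LSHIFTRT: logical shift, fills with zeros.  */
  return x;
}

/* Hypothetical model of a zero extension from mode_bits to 64 bits.
   Assumes 0 < mode_bits < 64.  */
static uint64_t
zero_extend_by_mask (int64_t cst, unsigned mode_bits)
{
  uint64_t mask = (UINT64_C (1) << mode_bits) - 1;
  return (uint64_t) cst & mask;
}
```

For the QImode case from the comment, (const_int -2) becomes 0xfe with either
helper, while (const_int 127) is left unchanged.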