[PATCH v2] RISC-V: Support {U}INT64 to FP16 auto-vectorization
From: Pan Li

Update in v2:

* Add math trap check.
* Adjust some test cases.

Original logs:

This patch supports auto-vectorization of the INT64 to FP16 conversion.
The conversion takes the two steps below.

* INT64 to FP32.
* FP32 to FP16.

Given sample code as below:

void
test_func (int64_t * __restrict a, _Float16 *b, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    b[i] = (_Float16) (a[i]);
}

Before this patch:
test.c:6:26: missed: couldn't vectorize loop
test.c:6:26: missed: not vectorized: unsupported data-type
  ld      a0,0(s0)
  call    __floatdihf
  fsh     fa0,0(s1)
  addi    s0,s0,8
  addi    s1,s1,2
  bne     s2,s0,.L3
  ld      ra,24(sp)
  ld      s0,16(sp)
  ld      s1,8(sp)
  ld      s2,0(sp)
  addi    sp,sp,32

After this patch:
  vsetvli      a5,a2,e8,mf8,ta,ma
  vle64.v      v1,0(a0)
  vsetvli      a4,zero,e32,mf2,ta,ma
  vfncvt.f.x.w v1,v1
  vsetvli      zero,zero,e16,mf4,ta,ma
  vfncvt.f.f.w v1,v1
  vsetvli      zero,a2,e16,mf4,ta,ma
  vse16.v      v1,0(a1)

Please note that VLS mode is also involved in this patch and is covered
by the test cases.

	PR target/111506

gcc/ChangeLog:

	* config/riscv/autovec.md (2): New pattern.
	* config/riscv/vector-iterators.md: New iterator.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/cvt-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/cvt-1.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/cvt-0.c: New test.

Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 24 ++ gcc/config/riscv/vector-iterators.md | 38 +++ .../gcc.target/riscv/rvv/autovec/unop/cvt-0.c | 21 + .../gcc.target/riscv/rvv/autovec/unop/cvt-1.c | 22 + .../gcc.target/riscv/rvv/autovec/vls/cvt-0.c | 47 +++ 5 files changed, 152 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/cvt-0.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index cd0cbdd2889..d6cf376ebca 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -974,6 +974,30 @@ (define_insn_and_split "2" } [(set_attr "type" "vfncvtitof")]) +;; This operation can be performed in the loop vectorizer but unfortunately +;; not applicable for now. We can remove this pattern after loop vectorizer +;; is able to take care of INT64 to FP16 conversion. +(define_insn_and_split "2" + [(set (match_operand: 0 "register_operand") + (any_float: + (match_operand:VWWCONVERTI 1 "register_operand")))] + "TARGET_VECTOR && TARGET_ZVFH && can_create_pseudo_p () && !flag_trapping_math" + "#" + "&& 1" + [(const_int 0)] + { +rtx single = gen_reg_rtx (mode); /* Get vector SF mode. */ + +/* Step-1, INT64 => FP32. */ +emit_insn (gen_2 (single, operands[1])); +/* Step-2, FP32 => FP16. 
*/ +emit_insn (gen_trunc2 (operands[0], single)); + +DONE; + } + [(set_attr "type" "vfncvtitof")] +) + ;; = ;; == Unary arithmetic ;; = diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index b6cd872eb42..c9a7344b1bc 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -1247,6 +1247,24 @@ (define_mode_iterator VWCONVERTI [ (V512DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 4096") ]) +(define_mode_iterator VWWCONVERTI [ + (RVVM8DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH") + (RVVM4DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH") + (RVVM2DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH") + (RVVM1DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH") + + (V1DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH") + (V2DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH") + (V4DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH") + (V8DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 64") + (V16DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 128") + (V32DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 256") + (V64DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 512") + (V128DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 1024") + (V256DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 2048") + (V512DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 4096") +]) + (define_mode_iterator VQEXTI [ RVVM8SI RVVM4SI
[PATCH v1] RISC-V: Update comments for FP rounding related autovec
From: Pan Li

Some comments are out of date; this patch fixes them.

gcc/ChangeLog:

	* config/riscv/autovec.md: Update comments.

Signed-off-by: Pan Li
---
 gcc/config/riscv/autovec.md | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 056f2c352f6..53e9d34eea1 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2229,12 +2229,16 @@ (define_expand "avg3_ceil"
 })
 
 ;; -
-;; [FP] Math.h.
+;; [FP] Rounding.
 ;; -
 ;; Includes:
 ;; - ceil/ceilf
 ;; - floor/floorf
 ;; - nearbyint/nearbyintf
+;; - rint/rintf
+;; - round/roundf
+;; - trunc/truncf
+;; - roundeven/roundevenf
 ;; -
 (define_expand "ceil2"
   [(match_operand:V_VLSF 0 "register_operand")
-- 
2.34.1

[PATCH v1] RISC-V: Bugfix for legitimize address PR/111634
From: Pan Li

Given we have RTL as below.

(plus:DI (mult:DI (reg:DI 138 [ g.4_6 ])
                  (const_int 8 [0x8]))
         (lo_sum:DI (reg:DI 167)
                    (symbol_ref:DI ("f") [flags 0x86] )
))

When handling the (plus (plus (mult (a) (mem_shadd_constant)) (fp)) (C))
case, fp will be the lo_sum operand shown above. The code assumes that fp
is a reg, which does not hold here, so it hits an ICE when GCC is built
with the option --enable-checking=rtl.

This patch fixes it by adding a REG_P check to ensure the operand is a
register before its register number is read. The test case
gcc/testsuite/gcc.dg/pr109417.c covers this fix when built with
--enable-checking=rtl.

	PR target/111634

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_legitimize_address): Bugfix.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index d5446b63dbf..2b839241f1a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2042,7 +2042,7 @@ riscv_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
 	{
 	  rtx index = XEXP (base, 0);
 	  rtx fp = XEXP (base, 1);
-	  if (REGNO (fp) == VIRTUAL_STACK_VARS_REGNUM)
+	  if (REG_P (fp) && REGNO (fp) == VIRTUAL_STACK_VARS_REGNUM)
 	    {
 	      /* If we were given a MULT, we must fix the constant
-- 
2.34.1

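The guard in this fix can be illustrated in isolation. The sketch below is only a model under stated assumptions: `toy_rtx`, `toy_regno`, `toy_is_frame_base` and `TOY_VIRTUAL_STACK_VARS_REGNUM` are hypothetical stand-ins, not GCC's real rtx definitions. It shows why testing the code first keeps a checked accessor from firing on a non-register operand such as lo_sum.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC's rtx machinery (assumption, not the
   real definitions).  */
enum toy_code { TOY_REG, TOY_LO_SUM };

struct toy_rtx
{
  enum toy_code code;
  unsigned regno; /* Only meaningful when code == TOY_REG.  */
};

/* Arbitrary value chosen for this sketch.  */
#define TOY_VIRTUAL_STACK_VARS_REGNUM 70u

/* Checked accessor: aborts on a non-register, loosely modelling what an
   rtl-checking build does for REGNO.  */
static unsigned
toy_regno (const struct toy_rtx *x)
{
  assert (x->code == TOY_REG);
  return x->regno;
}

/* The short-circuit && ensures toy_regno is never evaluated for a
   non-register operand.  */
static int
toy_is_frame_base (const struct toy_rtx *fp)
{
  return fp->code == TOY_REG
	 && toy_regno (fp) == TOY_VIRTUAL_STACK_VARS_REGNUM;
}
```

Calling `toy_is_frame_base` on a `TOY_LO_SUM` node simply yields 0, whereas calling `toy_regno` on it directly would trip the assertion.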
[PATCH v1] RISC-V: Add more run test for FP rounding autovec
From: Pan Li

For _Float16 types, add run tests for:

* ceil
* floor
* nearbyint
* rint
* round
* roundeven
* trunc

For float and double, add run tests for:

* roundeven

The zfa extension is required for these run test cases. For rv64, the
simulation target_board may look like below.

target_board="riscv-sim/-march=rv64gcv_zfa_zfh/-mabi=lp64d/-mcmodel=medlow"

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/rvv.exp: Add zfa for building.
	* gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-floor-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-rint-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-round-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-trunc-run-0.c: New test.

Signed-off-by: Pan Li --- .../riscv/rvv/autovec/unop/math-ceil-run-0.c | 39 +++ .../riscv/rvv/autovec/unop/math-floor-run-0.c | 39 +++ .../rvv/autovec/unop/math-nearbyint-run-0.c | 48 +++ .../riscv/rvv/autovec/unop/math-rint-run-0.c | 48 +++ .../riscv/rvv/autovec/unop/math-round-run-0.c | 39 +++ .../rvv/autovec/unop/math-roundeven-run-0.c | 39 +++ .../rvv/autovec/unop/math-roundeven-run-1.c | 39 +++ .../rvv/autovec/unop/math-roundeven-run-2.c | 39 +++ .../riscv/rvv/autovec/unop/math-trunc-run-0.c | 39 +++ gcc/testsuite/gcc.target/riscv/rvv/rvv.exp| 4 +- 10 files changed, 371 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-rint-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-round-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-trunc-run-0.c diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c new file mode 100644 index 000..70cba3602bb --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c @@ -0,0 +1,39 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */ + +#include "test-math.h" + +#define ARRAY_SIZE 128 + +_Float16 in[ARRAY_SIZE]; +_Float16 out[ARRAY_SIZE]; +_Float16 ref[ARRAY_SIZE]; + +TEST_UNARY_CALL (_Float16, 
__builtin_ceilf16) +TEST_ASSERT (_Float16) + +TEST_INIT (_Float16, 1.2, 2.0, 1) +TEST_INIT (_Float16, -1.2, -1.0, 2) +TEST_INIT (_Float16, 3.0, 3.0, 3) +TEST_INIT (_Float16, 1023.5, 1024.0, 4) +TEST_INIT (_Float16, 1024.0, 1024.0, 5) +TEST_INIT (_Float16, 0.0, 0.0, 6) +TEST_INIT (_Float16, -0.0, -0.0, 7) +TEST_INIT (_Float16, -1023.5, -1023.0, 8) +TEST_INIT (_Float16, -1024.0, -1024.0, 9) + +int +main () +{ + RUN_TEST (_Float16, 1, __builtin_ceilf16, in, out, ref, ARRAY_SIZE); + RUN_TEST (_Float16, 2, __builtin_ceilf16, in, out, ref, ARRAY_SIZE); + RUN_TEST (_Float16, 3, __builtin_ceilf16, in, out, ref, ARRAY_SIZE); + RUN_TEST (_Float16, 4, __builtin_ceilf16, in, out, ref, ARRAY_SIZE); + RUN_TEST (_Float16, 5, __builtin_ceilf16, in, out, ref, ARRAY_SIZE); + RUN_TEST (_Float16, 6, __builtin_ceilf16, in, out, ref, ARRAY_SIZE); + RUN_TEST (_Float16, 7, __builtin_ceilf16, in, out, ref, ARRAY_SIZE); + RUN_TEST (_Float16, 8, __builtin_ceilf16, in, out, ref, ARRAY_SIZE); + RUN_TEST (_Float16, 9, __builtin_ceilf16, in, out, ref, ARRAY_SIZE); + + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-run-0.c new file mode 100644 index 000..c542278c1f5 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-run-0.c @@ -0,0 +1,39 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */ + +#include "test-math.h" + +#define ARRAY_SIZE 128 + +_Float16 in[ARRAY_SIZE]; +_Float16 out[ARRAY_SIZE]; +_Float16 ref[ARRAY_SIZE]; + +TEST_UNARY_CALL (_Float16, __built
[PATCH v1] RISC-V: Refine bswap16 auto vectorization code gen
From: Pan Li

This patch refines the code gen for bswap16.

We will have VEC_PERM_EXPR after rtl expand when invoking
__builtin_bswap. It generates about 9 instructions in the loop as below,
no matter whether it is bswap16, bswap32 or bswap64.

  .L2:
1 vle16.v v4,0(a0)
2 vmv.v.x v2,a7
3 vand.vv v2,v6,v2
4 slli    a2,a5,1
5 vrgatherei16.vv v1,v4,v2
6 sub     a4,a4,a5
7 vse16.v v1,0(a3)
8 add     a0,a0,a2
9 add     a3,a3,a2
  bne     a4,zero,.L2

But for bswap16 we can have an even simpler code gen, which has only 7
instructions in the loop as below.

  .L5:
1 vle8.v  v2,0(a5)
2 addi    a5,a5,32
3 vsrl.vi v4,v2,8
4 vsll.vi v2,v2,8
5 vor.vv  v4,v4,v2
6 vse8.v  v4,0(a4)
7 addi    a4,a4,32
  bne     a5,a6,.L5

Unfortunately, this approach makes the instruction count in the loop
grow to 13 and 24 for bswap32 and bswap64 respectively. Thus, we refine
the code gen for bswap16 only, and leave both bswap32 and bswap64 as is.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (emit_vec_sll_scalar): New helper func
	impl to emit vsll.vi/vsll.vx.
	(emit_vec_srl_scalar): Likewise for vsrl.vi/vsrl.vx.
	(emit_vec_or): Likewise for vor.vv.
	(shuffle_bswap_pattern): New func impl for shuffle bswap.
	(expand_vec_perm_const_1): Add shuffle bswap pattern.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/perm-4.c: Adjust checker.
	* gcc.target/riscv/rvv/autovec/unop/bswap16-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/bswap16-0.c: New test.

Signed-off-by: Pan Li --- gcc/config/riscv/riscv-v.cc | 117 ++ .../riscv/rvv/autovec/unop/bswap16-0.c| 17 +++ .../riscv/rvv/autovec/unop/bswap16-run-0.c| 44 +++ .../riscv/rvv/autovec/vls/bswap16-0.c | 34 + .../gcc.target/riscv/rvv/autovec/vls/perm-4.c | 4 +- 5 files changed, 214 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/bswap16-0.c diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index 23633a2a74d..3e3b5f2e797 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -878,6 +878,33 @@ emit_vlmax_decompress_insn (rtx target, rtx op0, rtx op1, rtx mask) emit_vlmax_masked_gather_mu_insn (target, op1, sel, mask); } +static void +emit_vec_sll_scalar (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode) +{ + rtx sll_ops[] = {op_0, op_1, op_2}; + insn_code icode = code_for_pred_scalar (ASHIFT, vec_mode); + + emit_vlmax_insn (icode, BINARY_OP, sll_ops); +} + +static void +emit_vec_srl_scalar (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode) +{ + rtx srl_ops[] = {op_0, op_1, op_2}; + insn_code icode = code_for_pred_scalar (LSHIFTRT, vec_mode); + + emit_vlmax_insn (icode, BINARY_OP, srl_ops); +} + +static void +emit_vec_or (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode) +{ + rtx or_ops[] = {op_0, op_1, op_2}; + insn_code icode = code_for_pred (IOR, vec_mode); + + emit_vlmax_insn (icode, BINARY_OP, or_ops); +} + /* Emit merge instruction. 
*/ static machine_mode @@ -3030,6 +3057,94 @@ shuffle_decompress_patterns (struct expand_vec_perm_d *d) return true; } +static bool +shuffle_bswap_pattern (struct expand_vec_perm_d *d) +{ + HOST_WIDE_INT diff; + unsigned i, size, step; + + if (!d->one_vector_p || !d->perm[0].is_constant (&diff) || !diff) +return false; + + step = diff + 1; + size = step * GET_MODE_UNIT_BITSIZE (d->vmode); + + switch (size) +{ +case 16: + break; +case 32: +case 64: + /* We will have VEC_PERM_EXPR after rtl expand when invoking +__builtin_bswap. It will generate about 9 instructions in +loop as below, no matter it is bswap16, bswap32 or bswap64. + .L2: +1 vle16.v v4,0(a0) +2 vmv.v.x v2,a7 +3 vand.vv v2,v6,v2 +4 sllia2,a5,1 +5 vrgatherei16.vv v1,v4,v2 +6 sub a4,a4,a5 +7 vse16.v v1,0(a3) +8 add a0,a0,a2 +9 add a3,a3,a2 + bne a4,zero,.L2 + +But for bswap16 we may have a even simple code gen, which +has only 7 instructions in loop as below. + .L5 +1 vle8.v v2,0(a5) +2 addia5,a5,32 +3 vsrl.vi v4,v2,8 +4 vsll.vi v2,v2,8 +5 vor.vv v4,v4,v2 +6 vse8.v v4,0(a4) +7 addia4,a4,32 + bne a5,a6,.L5 + +Unfortunately, the instructions in loop will grow to 13 and 24 +for bswap32 and bswap64. Thus, we will leverage vrgather (9 insn) +for both the bswap64 and bswap32, but take shift and or (7 insn) +for bswap16. + */ +default: + return false; +} + + for (i = 0; i < step; i++) +if (!d->p
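The shift-and-or sequence that this patch emits for bswap16 can be written as a scalar loop. This is a minimal sketch rather than the patch's code; the function name `bswap16_shift_or` is an assumption made for illustration. Each statement is annotated with the vector instruction from the 7-instruction loop it corresponds to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of every 16-bit element using only shifts and an
   inclusive or.  The name is hypothetical and used for this sketch only.  */
static void
bswap16_shift_or (uint16_t *dst, const uint16_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      uint16_t hi = (uint16_t) (src[i] >> 8); /* like vsrl.vi v4,v2,8 */
      uint16_t lo = (uint16_t) (src[i] << 8); /* like vsll.vi v2,v2,8 */
      dst[i] = (uint16_t) (hi | lo);          /* like vor.vv v4,v4,v2 */
    }
}
```

For example, an input element 0x1234 becomes 0x3412. The same idea needs more shifts, masks and ors for 32-bit and 64-bit elements, which is consistent with the larger instruction counts the commit message reports for bswap32 and bswap64.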
[PATCH v2] RISC-V: Refine bswap16 auto vectorization code gen
From: Pan Li

Update in v2:

* Remove emit helper functions.
* Use expand_binop instead.

Original log:

This patch refines the code gen for bswap16.

We will have VEC_PERM_EXPR after rtl expand when invoking
__builtin_bswap. It generates about 9 instructions in the loop as below,
no matter whether it is bswap16, bswap32 or bswap64.

  .L2:
1 vle16.v v4,0(a0)
2 vmv.v.x v2,a7
3 vand.vv v2,v6,v2
4 slli    a2,a5,1
5 vrgatherei16.vv v1,v4,v2
6 sub     a4,a4,a5
7 vse16.v v1,0(a3)
8 add     a0,a0,a2
9 add     a3,a3,a2
  bne     a4,zero,.L2

But for bswap16 we can have an even simpler code gen, which has only 7
instructions in the loop as below.

  .L5:
1 vle8.v  v2,0(a5)
2 addi    a5,a5,32
3 vsrl.vi v4,v2,8
4 vsll.vi v2,v2,8
5 vor.vv  v4,v4,v2
6 vse8.v  v4,0(a4)
7 addi    a4,a4,32
  bne     a5,a6,.L5

Unfortunately, this approach makes the instruction count in the loop
grow to 13 and 24 for bswap32 and bswap64 respectively. Thus, we refine
the code gen for bswap16 only, and leave both bswap32 and bswap64 as is.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (shuffle_bswap_pattern): New func impl
	for shuffle bswap.
	(expand_vec_perm_const_1): Add handling for shuffle bswap pattern.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/perm-4.c: Adjust checker.
	* gcc.target/riscv/rvv/autovec/unop/bswap16-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/bswap16-0.c: New test.

Signed-off-by: Pan Li --- gcc/config/riscv/riscv-v.cc | 91 +++ .../riscv/rvv/autovec/unop/bswap16-0.c| 17 .../riscv/rvv/autovec/unop/bswap16-run-0.c| 44 + .../riscv/rvv/autovec/vls/bswap16-0.c | 34 +++ .../gcc.target/riscv/rvv/autovec/vls/perm-4.c | 4 +- 5 files changed, 188 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/bswap16-0.c diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index 23633a2a74d..c72e411f125 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -3030,6 +3030,95 @@ shuffle_decompress_patterns (struct expand_vec_perm_d *d) return true; } +static bool +shuffle_bswap_pattern (struct expand_vec_perm_d *d) +{ + HOST_WIDE_INT diff; + unsigned i, size, step; + + if (!d->one_vector_p || !d->perm[0].is_constant (&diff) || !diff) +return false; + + step = diff + 1; + size = step * GET_MODE_UNIT_BITSIZE (d->vmode); + + switch (size) +{ +case 16: + break; +case 32: +case 64: + /* We will have VEC_PERM_EXPR after rtl expand when invoking +__builtin_bswap. It will generate about 9 instructions in +loop as below, no matter it is bswap16, bswap32 or bswap64. + .L2: +1 vle16.v v4,0(a0) +2 vmv.v.x v2,a7 +3 vand.vv v2,v6,v2 +4 sllia2,a5,1 +5 vrgatherei16.vv v1,v4,v2 +6 sub a4,a4,a5 +7 vse16.v v1,0(a3) +8 add a0,a0,a2 +9 add a3,a3,a2 + bne a4,zero,.L2 + +But for bswap16 we may have a even simple code gen, which +has only 7 instructions in loop as below. + .L5 +1 vle8.v v2,0(a5) +2 addia5,a5,32 +3 vsrl.vi v4,v2,8 +4 vsll.vi v2,v2,8 +5 vor.vv v4,v4,v2 +6 vse8.v v4,0(a4) +7 addia4,a4,32 + bne a5,a6,.L5 + +Unfortunately, the instructions in loop will grow to 13 and 24 +for bswap32 and bswap64. Thus, we will leverage vrgather (9 insn) +for both the bswap64 and bswap32, but take shift and or (7 insn) +for bswap16. 
+ */ +default: + return false; +} + + for (i = 0; i < step; i++) +if (!d->perm.series_p (i, step, diff - i, step)) + return false; + + if (d->testing_p) +return true; + + machine_mode vhi_mode; + poly_uint64 vhi_nunits = exact_div (GET_MODE_NUNITS (d->vmode), 2); + + if (!get_vector_mode (HImode, vhi_nunits).exists (&vhi_mode)) +return false; + + /* Step-1: Move op0 to src with VHI mode. */ + rtx src = gen_reg_rtx (vhi_mode); + emit_move_insn (src, gen_lowpart (vhi_mode, d->op0)); + + /* Step-2: Shift right 8 bits to dest. */ + rtx dest = expand_binop (vhi_mode, lshr_optab, src, gen_int_mode (8, Pmode), + NULL_RTX, 0, OPTAB_DIRECT); + + /* Step-3: Shift left 8 bits to src. */ + src = expand_binop (vhi_mode, ashl_optab, src, gen_int_mode (8, Pmode), + NULL_RTX, 0, OPTAB_DIRECT); + + /* Step-4: Logic Or dest and src to dest. */ + dest = expand_binop (vhi_mode, ior_optab, dest, src, + NULL_RTX, 0, OPTAB_DIRECT); + + /* Step-5: Move src to target with VQI mode. */ + emit_move
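The selector test in shuffle_bswap_pattern (step = diff + 1, followed by the series_p check for each i below step) can be modelled on a plain array of indices. The sketch below is an approximation under assumptions: `is_bswap_selector` and `bswap_width_bits` are hypothetical names, the selector is a fixed-length array rather than GCC's encoded permutation, and the function reports the width for any step instead of accepting 16 bits only as the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true when SEL (N element indices) reverses every group of STEP
   elements, i.e. sel[k + i] == k + (step - 1 - i) for each group start K.
   Hypothetical helper for this sketch.  */
static bool
is_bswap_selector (const unsigned *sel, size_t n, unsigned step)
{
  if (step < 2 || n % step != 0)
    return false;
  for (size_t k = 0; k < n; k += step)
    for (unsigned i = 0; i < step; i++)
      if (sel[k + i] != k + (step - 1 - i))
	return false;
  return true;
}

/* Derive STEP from the first selector element (step = sel[0] + 1, as the
   patch does with diff) and return the swapped width in bits, or 0 when
   the selector is not a byte swap.  */
static unsigned
bswap_width_bits (const unsigned *sel, size_t n, unsigned unit_bits)
{
  if (n == 0 || sel[0] == 0)
    return 0;
  unsigned step = sel[0] + 1;
  return is_bswap_selector (sel, n, step) ? step * unit_bits : 0;
}
```

With 8-bit units, the selector {1, 0, 3, 2, ...} is reported as a 16-bit swap and {3, 2, 1, 0, 7, 6, 5, 4} as a 32-bit swap; the patch would take the shift-and-or path for the former and return false for the latter.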
[PATCH v1] RISC-V: Support FP lrint/lrintf auto vectorization
From: Pan Li

This patch supports the FP lrint/lrintf auto vectorization.

* long lrint (double) for rv64
* long lrintf (float) for rv32

Because the vectorizer only allows source and destination data types of
the same size, the standard name lrintmn2 only acts on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lrint (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrint (in[i]);
}

Before this patch:
.L3:
  ...
  fld      fa5,0(a1)
  fcvt.l.d a5,fa5,dyn
  sd       a5,-8(a0)
  ...
  bne      a1,a4,.L3

After this patch:
.L3:
  ...
  vsetvli     a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli     zero,a2,e64,m1,ta,ma
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3

The remaining cases, such as SF => DI, HF => DI, DF => SI and HF => SI,
will be covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

	* config/riscv/autovec.md (lrint2): New pattern for lrint/lrintf.
	* config/riscv/riscv-protos.h (expand_vec_lrint): New func decl
	for expanding lrint.
	* config/riscv/riscv-v.cc (emit_vec_cvt_x_f): New helper func impl
	for vfcvt.x.f.v.
	(expand_vec_lrint): New function impl for expanding lrint.
	* config/riscv/vector-iterators.md: New mode attr and iterator.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/test-math.h: New define for
	CVT like test case.
	* gcc.target/riscv/rvv/autovec/vls/def.h: Ditto.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-lrint-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-lrint-1.c: New test.

Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 11 +++ gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv-v.cc | 20 ++ gcc/config/riscv/vector-iterators.md | 69 +++ .../riscv/rvv/autovec/unop/math-lrint-0.c | 14 .../riscv/rvv/autovec/unop/math-lrint-1.c | 14 .../riscv/rvv/autovec/unop/math-lrint-run-0.c | 63 + .../riscv/rvv/autovec/unop/math-lrint-run-1.c | 63 + .../riscv/rvv/autovec/unop/test-math.h| 24 +++ .../gcc.target/riscv/rvv/autovec/vls/def.h| 9 +++ .../riscv/rvv/autovec/vls/math-lrint-0.c | 30 .../riscv/rvv/autovec/vls/math-lrint-1.c | 30 12 files changed, 348 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lrint-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lrint-1.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index 53e9d34eea1..dc76a01d82c 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2239,6 +2239,7 @@ (define_expand "avg3_ceil" ;; - round/roundf ;; - trunc/truncf ;; - roundeven/roundevenf +;; - lrint/lrintf ;; - (define_expand "ceil2" [(match_operand:V_VLSF 0 "register_operand") @@ -2309,3 +2310,13 @@ (define_expand "roundeven2" DONE; } ) + +(define_expand "lrint2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_FCONVERTL 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" + { +riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode); +DONE; + } +) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 43426a5326b..f6bd15b47b0 100644 --- a/gcc/config/riscv/riscv-protos.h +++ 
b/gcc/config/riscv/riscv-protos.h @@ -474,6 +474,7 @@ void expand_vec_rint (rtx, rtx, machine_mode, machine_mode); void expand_vec_round (rtx, rtx, machine_mode, machine_mode); void expand_vec_trunc (rtx, rtx, machine_mode, machine_mode); void expand_vec_roundeven (rtx, rtx, machine_mode, machine_mode); +void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode); #endif bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode, bool, void (*)(rtx *, rtx)); diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index c72e411f125..64f99d85d91 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -3911,6 +3911,16 @@ emit_vec_cvt_x_f (rtx op_dest, rtx op_src, rtx mask, emit
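For reference, the per-element result that the vectorized lrint loop is expected to produce can be modelled without libm. The sketch below assumes the default dynamic rounding mode (round to nearest, ties to even) and inputs with |x| < 2^52 that fit in long, so that the truncation and the subtraction are exact; `round_ties_even` and `lrint_loop` are hypothetical names for illustration only.

```c
#include <assert.h>

/* Round to nearest, ties to even: a model of lrint under the default
   rounding mode.  Assumes |x| < 2^52 and that the result fits in long.  */
static long
round_ties_even (double x)
{
  long t = (long) x;            /* Truncate toward zero.  */
  double frac = x - (double) t; /* Exact; lies in (-1, 1).  */

  if (frac > 0.5 || (frac == 0.5 && (t & 1)))
    return t + 1;
  if (frac < -0.5 || (frac == -0.5 && (t & 1)))
    return t - 1;
  return t;
}

/* Scalar shape of the loop from the commit message.  */
static void
lrint_loop (long *out, const double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = round_ties_even (in[i]);
}
```

Under these assumptions 0.5 and 2.5 round to 0 and 2, while 1.5 rounds to 2 and -1.5 to -2. A different dynamic rounding mode would change the ties and the direction of rounding.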
[PATCH v1] RISC-V: Support FP irintf auto vectorization
From: Pan Li

This patch supports the FP irintf auto vectorization.

* int irintf (float)

Because the vectorizer only allows source and destination data types of
the same size, the standard name lrintmn2 only acts on SF => SI.

Given we have code like:

void
test_irintf (int *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_irintf (in[i]);
}

Before this patch:
.L3:
  ...
  flw      fa5,0(a1)
  fcvt.w.s a5,fa5,dyn
  sw       a5,-4(a0)
  ...
  bne      a1,a4,.L3

After this patch:
.L3:
  ...
  vle32.v     v1,0(a1)
  vfcvt.x.f.v v1,v1
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3

The remaining cases, such as DF => SI and HF => SI, will be covered by
the hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

	* config/riscv/autovec.md (lrint2): Rename from.
	(lrint2): Rename to.
	* config/riscv/vector-iterators.md: Rename and remove TARGET_64BIT.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/math-irint-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-irint-0.c: New test.

Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 9 ++- gcc/config/riscv/vector-iterators.md | 74 +-- .../riscv/rvv/autovec/unop/math-irint-0.c | 14 .../riscv/rvv/autovec/unop/math-irint-run-0.c | 63 .../riscv/rvv/autovec/vls/math-irint-0.c | 30 5 files changed, 149 insertions(+), 41 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-irint-0.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index dc76a01d82c..c3a51e22ceb 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2240,6 +2240,7 @@ (define_expand "avg3_ceil" ;; - trunc/truncf ;; - roundeven/roundevenf ;; - lrint/lrintf +;; - irintf ;; - (define_expand "ceil2" [(match_operand:V_VLSF 0 "register_operand") @@ -2311,12 +2312,12 @@ (define_expand "roundeven2" } ) -(define_expand "lrint2" - [(match_operand: 0 "register_operand") - (match_operand:V_VLS_FCONVERTL 1 "register_operand")] +(define_expand "lrint2" + [(match_operand:0 "register_operand") + (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" { -riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode); +riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode); DONE; } ) diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index bb0c46ea30a..96ddd34c958 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -3281,8 +3281,8 @@ (define_mode_attr vnnconvert [ (V512DI "v512hf") ]) -;; L indicates convert to long -(define_mode_attr VLCONVERT [ +;; Convert to int, long and long long +(define_mode_attr V_I_L_LL_CONVERT [ (RVVM8SF "RVVM8SI") (RVVM4SF "RVVM4SI") (RVVM2SF "RVVM2SI") (RVVM1SF "RVVM1SI") (RVVMF2SF "RVVMF2SI") @@ -3298,7 +3298,7 
@@ (define_mode_attr VLCONVERT [ (V512DF "V512DI") ]) -(define_mode_attr vlconvert [ +(define_mode_attr v_i_l_ll_convert [ (RVVM8SF "rvvm8si") (RVVM4SF "rvvm4si") (RVVM2SF "rvvm2si") (RVVM1SF "rvvm1si") (RVVMF2SF "rvvmf2si") @@ -3314,40 +3314,40 @@ (define_mode_attr vlconvert [ (V512DF "v512di") ]) -(define_mode_iterator V_VLS_FCONVERTL [ - (RVVM8SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT") - (RVVM4SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT") - (RVVM2SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT") - (RVVM1SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT") - (RVVMF2SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN > 32") - - (RVVM8DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT") - (RVVM4DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT") - (RVVM2DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT") - (RVVM1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT") - - (V1SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT") - (V2SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT") - (V4SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT") - (V8SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT") - (V16SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 64") - (V32SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 128") - (V64SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 256") - (V128SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARG
[PATCH v1] RISC-V: Support FP llrint auto vectorization
From: Pan Li

This patch supports the FP llrint auto vectorization.

* long long llrint (double)

From the standard name's perspective this is the CVT from DF => DI,
which has already been covered by previous patches. Thus, this patch
only adds some test cases.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/test-math.h: Add type int64_t.
	* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-llrint-0.c: New test.

Signed-off-by: Pan Li
---
 .../riscv/rvv/autovec/unop/math-llrint-0.c | 14 +
 .../rvv/autovec/unop/math-llrint-run-0.c | 63 +++
 .../riscv/rvv/autovec/unop/test-math.h | 2 +
 .../riscv/rvv/autovec/vls/math-llrint-0.c | 30 +
 4 files changed, 109 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llrint-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
new file mode 100644
index 000..2d90d232ba1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llrint:
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+*/ +TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llrint) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c new file mode 100644 index 000..6b69f5568e9 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c @@ -0,0 +1,63 @@ +/* { dg-do run { target { riscv_v && rv64 } } } */ +/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */ + +#include "test-math.h" + +#define ARRAY_SIZE 128 + +double in[ARRAY_SIZE]; +int64_t out[ARRAY_SIZE]; +int64_t ref[ARRAY_SIZE]; + +TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llrint) +TEST_ASSERT (int64_t) + +TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llrint (1.2), 1) +TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llrint (-1.2), 2) +TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llrint (0.5), 3) +TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llrint (-0.5), 4) +TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llrint (0.1), 5) +TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llrint (-0.1), 6) +TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llrint (3.0), 7) +TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llrint (-3.0), 8) +TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llrint (4503599627370495.5), 9) +TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llrint (4503599627370497.0), 10) +TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llrint (-4503599627370495.5), 11) +TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llrint (-4503599627370496.0), 12) +TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llrint (-0.0), 13) +TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llrint (-0.0), 14) +TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llrint (9223372036854774784.0), 15) +TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, __builtin_llrint (9223372036854775808.0), 16) +TEST_INIT_CVT 
(double, -9223372036854775808.0, int64_t, __builtin_llrint (-9223372036854775808.0), 17) +TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, __builtin_llrint (-9223372036854777856.0), 18) +TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llrint (__builtin_inf ()), 19) +TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_llrint (-__builtin_inf ()), 20) +TEST_INIT_CVT (double, __builtin_nan (""), int64_t, 0x7fff, 21) + +int +main () +{ + RUN_TEST_CVT (double, int64_t, 1, __builtin_llrint, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 2, __builtin_llrint, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 3, __builtin_llrint, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 4, __builtin_llrint, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 5, __builtin_llrint, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 6, __builtin_llrint, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 7, __builtin_llrint, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 8, __builtin_
[PATCH v1] RISC-V: Support FP lround/lroundf auto vectorization
From: Pan Li

This patch would like to support the FP lround/lroundf auto vectorization.

* long lround (double) for rv64
* long lroundf (float) for rv32

Because the vectorizer only allows data types of the same size, the
standard name lroundmn2 only acts on DF => DI for rv64, and SF => SI
for rv32.

Given we have code like:

void
test_lround (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lround (in[i]);
}

Before this patch:
.L3:
  ...
  fld fa5,0(a1)
  fcvt.l.d a5,fa5,rmm
  sd a5,-8(a0)
  ...
  bne a1,a4,.L3

After this patch:
  frrm a6
  ...
  fsrmi 4 // RMM
.L3:
  ...
  vsetvli a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli zero,a2,e64,m1,ta,ma
  vse32.v v1,0(a0)
  ...
  bne a2,zero,.L3
  ...
  fsrm a6

The remaining cases, such as SF => DI/HF => DI/DF => SI/HF => SI, will be
covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

  * config/riscv/autovec.md (lround2): New pattern for lround/lroundf.
  * config/riscv/riscv-protos.h (enum insn_type): New enum value.
  (expand_vec_lround): New func decl for expanding lround.
  * config/riscv/riscv-v.cc (expand_vec_lround): New func impl for
  expanding lround.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/unop/math-lround-0.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-lround-1.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-lround-run-0.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-lround-run-1.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/math-lround-0.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/math-lround-1.c: New test.
Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 10 +++ gcc/config/riscv/riscv-protos.h | 2 + gcc/config/riscv/riscv-v.cc | 10 +++ .../riscv/rvv/autovec/unop/math-lround-0.c| 19 + .../riscv/rvv/autovec/unop/math-lround-1.c| 19 + .../rvv/autovec/unop/math-lround-run-0.c | 72 +++ .../rvv/autovec/unop/math-lround-run-1.c | 72 +++ .../riscv/rvv/autovec/vls/math-lround-0.c | 30 .../riscv/rvv/autovec/vls/math-lround-1.c | 30 9 files changed, 264 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lround-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lround-1.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index ebc51ea69fd..33b11723c21 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2321,3 +2321,13 @@ (define_expand "lrint2" DONE; } ) + +(define_expand "lround2" + [(match_operand:0 "register_operand") + (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" + { +riscv_vector::expand_vec_lround (operands[0], operands[1], mode, mode); +DONE; + } +) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 8c9f7e0ab11..b7eeeb8f55d 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -302,6 +302,7 @@ enum insn_type : unsigned int UNARY_OP_TAMA = __MASK_OP_TAMA | UNARY_OP_P, UNARY_OP_TAMU = __MASK_OP_TAMU | UNARY_OP_P, UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P, + UNARY_OP_FRM_RMM = UNARY_OP | FRM_RMM_P, UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P, UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | 
FRM_RUP_P, UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P, @@ -475,6 +476,7 @@ void expand_vec_round (rtx, rtx, machine_mode, machine_mode); void expand_vec_trunc (rtx, rtx, machine_mode, machine_mode); void expand_vec_roundeven (rtx, rtx, machine_mode, machine_mode); void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode); +void expand_vec_lround (rtx, rtx, machine_mode, machine_mode); #endif bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode, bool, void (*)(rtx *, rtx)); diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index a75eb59eb43..b61c745678b 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -4122,4 +4122,14 @@ expand_vec_lrint (rtx op_0, rtx op_1, machine_mode vec_fp_mode, emit_vec_cvt_x_f (op_0, op_1, UNARY_OP_FRM_DYN, vec_fp_mode); } +void +expand_vec_lround (rtx op_0, rtx op_1, machine_mode vec_fp_mode, + machine_mode vec_long_mode) +{ + gcc_assert (known_eq (GET_MODE_SIZE (vec_fp
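
A note on why lround gets its own UNARY_OP_FRM_RMM while expand_vec_lrint uses UNARY_OP_FRM_DYN: lround rounds to nearest with ties away from zero, whereas the usual default mode breaks ties to even. The scalar sketch below only illustrates that difference. It is not code from the patch; model_round_rmm and model_round_rne are hypothetical helper names, and the models assume inputs whose magnitude is well inside the range of long.

```c
#include <assert.h>

/* Hypothetical model of round-to-nearest, ties away from zero
   (RMM, the mode selected by "fsrmi 4" in the log above).  */
static long
model_round_rmm (double x)
{
  assert (x > -9.0e18 && x < 9.0e18);  /* keep the cast below defined */
  long t = (long) x;                    /* truncate toward zero */
  double frac = x - (double) t;         /* signed fractional part */
  if (frac >= 0.5)
    return t + 1;
  if (frac <= -0.5)
    return t - 1;
  return t;
}

/* Hypothetical model of round-to-nearest, ties to even (RNE),
   shown only for contrast with RMM.  */
static long
model_round_rne (double x)
{
  assert (x > -9.0e18 && x < 9.0e18);
  long t = (long) x;
  double frac = x - (double) t;
  if (frac > 0.5 || (frac == 0.5 && (t & 1)))
    return t + 1;
  if (frac < -0.5 || (frac == -0.5 && (t & 1)))
    return t - 1;
  return t;
}
```

The two models agree everywhere except on exact halves: for 0.5 and 2.5 the RMM model gives 1 and 3, while the RNE model gives 0 and 2. That is the case the 0.5 and -0.5 entries in the run tests exercise.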
[PATCH v1] RISC-V: Support FP lceil/lceilf auto vectorization
From: Pan Li

This patch would like to support the FP lceil/lceilf auto vectorization.

* long lceil (double) for rv64
* long lceilf (float) for rv32

Because the vectorizer only allows data types of the same size, the
standard name lceilmn2 only acts on DF => DI for rv64, and SF => SI
for rv32.

Given we have code like:

void
test_lceil (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lceil (in[i]);
}

Before this patch:
.L3:
  ...
  fld fa5,0(a1)
  fcvt.l.d a5,fa5,rup
  sd a5,-8(a0)
  ...
  bne a1,a4,.L3

After this patch:
  frrm a6
  ...
  fsrmi 3 // RUP
.L3:
  ...
  vsetvli a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli zero,a2,e64,m1,ta,ma
  vse32.v v1,0(a0)
  ...
  bne a2,zero,.L3
  ...
  fsrm a6

The remaining cases, such as SF => DI/HF => DI/DF => SI/HF => SI, will be
covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

  * config/riscv/autovec.md (lceil2): New pattern for lceil/lceilf.
  * config/riscv/riscv-protos.h (enum insn_type): New enum value.
  (expand_vec_lceil): New func decl for expanding lceil.
  * config/riscv/riscv-v.cc (expand_vec_lceil): New func impl for
  expanding lceil.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/unop/math-lceil-0.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-lceil-1.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-lceil-run-0.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-lceil-run-1.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/math-lceil-0.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/math-lceil-1.c: New test.
Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 11 +++ gcc/config/riscv/riscv-protos.h | 2 + gcc/config/riscv/riscv-v.cc | 10 +++ .../riscv/rvv/autovec/unop/math-lceil-0.c | 19 + .../riscv/rvv/autovec/unop/math-lceil-1.c | 19 + .../riscv/rvv/autovec/unop/math-lceil-run-0.c | 69 +++ .../riscv/rvv/autovec/unop/math-lceil-run-1.c | 69 +++ .../riscv/rvv/autovec/vls/math-lceil-0.c | 30 .../riscv/rvv/autovec/vls/math-lceil-1.c | 30 9 files changed, 259 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lceil-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lceil-1.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index 33b11723c21..267691a0095 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2241,6 +2241,7 @@ (define_expand "avg3_ceil" ;; - roundeven/roundevenf ;; - lrint/lrintf ;; - irintf +;; - lceil/lceilf ;; - (define_expand "ceil2" [(match_operand:V_VLSF 0 "register_operand") @@ -2331,3 +2332,13 @@ (define_expand "lround2" DONE; } ) + +(define_expand "lceil2" + [(match_operand:0 "register_operand") + (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" + { +riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, mode); +DONE; + } +) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index b7eeeb8f55d..ab65ab19524 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -303,6 +303,7 @@ enum insn_type : unsigned int UNARY_OP_TAMU = __MASK_OP_TAMU | UNARY_OP_P, UNARY_OP_FRM_DYN = 
UNARY_OP | FRM_DYN_P, UNARY_OP_FRM_RMM = UNARY_OP | FRM_RMM_P, + UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P, UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P, UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P, UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P, @@ -477,6 +478,7 @@ void expand_vec_trunc (rtx, rtx, machine_mode, machine_mode); void expand_vec_roundeven (rtx, rtx, machine_mode, machine_mode); void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode); void expand_vec_lround (rtx, rtx, machine_mode, machine_mode); +void expand_vec_lceil (rtx, rtx, machine_mode, machine_mode); #endif bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode, bool, void (*)(rtx *, rtx)); diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index b61c745678b..b03213dd8ed 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -4132,4 +4132,14 @@ expand_vec_lround (rtx op_0, rtx op
[PATCH v1] RISC-V: Support FP lfloor/lfloorf auto vectorization
From: Pan Li

This patch would like to support the FP lfloor/lfloorf auto vectorization.

* long lfloor (double) for rv64
* long lfloorf (float) for rv32

Because the vectorizer only allows data types of the same size, the
standard name lfloormn2 only acts on DF => DI for rv64, and SF => SI
for rv32.

Given we have code like:

void
test_lfloor (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lfloor (in[i]);
}

Before this patch:
.L3:
  ...
  fld fa5,0(a1)
  fcvt.l.d a5,fa5,rdn
  sd a5,-8(a0)
  ...
  bne a1,a4,.L3

After this patch:
  frrm a6
  ...
  fsrmi 2 // RDN
.L3:
  ...
  vsetvli a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli zero,a2,e64,m1,ta,ma
  vse32.v v1,0(a0)
  ...
  bne a2,zero,.L3
  ...
  fsrm a6

The remaining cases, such as SF => DI/HF => DI/DF => SI/HF => SI, will be
covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

  * config/riscv/autovec.md (lfloor2): New pattern for lfloor/lfloorf.
  * config/riscv/riscv-protos.h (enum insn_type): New enum value.
  (expand_vec_lfloor): New func decl for expanding lfloor.
  * config/riscv/riscv-v.cc (expand_vec_lfloor): New func impl for
  expanding lfloor.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/unop/math-lfloor-0.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-lfloor-1.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-0.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-1.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/math-lfloor-0.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/math-lfloor-1.c: New test.
Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 11 +++ gcc/config/riscv/riscv-protos.h | 2 + gcc/config/riscv/riscv-v.cc | 10 +++ .../riscv/rvv/autovec/unop/math-lfloor-0.c| 19 + .../riscv/rvv/autovec/unop/math-lfloor-1.c| 19 + .../rvv/autovec/unop/math-lfloor-run-0.c | 69 +++ .../rvv/autovec/unop/math-lfloor-run-1.c | 69 +++ .../riscv/rvv/autovec/vls/math-lfloor-0.c | 30 .../riscv/rvv/autovec/vls/math-lfloor-1.c | 30 9 files changed, 259 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lfloor-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lfloor-1.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index 267691a0095..c5b1e52cbf9 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2242,6 +2242,7 @@ (define_expand "avg3_ceil" ;; - lrint/lrintf ;; - irintf ;; - lceil/lceilf +;; - lfloor/lfloorf ;; - (define_expand "ceil2" [(match_operand:V_VLSF 0 "register_operand") @@ -2342,3 +2343,13 @@ (define_expand "lceil2" DONE; } ) + +(define_expand "lfloor2" + [(match_operand:0 "register_operand") + (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" + { +riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, mode); +DONE; + } +) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index ab65ab19524..49bdcdf2f93 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -304,6 +304,7 @@ enum insn_type : unsigned int UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P, UNARY_OP_FRM_RMM = UNARY_OP | 
FRM_RMM_P, UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P, + UNARY_OP_FRM_RDN = UNARY_OP | FRM_RDN_P, UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P, UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P, UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P, @@ -479,6 +480,7 @@ void expand_vec_roundeven (rtx, rtx, machine_mode, machine_mode); void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode); void expand_vec_lround (rtx, rtx, machine_mode, machine_mode); void expand_vec_lceil (rtx, rtx, machine_mode, machine_mode); +void expand_vec_lfloor (rtx, rtx, machine_mode, machine_mode); #endif bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode, bool, void (*)(rtx *, rtx)); diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index b03213dd8ed..21d86c3f917 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -4142,4 +4142,14 @@ expand_vec_lceil (
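
The "After this patch" listings for lceil and lfloor share one shape: read the current rounding mode (frrm), install a static one (fsrmi 3 for RUP, fsrmi 2 for RDN), run the conversion loop, then put the saved mode back (fsrm). The following is a rough scalar simulation of that sequence, not the generated code. The variable model_frm stands in for the frm CSR, all MODEL_* and model_* names are hypothetical, and inputs are assumed to fit in long.

```c
#include <assert.h>
#include <stddef.h>

/* Rounding-mode encodings as used by fsrmi in the logs above.  */
enum { MODEL_FRM_RNE = 0, MODEL_FRM_RDN = 2, MODEL_FRM_RUP = 3 };

/* Stand-in for the frm CSR; a plain variable in this sketch.  */
static unsigned model_frm = MODEL_FRM_RNE;

/* Scalar model of one float-to-integer conversion under the current
   mode.  Only RUP and RDN are modelled here.  */
static long
model_cvt_x_f (double x)
{
  long t = (long) x;              /* truncate toward zero */
  double frac = x - (double) t;   /* signed fractional part */
  switch (model_frm)
    {
    case MODEL_FRM_RUP:           /* round up, i.e. ceil */
      return frac > 0.0 ? t + 1 : t;
    case MODEL_FRM_RDN:           /* round down, i.e. floor */
      return frac < 0.0 ? t - 1 : t;
    default:
      assert (0 && "rounding mode not modelled in this sketch");
      return t;
    }
}

/* Model of the save / set / convert / restore sequence.  */
static void
model_vec_cvt (long *out, const double *in, size_t n, unsigned frm)
{
  unsigned saved = model_frm;     /* frrm  a6 */
  model_frm = frm;                /* fsrmi <frm> */
  for (size_t i = 0; i < n; i++)
    out[i] = model_cvt_x_f (in[i]);  /* vfcvt.x.f.v */
  model_frm = saved;              /* fsrm  a6 */
}
```

Calling model_vec_cvt on {1.2, -1.2, 3.0, -0.5} with MODEL_FRM_RUP yields {2, -1, 3, 0}, and with MODEL_FRM_RDN yields {1, -2, 3, -1}. In both cases model_frm is back at its original value afterwards, which is the property the trailing fsrm provides for the caller.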
[PATCH v1] RISC-V: Leverage stdint-gcc.h for RVV test cases
From: Pan Li

Leverage stdint-gcc.h for the int64_t type instead of a local typedef.
Otherwise the typedef may conflict with stdint-gcc.h elsewhere.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: Include
  stdint-gcc.h for int types.
  * gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: Ditto.
  * gcc.target/riscv/rvv/autovec/unop/test-math.h: Remove int64_t
  typedef.

Signed-off-by: Pan Li
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c | 1 +
 .../gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c       | 1 +
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h     | 2 --
 3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
index 2d90d232ba1..4bf125f8cc8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
@@ -2,6 +2,7 @@
 /* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
 /* { dg-final { check-function-bodies "**" "" } } */
 
+#include <stdint-gcc.h>
 #include "test-math.h"
 
 /*
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
index 6b69f5568e9..409175a8dff 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
@@ -1,6 +1,7 @@
 /* { dg-do run { target { riscv_v && rv64 } } } */
 /* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
 
+#include <stdint-gcc.h>
 #include "test-math.h"
 
 #define ARRAY_SIZE 128
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h
index 3867bc50a14..a1c9d55bd48 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h
@@ -68,8 +68,6 @@
 #define FRM_RMM 4
 #define FRM_DYN 7
 
-typedef long long int64_t;
-
 static inline void
 set_rm (unsigned rm)
 {
-- 
2.34.1
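
For background on the conflict mentioned in the log: a second typedef of int64_t is only accepted when it names exactly the same type as the first. On LP64 targets the system int64_t is commonly plain long rather than long long, so "typedef long long int64_t;" would clash as soon as a stdint header is also included. The small probe below is an illustration only. It uses the standard <stdint.h> instead of stdint-gcc.h so that it builds on an ordinary host, and int64_underlying_kind is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Classify the fundamental type behind int64_t with C11 _Generic:
   1 for long, 2 for long long, 0 for anything else.  */
#define INT64_KIND(x) \
  _Generic ((x), long: 1, long long: 2, default: 0)

static int
int64_underlying_kind (void)
{
  assert (sizeof (int64_t) == 8);  /* exact-width type, 8 bytes */
  return INT64_KIND ((int64_t) 0);
}
```

If the probe returns 1 on a given target, a local "typedef long long int64_t;" placed next to a stdint include would be rejected there as a conflicting redeclaration. Taking the type from the header avoids having to know which case applies.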
[PATCH v1] RISC-V: Add test for FP iroundf auto vectorization
From: Pan Li

The FP API below is already supported, because it shares the same
standard name and machine mode.

int iroundf (float);

This patch would like to add test cases to ensure correctness.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/unop/math-iround-0.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/math-iround-0.c: New test.

Signed-off-by: Pan Li
---
 .../riscv/rvv/autovec/unop/math-iround-0.c    | 19 ++
 .../rvv/autovec/unop/math-iround-run-0.c      | 63 +++
 .../riscv/rvv/autovec/vls/math-iround-0.c     | 30 +
 3 files changed, 112 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-iround-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-0.c
new file mode 100644
index 000..f32515d1403
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-0.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_int___builtin_iroundf:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+4
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** fsrm\s+[atx][0-9]+ +** ret +*/ +TEST_UNARY_CALL_CVT (float, int, __builtin_iroundf) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c new file mode 100644 index 000..2e05e443afe --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c @@ -0,0 +1,63 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */ + +#include "test-math.h" + +#define ARRAY_SIZE 128 + +float in[ARRAY_SIZE]; +int out[ARRAY_SIZE]; +int ref[ARRAY_SIZE]; + +TEST_UNARY_CALL_CVT (float, int, __builtin_iroundf) +TEST_ASSERT (int) + +TEST_INIT_CVT (float, 1.2, int, __builtin_iroundf (1.2), 1) +TEST_INIT_CVT (float, -1.2, int, __builtin_iroundf (-1.2), 2) +TEST_INIT_CVT (float, 0.5, int, __builtin_iroundf (0.5), 3) +TEST_INIT_CVT (float, -0.5, int, __builtin_iroundf (-0.5), 4) +TEST_INIT_CVT (float, 0.1, int, __builtin_iroundf (0.1), 5) +TEST_INIT_CVT (float, -0.1, int, __builtin_iroundf (-0.1), 6) +TEST_INIT_CVT (float, 3.0, int, __builtin_iroundf (3.0), 7) +TEST_INIT_CVT (float, -3.0, int, __builtin_iroundf (-3.0), 8) +TEST_INIT_CVT (float, 8388607.5, int, __builtin_iroundf (8388607.5), 9) +TEST_INIT_CVT (float, 8388609.0, int, __builtin_iroundf (8388609.0), 10) +TEST_INIT_CVT (float, -8388607.5, int, __builtin_iroundf (-8388607.5), 11) +TEST_INIT_CVT (float, -8388609.0, int, __builtin_iroundf (-8388609.0), 12) +TEST_INIT_CVT (float, 0.0, int, __builtin_iroundf (-0.0), 13) +TEST_INIT_CVT (float, -0.0, int, __builtin_iroundf (-0.0), 14) +TEST_INIT_CVT (float, 2147483520.0, int, __builtin_iroundf (2147483520.0), 15) +TEST_INIT_CVT (float, 2147483648.0, int, 0x7fff, 16) +TEST_INIT_CVT (float, -2147483648.0, int, __builtin_iroundf (-2147483648.0), 17) +TEST_INIT_CVT (float, -2147483904.0, int, 0x8000, 18) +TEST_INIT_CVT (float, __builtin_inf (), int, __builtin_iroundf (__builtin_inff 
()), 19) +TEST_INIT_CVT (float, -__builtin_inf (), int, __builtin_iroundf (-__builtin_inff ()), 20) +TEST_INIT_CVT (float, __builtin_nanf (""), int, 0x7fff, 21) + +int +main () +{ + RUN_TEST_CVT (float, int, 1, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 2, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 3, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 4, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 5, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 6, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 7, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 8, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 9, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 10, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 11, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 12, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 13, __builtin_iroundf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 1
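
Several entries in the run test above use a hard-coded expected value instead of calling the builtin, namely the out-of-range inputs and NaN. A likely reason is that the RISC-V float-to-integer conversions saturate: NaN and anything at or above 2^31 give the largest int, and anything below -2^31 gives the smallest. The sketch below models that behaviour for float => int32 in portable C. It is an assumption-laden illustration rather than anything from the patch, and model_fcvt_sat is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a saturating float => int32 conversion.
   In-range values are truncated here; the hardware would round them
   according to the active rounding mode instead.  */
static int32_t
model_fcvt_sat (float x)
{
  if (x != x)                     /* NaN compares unequal to itself */
    return INT32_MAX;
  if (x >= 2147483648.0f)         /* 2^31 and above, including +inf */
    return INT32_MAX;
  if (x < -2147483648.0f)         /* below -2^31, including -inf */
    return INT32_MIN;
  return (int32_t) x;             /* representable: plain conversion */
}
```

With this model, 2147483520.0f (the largest float below 2^31) converts to itself, 2147483648.0f clamps to INT32_MAX, and -2147483904.0f clamps to INT32_MIN. Plain C casts of the same out-of-range inputs would be undefined behaviour, which is one more reason for a test to spell the expected value out.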
[PATCH v1] RISC-V: Add test for FP llround auto vectorization
From: Pan Li

The FP API below is already supported, because it shares the same
standard name and machine mode.

long long llround (double);

This patch would like to add test cases to ensure correctness.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/unop/math-llround-0.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/math-llround-0.c: New test.

Signed-off-by: Pan Li
---
 .../riscv/rvv/autovec/unop/math-llround-0.c   | 20 ++
 .../rvv/autovec/unop/math-llround-run-0.c     | 64 +++
 .../riscv/rvv/autovec/vls/math-llround-0.c    | 30 +
 3 files changed, 114 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llround-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c
new file mode 100644
index 000..4f8b4553a91
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llround:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+4
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** fsrm\s+[atx][0-9]+ +** ret +*/ +TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llround) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c new file mode 100644 index 000..c5b60847cc7 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c @@ -0,0 +1,64 @@ +/* { dg-do run { target { riscv_v && rv64 } } } */ +/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */ + +#include +#include "test-math.h" + +#define ARRAY_SIZE 128 + +double in[ARRAY_SIZE]; +int64_t out[ARRAY_SIZE]; +int64_t ref[ARRAY_SIZE]; + +TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llround) +TEST_ASSERT (int64_t) + +TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llround (1.2), 1) +TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llround (-1.2), 2) +TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llround (0.5), 3) +TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llround (-0.5), 4) +TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llround (0.1), 5) +TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llround (-0.1), 6) +TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llround (3.0), 7) +TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llround (-3.0), 8) +TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llround (4503599627370495.5), 9) +TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llround (4503599627370497.0), 10) +TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llround (-4503599627370495.5), 11) +TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llround (-4503599627370496.0), 12) +TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llround (-0.0), 13) +TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llround (-0.0), 14) +TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llround (9223372036854774784.0), 15) +TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, 
0x7fff, 16) +TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llround (-9223372036854775808.0), 17) +TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, 0x8000, 18) +TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llround (__builtin_inf ()), 19) +TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_llround (-__builtin_inf ()), 20) +TEST_INIT_CVT (double, __builtin_nan (""), int64_t, 0x7fff, 21) + +int +main () +{ + RUN_TEST_CVT (double, int64_t, 1, __builtin_llround, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 2, __builtin_llround, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 3, __builtin_llround, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 4, __builtin_llround, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 5, __builtin_llround, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 6, __builtin_llround, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 7, __builtin_llround, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 8, __builtin_llround, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 9, __bui
[PATCH v1] RISC-V: Add test for FP llceil auto vectorization
From: Pan Li

The FP API below is already supported, because it shares the same
standard name and machine mode.

long long llceil (double);

This patch would like to add test cases to ensure correctness.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/math-llceil-0.c: New test.

Signed-off-by: Pan Li
---
 .../riscv/rvv/autovec/unop/math-llceil-0.c    | 20 ++
 .../rvv/autovec/unop/math-llceil-run-0.c      | 64 +++
 .../riscv/rvv/autovec/vls/math-llceil-0.c     | 30 +
 3 files changed, 114 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llceil-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c
new file mode 100644
index 000..3480c3ea91d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llceil:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+3
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** fsrm\s+[atx][0-9]+ +** ret +*/ +TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llceil) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c new file mode 100644 index 000..5ccbe64ffb5 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c @@ -0,0 +1,64 @@ +/* { dg-do run { target { riscv_v && rv64 } } } */ +/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */ + +#include +#include "test-math.h" + +#define ARRAY_SIZE 128 + +double in[ARRAY_SIZE]; +int64_t out[ARRAY_SIZE]; +int64_t ref[ARRAY_SIZE]; + +TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llceil) +TEST_ASSERT (int64_t) + +TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llceil (1.2), 1) +TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llceil (-1.2), 2) +TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llceil (0.5), 3) +TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llceil (-0.5), 4) +TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llceil (0.1), 5) +TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llceil (-0.1), 6) +TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llceil (3.0), 7) +TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llceil (-3.0), 8) +TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llceil (4503599627370495.5), 9) +TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llceil (4503599627370497.0), 10) +TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llceil (-4503599627370495.5), 11) +TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llceil (-4503599627370496.0), 12) +TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llceil (-0.0), 13) +TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llceil (-0.0), 14) +TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llceil (9223372036854774784.0), 15) +TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, 0x7fff, 16) 
+TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llceil (-9223372036854775808.0), 17) +TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, 0x8000, 18) +TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llceil (__builtin_inf ()), 19) +TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_llceil (-__builtin_inf ()), 20) +TEST_INIT_CVT (double, __builtin_nan (""), int64_t, 0x7fff, 21) + +int +main () +{ + RUN_TEST_CVT (double, int64_t, 1, __builtin_llceil, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 2, __builtin_llceil, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 3, __builtin_llceil, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 4, __builtin_llceil, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 5, __builtin_llceil, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 6, __builtin_llceil, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 7, __builtin_llceil, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 8, __builtin_llceil, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 9, __builtin_llceil, in, out, ref, ARRAY_SIZE); +
[PATCH v1] RISC-V: Add test for FP iceil auto vectorization
From: Pan Li

The FP API below is already supported, because it shares the same
standard name and machine mode.

int iceil (float);

This patch would like to add test cases to ensure correctness.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c: New test.
  * gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/math-iceil-0.c: New test.

Signed-off-by: Pan Li
---
 .../riscv/rvv/autovec/unop/math-iceil-0.c     | 19 ++
 .../riscv/rvv/autovec/unop/math-iceil-run-0.c | 63 +++
 .../riscv/rvv/autovec/vls/math-iceil-0.c      | 30 +
 3 files changed, 112 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-iceil-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c
new file mode 100644
index 000..2d4a1d163d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_int___builtin_iceilf:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+3
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** fsrm\s+[atx][0-9]+ +** ret +*/ +TEST_UNARY_CALL_CVT (float, int, __builtin_iceilf) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c new file mode 100644 index 000..714173a7f8b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c @@ -0,0 +1,63 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */ + +#include "test-math.h" + +#define ARRAY_SIZE 128 + +float in[ARRAY_SIZE]; +int out[ARRAY_SIZE]; +int ref[ARRAY_SIZE]; + +TEST_UNARY_CALL_CVT (float, int, __builtin_iceilf) +TEST_ASSERT (int) + +TEST_INIT_CVT (float, 1.2, int, __builtin_iceilf (1.2), 1) +TEST_INIT_CVT (float, -1.2, int, __builtin_iceilf (-1.2), 2) +TEST_INIT_CVT (float, 0.5, int, __builtin_iceilf (0.5), 3) +TEST_INIT_CVT (float, -0.5, int, __builtin_iceilf (-0.5), 4) +TEST_INIT_CVT (float, 0.1, int, __builtin_iceilf (0.1), 5) +TEST_INIT_CVT (float, -0.1, int, __builtin_iceilf (-0.1), 6) +TEST_INIT_CVT (float, 3.0, int, __builtin_iceilf (3.0), 7) +TEST_INIT_CVT (float, -3.0, int, __builtin_iceilf (-3.0), 8) +TEST_INIT_CVT (float, 8388607.5, int, __builtin_iceilf (8388607.5), 9) +TEST_INIT_CVT (float, 8388609.0, int, __builtin_iceilf (8388609.0), 10) +TEST_INIT_CVT (float, -8388607.5, int, __builtin_iceilf (-8388607.5), 11) +TEST_INIT_CVT (float, -8388609.0, int, __builtin_iceilf (-8388609.0), 12) +TEST_INIT_CVT (float, 0.0, int, __builtin_iceilf (-0.0), 13) +TEST_INIT_CVT (float, -0.0, int, __builtin_iceilf (-0.0), 14) +TEST_INIT_CVT (float, 2147483520.0, int, __builtin_iceilf (2147483520.0), 15) +TEST_INIT_CVT (float, 2147483648.0, int, 0x7fff, 16) +TEST_INIT_CVT (float, -2147483648.0, int, __builtin_iceilf (-2147483648.0), 17) +TEST_INIT_CVT (float, -2147483904.0, int, 0x8000, 18) +TEST_INIT_CVT (float, __builtin_inf (), int, __builtin_iceilf (__builtin_inff ()), 19) 
+TEST_INIT_CVT (float, -__builtin_inf (), int, __builtin_iceilf (-__builtin_inff ()), 20) +TEST_INIT_CVT (float, __builtin_nanf (""), int, 0x7fff, 21) + +int +main () +{ + RUN_TEST_CVT (float, int, 1, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 2, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 3, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 4, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 5, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 6, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 7, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 8, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 9, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 10, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 11, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 12, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 13, __builtin_iceilf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 14, __builtin_iceilf, in, out, ref, ARRAY_SIZE); +
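Entries 16, 18 and 21 of the run test above use literal reference values rather than calling the builtin, because the input is out of range for int or is a NaN. A minimal sketch of the behaviour those entries appear to encode follows. It assumes that the literals (shown here as 0x7fff and 0x8000) stand for INT_MAX and INT_MIN, and that out-of-range inputs saturate while NaN maps to INT_MAX. The helper name iceil_sat is hypothetical and is not part of test-math.h.

```c
#include <limits.h>

/* Hypothetical helper (not from the testsuite): ceil a float and convert
   it to int, saturating on overflow.  The saturation rules are an
   assumption modelled on the hard-coded test entries.  */
static int
iceil_sat (float x)
{
  if (x != x)                  /* NaN compares unequal to itself.  */
    return INT_MAX;
  if (x >= 2147483648.0f)      /* 2^31 and above, including +inf.  */
    return INT_MAX;
  if (x < -2147483648.0f)      /* Below -2^31, including -inf.  */
    return INT_MIN;

  int t = (int) x;             /* In range: the cast truncates toward zero.  */
  if ((float) t < x)           /* A positive fraction was dropped: round up.  */
    t++;
  return t;
}
```

Under these assumptions iceil_sat (1.2f) gives 2, iceil_sat (-1.2f) gives -1, and both 2147483648.0f and a NaN give INT_MAX. In-range inputs need no literal, since the scalar builtin already provides the reference.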
[PATCH v1] RISC-V: Add test for FP ifloor auto vectorization
From: Pan Li The below FP API are supported already by sharing the same standard name, as well as the machine mode. int ifloor (float); This patch would like to add the test cases for ensuring the correctness. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c: New test. * gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c: New test. * gcc.target/riscv/rvv/autovec/vls/math-ifloor-0.c: New test. Signed-off-by: Pan Li --- .../riscv/rvv/autovec/unop/math-ifloor-0.c| 19 ++ .../rvv/autovec/unop/math-ifloor-run-0.c | 63 +++ .../riscv/rvv/autovec/vls/math-ifloor-0.c | 30 + 3 files changed, 112 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-ifloor-0.c diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c new file mode 100644 index 000..b9ec415d690 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "test-math.h" + +/* +** test_float_int___builtin_ifloorf: +** frrm\s+[atx][0-9]+ +** ... +** fsrmi\s+2 +** ... +** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma +** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+ +** ... 
+** fsrm\s+[atx][0-9]+ +** ret +*/ +TEST_UNARY_CALL_CVT (float, int, __builtin_ifloorf) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c new file mode 100644 index 000..8ef4da0ea88 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c @@ -0,0 +1,63 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */ + +#include "test-math.h" + +#define ARRAY_SIZE 128 + +float in[ARRAY_SIZE]; +int out[ARRAY_SIZE]; +int ref[ARRAY_SIZE]; + +TEST_UNARY_CALL_CVT (float, int, __builtin_ifloorf) +TEST_ASSERT (int) + +TEST_INIT_CVT (float, 1.2, int, __builtin_ifloorf (1.2), 1) +TEST_INIT_CVT (float, -1.2, int, __builtin_ifloorf (-1.2), 2) +TEST_INIT_CVT (float, 0.5, int, __builtin_ifloorf (0.5), 3) +TEST_INIT_CVT (float, -0.5, int, __builtin_ifloorf (-0.5), 4) +TEST_INIT_CVT (float, 0.1, int, __builtin_ifloorf (0.1), 5) +TEST_INIT_CVT (float, -0.1, int, __builtin_ifloorf (-0.1), 6) +TEST_INIT_CVT (float, 3.0, int, __builtin_ifloorf (3.0), 7) +TEST_INIT_CVT (float, -3.0, int, __builtin_ifloorf (-3.0), 8) +TEST_INIT_CVT (float, 8388607.5, int, __builtin_ifloorf (8388607.5), 9) +TEST_INIT_CVT (float, 8388609.0, int, __builtin_ifloorf (8388609.0), 10) +TEST_INIT_CVT (float, -8388607.5, int, __builtin_ifloorf (-8388607.5), 11) +TEST_INIT_CVT (float, -8388609.0, int, __builtin_ifloorf (-8388609.0), 12) +TEST_INIT_CVT (float, 0.0, int, __builtin_ifloorf (-0.0), 13) +TEST_INIT_CVT (float, -0.0, int, __builtin_ifloorf (-0.0), 14) +TEST_INIT_CVT (float, 2147483520.0, int, __builtin_ifloorf (2147483520.0), 15) +TEST_INIT_CVT (float, 2147483648.0, int, 0x7fff, 16) +TEST_INIT_CVT (float, -2147483648.0, int, __builtin_ifloorf (-2147483648.0), 17) +TEST_INIT_CVT (float, -2147483904.0, int, 0x8000, 18) +TEST_INIT_CVT (float, __builtin_inf (), int, __builtin_ifloorf (__builtin_inff 
()), 19) +TEST_INIT_CVT (float, -__builtin_inf (), int, __builtin_ifloorf (-__builtin_inff ()), 20) +TEST_INIT_CVT (float, __builtin_nanf (""), int, 0x7fff, 21) + +int +main () +{ + RUN_TEST_CVT (float, int, 1, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 2, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 3, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 4, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 5, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 6, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 7, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 8, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 9, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 10, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 11, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 12, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 13, __builtin_ifloorf, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (float, int, 14
[PATCH v1] RISC-V: Add test for FP llfloor auto vectorization
From: Pan Li The below FP API are supported already by sharing the same standard name, as well as the machine mode. long long llfloor (double); This patch would like to add the test cases for ensuring the correctness. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c: New test. * gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c: New test. * gcc.target/riscv/rvv/autovec/vls/math-llfloor-0.c: New test. Signed-off-by: Pan Li --- .../riscv/rvv/autovec/unop/math-llfloor-0.c | 20 ++ .../rvv/autovec/unop/math-llfloor-run-0.c | 64 +++ .../riscv/rvv/autovec/vls/math-llfloor-0.c| 30 + 3 files changed, 114 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llfloor-0.c diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c new file mode 100644 index 000..4b10f966015 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include +#include "test-math.h" + +/* +** test_double_int64_t___builtin_llfloor: +** frrm\s+[atx][0-9]+ +** ... +** fsrmi\s+2 +** ... +** vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma +** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+ +** ... 
+** fsrm\s+[atx][0-9]+ +** ret +*/ +TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llfloor) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c new file mode 100644 index 000..22829132e96 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c @@ -0,0 +1,64 @@ +/* { dg-do run { target { riscv_v && rv64 } } } */ +/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */ + +#include +#include "test-math.h" + +#define ARRAY_SIZE 128 + +double in[ARRAY_SIZE]; +int64_t out[ARRAY_SIZE]; +int64_t ref[ARRAY_SIZE]; + +TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llfloor) +TEST_ASSERT (int64_t) + +TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llfloor (1.2), 1) +TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llfloor (-1.2), 2) +TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llfloor (0.5), 3) +TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llfloor (-0.5), 4) +TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llfloor (0.1), 5) +TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llfloor (-0.1), 6) +TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llfloor (3.0), 7) +TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llfloor (-3.0), 8) +TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llfloor (4503599627370495.5), 9) +TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llfloor (4503599627370497.0), 10) +TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llfloor (-4503599627370495.5), 11) +TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llfloor (-4503599627370496.0), 12) +TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llfloor (-0.0), 13) +TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llfloor (-0.0), 14) +TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llfloor (9223372036854774784.0), 15) +TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, 
0x7fff, 16) +TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llfloor (-9223372036854775808.0), 17) +TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, 0x8000, 18) +TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llfloor (__builtin_inf ()), 19) +TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_llfloor (-__builtin_inf ()), 20) +TEST_INIT_CVT (double, __builtin_nan (""), int64_t, 0x7fff, 21) + +int +main () +{ + RUN_TEST_CVT (double, int64_t, 1, __builtin_llfloor, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 2, __builtin_llfloor, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 3, __builtin_llfloor, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 4, __builtin_llfloor, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 5, __builtin_llfloor, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 6, __builtin_llfloor, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 7, __builtin_llfloor, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 8, __builtin_llfloor, in, out, ref, ARRAY_SIZE); + RUN_TEST_CVT (double, int64_t, 9, __bui
[PATCH v1] RISC-V: Refine run test cases of math autovec
From: Pan Li

The run test cases of math autovec need a reference value to check
whether the returned value is as expected.  The previous patch
hard-coded the reference value, for example for ceil after autovec:

  ASSERT (CEIL (Vector {1.2, ...}) == Vector {2.0, ...});

We can use the scalar math function as the reference instead, to avoid
potential mistakes:

  ASSERT (CEIL (Vector {1.2, ...}) == Vector {ceil (1.2), ...});

This patch also removes some fflags checks, as they are already covered
by check-body.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c: Use scalar func as reference instead of hardcode.
        * gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c: Ditto.
        * gcc.target/riscv/rvv/autovec/unop/math-floor-run-1.c: Ditto.
        * gcc.target/riscv/rvv/autovec/unop/math-floor-run-2.c: Ditto.
        * gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-1.c: Ditto.
        * gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c: Ditto.
        * gcc.target/riscv/rvv/autovec/unop/math-rint-run-1.c: Ditto.
        * gcc.target/riscv/rvv/autovec/unop/math-rint-run-2.c: Ditto.
        * gcc.target/riscv/rvv/autovec/unop/math-round-run-1.c: Ditto.
        * gcc.target/riscv/rvv/autovec/unop/math-round-run-2.c: Ditto.
        * gcc.target/riscv/rvv/autovec/unop/math-trunc-run-1.c: Ditto.
        * gcc.target/riscv/rvv/autovec/unop/math-trunc-run-2.c: Ditto.
Signed-off-by: Pan Li --- .../riscv/rvv/autovec/unop/math-ceil-run-1.c | 18 +- .../riscv/rvv/autovec/unop/math-ceil-run-2.c | 18 +- .../riscv/rvv/autovec/unop/math-floor-run-1.c | 18 +- .../riscv/rvv/autovec/unop/math-floor-run-2.c | 18 +- .../rvv/autovec/unop/math-nearbyint-run-1.c | 33 ++- .../rvv/autovec/unop/math-nearbyint-run-2.c | 33 ++- .../riscv/rvv/autovec/unop/math-rint-run-1.c | 33 ++- .../riscv/rvv/autovec/unop/math-rint-run-2.c | 33 ++- .../riscv/rvv/autovec/unop/math-round-run-1.c | 18 +- .../riscv/rvv/autovec/unop/math-round-run-2.c | 18 +- .../riscv/rvv/autovec/unop/math-trunc-run-1.c | 18 +- .../riscv/rvv/autovec/unop/math-trunc-run-2.c | 18 +- 12 files changed, 140 insertions(+), 136 deletions(-) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c index 88611e8268e..419a3def4df 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c @@ -12,15 +12,15 @@ float ref[ARRAY_SIZE]; TEST_UNARY_CALL (float, __builtin_ceilf) TEST_ASSERT (float) -TEST_INIT (float, 1.2, 2.0, 1) -TEST_INIT (float, -1.2, -1.0, 2) -TEST_INIT (float, 3.0, 3.0, 3) -TEST_INIT (float, 8388607.5, 8388608.0, 4) -TEST_INIT (float, 8388609.0, 8388609.0, 5) -TEST_INIT (float, 0.0, 0.0, 6) -TEST_INIT (float, -0.0, -0.0, 7) -TEST_INIT (float, -8388607.5, -8388607.0, 8) -TEST_INIT (float, -8388608.0, -8388608.0, 9) +TEST_INIT (float, 1.2, __builtin_ceilf (1.2), 1) +TEST_INIT (float, -1.2, __builtin_ceilf (-1.2), 2) +TEST_INIT (float, 3.0, __builtin_ceilf (3.0), 3) +TEST_INIT (float, 8388607.5, __builtin_ceilf (8388607.5), 4) +TEST_INIT (float, 8388609.0, __builtin_ceilf (8388609.0), 5) +TEST_INIT (float, 0.0, __builtin_ceilf (0.0), 6) +TEST_INIT (float, -0.0,__builtin_ceilf (-0.0), 7) +TEST_INIT (float, -8388607.5, __builtin_ceilf (-8388607.5), 8) +TEST_INIT (float, -8388608.0, __builtin_ceilf 
(-8388608.0), 9) int main () diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c index bb4c86c3d12..2b29c8e4414 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c @@ -12,15 +12,15 @@ double ref[ARRAY_SIZE]; TEST_UNARY_CALL (double, __builtin_ceil) TEST_ASSERT (double) -TEST_INIT (double, 1.2, 2.0, 1) -TEST_INIT (double, -1.2, -1.0, 2) -TEST_INIT (double, 3.0, 3.0, 3) -TEST_INIT (double, 4503599627370495.5, 4503599627370496.0, 4) -TEST_INIT (double, 4503599627370497.0, 4503599627370497.0, 5) -TEST_INIT (double, 0.0, 0.0, 6) -TEST_INIT (double, -0.0, -0.0, 7) -TEST_INIT (double, -4503599627370495.5, -4503599627370495.0, 8) -TEST_INIT (double, -4503599627370496.0, -4503599627370496.0, 9) +TEST_INIT (double, 1.2, __builtin_ceil (1.2), 1) +TEST_INIT (double, -1.2, __builtin_ceil (-1.2), 2) +TEST_INIT (double, 3.0, __builtin_ceil (3.0), 3) +TEST_INIT (double, 4503599627370495.5, __builtin_ceil (4503599627370495.5), 4) +TEST_INIT (double, 4503599627370497.0, __builtin_ceil (4503599627370497.0), 5) +TEST_INIT (double, 0.0, __builtin_ceil (0.0), 6) +TEST_INIT (double, -0.0, __builtin_ceil (-0.0), 7) +TEST_INIT (double, -4503599627370495.5, __builtin_ceil (-450
[PATCH v1] RISC-V: Remove the type size restriction of vectorizer
From: Pan Li

vectorizable_call has a restriction on the size of the data types: the
input and output vector types must have the same size.  For example,
DF to DI is allowed but SF to DI is not.  You may see the messages below
when trying to vectorize a function call like lrintf.

void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf (in[i]);
}

lrintf.c:5:26: missed: couldn't vectorize loop
lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern like lrintmn2 cannot work for
different data type sizes like SF => DI.  This patch would like to
remove this data type size check and unblock standard names like
lrintmn2.

Passed the x86 bootstrap and regression test already.

gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_call): Remove data size check.

Signed-off-by: Pan Li
---
 gcc/tree-vect-stmts.cc | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index b3a56498595..326e000a71d 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -3529,19 +3529,6 @@ vectorizable_call (vec_info *vinfo,
       return false;
     }
 
-  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
-     just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
-     are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
-     by a pack of the two vectors into an SI vector.  We would need
-     separate code to handle direct VnDI->VnSI IFN_CTZs.  */
-  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
-    {
-      if (dump_enabled_p ())
-        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                         "mismatched vector sizes %T and %T\n",
-                         vectype_in, vectype_out);
-      return false;
-    }
 
   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
       != VECTOR_BOOLEAN_TYPE_P (vectype_in))
-- 
2.34.1
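The effect of the removed check can be illustrated with a small model. This is only a sketch under stated assumptions: struct vec_type, vec_type_bits and rejected_by_size_check are hypothetical names, a vector type is reduced to an element width and a lane count, and the lane count of 4 used in the note below is arbitrary. It is not the vectorizer's actual data structure.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a vector type: element width in bits and
   number of lanes.  */
struct vec_type
{
  unsigned elem_bits;
  unsigned nunits;
};

/* Total size of the modelled vector type in bits.  */
static unsigned
vec_type_bits (struct vec_type t)
{
  return t.elem_bits * t.nunits;
}

/* Mirrors the shape of the removed test: the call was rejected whenever
   the input and output vector types differed in total size.  */
static bool
rejected_by_size_check (struct vec_type in, struct vec_type out)
{
  return vec_type_bits (in) != vec_type_bits (out);
}
```

With four lanes on both sides, DF => DI compares 256 bits against 256 bits and passes, while SF => DI compares 128 bits against 256 bits and was rejected. That is the lrintf case from the commit message.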
[PATCH v1] RISC-V: Bugfix for merging undefined tmp register in math
From: Pan Li

For math function autovec, there is one step like the following:

  rtx tmp = gen_reg_rtx (vec_int_mode);
  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);

MU leaves the inactive (masked-off) elements of tmp (aka the dest
register) unchanged, but tmp is undefined here.  This patch would like
to adjust the MU to MA.

gcc/ChangeLog:

        * config/riscv/riscv-protos.h (enum insn_type): Add new type values.
        * config/riscv/riscv-v.cc (emit_vec_cvt_x_f): Add undef merge operand handling.
        (expand_vec_ceil): Take MA instead of MU for tmp register.
        (expand_vec_floor): Ditto.
        (expand_vec_nearbyint): Ditto.
        (expand_vec_rint): Ditto.
        (expand_vec_round): Ditto.
        (expand_vec_roundeven): Ditto.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-protos.h |  5 +++++
 gcc/config/riscv/riscv-v.cc     | 24 ++++++++++++++++--------
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index f7a9a02f1f9..5dc97c2adc0 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -306,6 +306,11 @@ enum insn_type : unsigned int
   UNARY_OP_FRM_RMM = UNARY_OP | FRM_RMM_P,
   UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P,
   UNARY_OP_FRM_RDN = UNARY_OP | FRM_RDN_P,
+  UNARY_OP_TAMA_FRM_DYN = UNARY_OP_TAMA | FRM_DYN_P,
+  UNARY_OP_TAMA_FRM_RUP = UNARY_OP_TAMA | FRM_RUP_P,
+  UNARY_OP_TAMA_FRM_RDN = UNARY_OP_TAMA | FRM_RDN_P,
+  UNARY_OP_TAMA_FRM_RMM = UNARY_OP_TAMA | FRM_RMM_P,
+  UNARY_OP_TAMA_FRM_RNE = UNARY_OP_TAMA | FRM_RNE_P,
   UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P,
   UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P,
   UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P,
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 383af55fe3a..91ad6a61fa8 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4108,10 +4108,18 @@ static void
 emit_vec_cvt_x_f (rtx op_dest, rtx op_src, rtx mask,
                   insn_type type, machine_mode vec_mode)
 {
-  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
   insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, vec_mode);
 
-  emit_vlmax_insn (icode, type, cvt_x_ops);
+  if (type & USE_VUNDEF_MERGE_P)
+    {
+      rtx cvt_x_ops[] = {op_dest, mask, op_src};
+      emit_vlmax_insn (icode, type, cvt_x_ops);
+    }
+  else
+    {
+      rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
+      emit_vlmax_insn (icode, type, cvt_x_ops);
+    }
 }
 
 static void
@@ -4157,7 +4165,7 @@ expand_vec_ceil (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
 
   /* Step-3: Convert to integer on mask, with rounding up (aka ceil).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_RUP, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_RUP, vec_fp_mode);
 
   /* Step-4: Convert to floating-point on mask for the final result.
      To avoid unnecessary frm register access, we use RUP here and it will
@@ -4182,7 +4190,7 @@ expand_vec_floor (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
 
   /* Step-3: Convert to integer on mask, with rounding down (aka floor).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_RDN, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_RDN, vec_fp_mode);
 
   /* Step-4: Convert to floating-point on mask for the floor result.  */
   emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_RDN, vec_fp_mode);
@@ -4208,7 +4216,7 @@ expand_vec_nearbyint (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
 
   /* Step-4: Convert to integer on mask, with rounding down (aka nearbyint).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_DYN, vec_fp_mode);
 
   /* Step-5: Convert to floating-point on mask for the nearbyint result.  */
   emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
@@ -4233,7 +4241,7 @@ expand_vec_rint (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
 
   /* Step-3: Convert to integer on mask, with dyn rounding (aka rint).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_DYN, vec_fp_mode);
 
   /* Step-4: Convert to floating-point on mask for the rint result.  */
   emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
@@ -4255,7 +4263,7 @@ expand_vec_round (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
 
   /* Step-3: Convert to integer on mask, rounding to nearest (aka round).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_RMM, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_RMM, vec_fp_mode);
 
   /* Step-4: Convert to fl
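The dispatch added to emit_vec_cvt_x_f can be pictured as a plain bit-flag test. The sketch below is illustrative only: the bit positions and the names HAS_MERGE_P, MU_POLICY_P, MA_POLICY_P, OP_TAMU_FRM_RUP, OP_TAMA_FRM_RUP and build_cvt_operands are hypothetical, and it assumes (as the new branch implies) that the TAMA types carry USE_VUNDEF_MERGE_P while the TAMU types do not. The real enum insn_type in riscv-protos.h defines its own layout.

```c
/* Illustrative flag bits; not the values used by GCC.  */
enum insn_flag
{
  HAS_MERGE_P = 1u << 0,        /* Caller supplies a merge operand.  */
  USE_VUNDEF_MERGE_P = 1u << 1, /* Merge operand is left undefined.  */
  MU_POLICY_P = 1u << 2,        /* Mask undisturbed.  */
  MA_POLICY_P = 1u << 3,        /* Mask agnostic.  */
  FRM_RUP_P = 1u << 4           /* Round up.  */
};

/* Two composed types, in the spirit of UNARY_OP_TAMU_FRM_RUP and
   UNARY_OP_TAMA_FRM_RUP.  */
enum
{
  OP_TAMU_FRM_RUP = HAS_MERGE_P | MU_POLICY_P | FRM_RUP_P,
  OP_TAMA_FRM_RUP = USE_VUNDEF_MERGE_P | MA_POLICY_P | FRM_RUP_P
};

/* Fill OPS with the operand names for the conversion and return how many
   were written.  The explicit merge operand (op_dest again) is passed
   only when the type does not ask for an undefined merge.  */
static int
build_cvt_operands (unsigned type, const char *ops[4])
{
  int n = 0;

  ops[n++] = "op_dest";
  ops[n++] = "mask";
  if (!(type & USE_VUNDEF_MERGE_P))
    ops[n++] = "op_dest";
  ops[n++] = "op_src";
  return n;
}
```

In this model OP_TAMA_FRM_RUP yields three operands and OP_TAMU_FRM_RUP yields four, matching the two cvt_x_ops arrays in the hunk.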
[PATCH v1] RISC-V: Remove unnecessary asm check for rounding autovec
From: Pan Li The vsetvl asm check is unnecessary for the rounding function autovec. These rounding test cases should focus on the rounding insn sequence. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/bswap16-0.c: Remove the vsetvl check. * gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-floor-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-floor-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-floor-2.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-floor-3.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-irint-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-iround-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-lceil-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-lceil-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-lfloor-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-lfloor-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-llround-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-lround-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-lround-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-nearbyint-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-nearbyint-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-nearbyint-2.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-nearbyint-3.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-rint-0.c: Ditto. 
* gcc.target/riscv/rvv/autovec/unop/math-rint-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-rint-2.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-rint-3.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-round-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-round-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-round-2.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-round-3.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-roundeven-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-roundeven-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-roundeven-2.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-roundeven-3.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-trunc-0.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-trunc-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-trunc-2.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/math-trunc-3.c: Ditto. Signed-off-by: Pan Li --- gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c| 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c| 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c| 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c| 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-1.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-2.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-3.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-1.c | 1 - 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-1.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c | 1 - gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrou
[PATCH v1] RISC-V: Remove unnecessary asm check for binop constraint
From: Pan Li The vsetvl asm check is unnecessary for the binop constraint tests. These tests should focus on the constraints and leave the vsetvl checks to the tests of the vsetvl pass. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/binop_vv_constraint-1.c: Remove the vsetvl asm check from func body. * gcc.target/riscv/rvv/base/binop_vx_constraint-1.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-10.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-11.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-12.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-129.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-13.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-130.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-131.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-133.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-134.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-135.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-14.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-15.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-153.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-154.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-155.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-158.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-16.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-17.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-171.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-172.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-173.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-174.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-18.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-19.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-2.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-20.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-21.c: Ditto. 
* gcc.target/riscv/rvv/base/binop_vx_constraint-22.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-23.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-24.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-25.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-26.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-27.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-28.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-29.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-3.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-30.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-31.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-32.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-33.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-34.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-35.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-36.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-37.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-38.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-39.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-4.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-40.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-41.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-42.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-43.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-44.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-5.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-6.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-7.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-8.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-9.c: Ditto. * gcc.target/riscv/rvv/base/shift_vx_constraint-1.c: Ditto. * gcc.target/riscv/rvv/base/ternop_vv_constraint-1.c: Ditto. 
* gcc.target/riscv/rvv/base/ternop_vv_constraint-2.c: Ditto. * gcc.target/riscv/rvv/base/ternop_vv_constraint-3.c: Ditto. * gcc.target/riscv/rvv/base/ternop_vv_constraint-4.c: Ditto. * gcc.target/riscv/rvv/base/ternop_vv_constraint-5.c: Ditto. * gcc.target/riscv/rvv/base/ternop_vv_constraint-6.c: Ditto. * gcc.target/riscv/rvv/base/ternop_vx_constraint-1.c: Ditto. * gcc.target/riscv/rvv/base/ternop_vx_constraint-8.c: Ditto. * gcc.target/riscv/rvv/base/ternop_vx_constraint-9.c: Ditto.
[PATCH v1] RISC-V: Bugfix for merging undef tmp register for trunc
From: Pan Li

For the trunc function autovec, one step takes MU for the merge operand,
as shown below.

  rtx tmp = gen_reg_rtx (vec_int_mode);
  emit_vec_cvt_x_f_rtz (tmp, op_1, mask, vec_fp_mode);

With MU, the masked-off elements of tmp (aka the dest register) keep
their previous values, but tmp is a fresh register and those values are
undefined here. This patch adjusts the MU to MA.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (emit_vec_cvt_x_f_rtz): Add insn type arg.
	(expand_vec_trunc): Take MA instead of MU for cvt_x_f_rtz.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-v.cc | 16
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 91ad6a61fa8..fb6a4e561db 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4144,12 +4144,20 @@ emit_vec_cvt_f_x (rtx op_dest, rtx op_src, rtx mask,
 
 static void
 emit_vec_cvt_x_f_rtz (rtx op_dest, rtx op_src, rtx mask,
-                      machine_mode vec_mode)
+                      insn_type type, machine_mode vec_mode)
 {
-  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
   insn_code icode = code_for_pred (FIX, vec_mode);
-  emit_vlmax_insn (icode, UNARY_OP_TAMU, cvt_x_ops);
 
+  if (type & USE_VUNDEF_MERGE_P)
+    {
+      rtx cvt_x_ops[] = {op_dest, mask, op_src};
+      emit_vlmax_insn (icode, type, cvt_x_ops);
+    }
+  else
+    {
+      rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
+      emit_vlmax_insn (icode, type, cvt_x_ops);
+    }
 }
 
 void
@@ -4285,7 +4293,7 @@ expand_vec_trunc (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
   /* Step-3: Convert to integer on mask, rounding to zero (aka truncate).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f_rtz (tmp, op_1, mask, vec_fp_mode);
+  emit_vec_cvt_x_f_rtz (tmp, op_1, mask, UNARY_OP_TAMA, vec_fp_mode);
 
   /* Step-4: Convert to floating-point on mask for the rint result.  */
   emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
-- 
2.34.1
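The difference between the two mask policies in the trunc bugfix can be sketched with a small scalar model. This is only an illustration, not GCC code: `MaskPolicy` and `masked_cvt_rtz` are hypothetical names, and the model assumes that mask-agnostic hardware overwrites inactive lanes with all 1s (it is also allowed to leave them unchanged).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the two RVV mask policies; not part of GCC.
enum class MaskPolicy { Undisturbed, Agnostic };

// Convert src[i] to an integer, rounding toward zero, for every active lane.
// Inactive lanes keep the old dest value under MU.  Under MA they may be
// kept or overwritten with all 1s; this model always writes -1.
std::vector<int64_t>
masked_cvt_rtz (const std::vector<double> &src, const std::vector<bool> &mask,
                std::vector<int64_t> dest, MaskPolicy policy)
{
  assert (src.size () == mask.size () && src.size () == dest.size ());
  for (std::size_t i = 0; i < src.size (); i++)
    {
      if (mask[i])
        dest[i] = static_cast<int64_t> (std::trunc (src[i]));
      else if (policy == MaskPolicy::Agnostic)
        dest[i] = -1;
      // MaskPolicy::Undisturbed: the old dest[i] is read back unchanged.
    }
  return dest;
}
```

Under MU the result depends on whatever the destination held before, which is why a freshly created destination register is a problem; under MA the old contents are never a required input.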
[PATCH v1] RISC-V: Remove unnecessary asm check for vec cvt
From: Pan Li

The vsetvl asm check is unnecessary for the vector convert tests. We
should focus on the conversion instructions and leave the vsetvl test to
the vsetvl pass.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/cvt-0.c: Remove the vsetvl
	asm check from func body.
	* gcc.target/riscv/rvv/autovec/unop/cvt-1.c: Ditto.

Signed-off-by: Pan Li
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c | 3 +--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
index 762b1408994..7d66ed3e943 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
@@ -7,9 +7,8 @@
 /*
 ** test_int65_to_fp16:
 **   ...
-**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*mf2,\s*ta,\s*ma
 **   vfncvt\.f\.x\.w\s+v[0-9]+,\s*v[0-9]+
-**   vsetvli\s+zero,\s*zero,\s*e16,\s*mf4,\s*ta,\s*ma
+**   ...
 **   vfncvt\.f\.f\.w\s+v[0-9]+,\s*v[0-9]+
 **   ...
 */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
index 3180ba3612c..af08c51ef8b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
@@ -7,9 +7,8 @@
 /*
 ** test_uint65_to_fp16:
 **   ...
-**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*mf2,\s*ta,\s*ma
 **   vfncvt\.f\.xu\.w\s+v[0-9]+,\s*v[0-9]+
-**   vsetvli\s+zero,\s*zero,\s*e16,\s*mf4,\s*ta,\s*ma
+**   ...
 **   vfncvt\.f\.f\.w\s+v[0-9]+,\s*v[0-9]+
 **   ...
 */
-- 
2.34.1
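To see what the remaining function-body check in the vec cvt tests accepts, the pattern can be tried against sample assembly lines. This is a rough sketch only: `std::regex` in its default ECMAScript mode stands in for the Tcl regexps the testsuite really uses, the helper name `matches_vfncvt_f_x_w` is made up for illustration, and it searches within a line rather than anchoring the whole line as the real check does.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when the assembly line contains the signed INT => FP
// narrowing convert that the test still checks for.
bool
matches_vfncvt_f_x_w (const std::string &asm_line)
{
  static const std::regex pattern (R"(vfncvt\.f\.x\.w\s+v[0-9]+,\s*v[0-9]+)");
  return std::regex_search (asm_line, pattern);
}
```

A vsetvli line no longer has to appear at a fixed position between the two converts, so a change in the vsetvl pass should not break this test.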
[PATCH v2] VECT: Remove the type size restriction of vectorizer
From: Pan Li

Update in v2:

* Fix one ICE of type assertion.
* Adjust some test cases for aarch64 sve and riscv vector.

Original log:

vectorizable_call has one restriction on the size of the data types: the
input and output vector types must have the same size. Aka DF to DI is
allowed but SF to DI isn't. You may see the message below when trying to
vectorize a function call like lrintf.

  void
  test_lrintf (long *out, float *in, unsigned count)
  {
    for (unsigned i = 0; i < count; i++)
      out[i] = __builtin_lrintf (in[i]);
  }

  lrintf.c:5:26: missed: couldn't vectorize loop
  lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern like lrintmn2 cannot work for
different data type sizes like SF => DI. This patch removes this data
type size check and unblocks standard names like lrintmn2.

The following tests pass with this patch.

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression tests.

gcc/ChangeLog:

	* internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
	* tree-vect-stmts.cc (vectorizable_call): Remove size check.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
	* gcc.target/aarch64/sve/clz_1.c: Ditto.
	* gcc.target/aarch64/sve/popcount_1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
Signed-off-by: Pan Li --- gcc/internal-fn.cc | 3 ++- gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c | 3 +-- gcc/testsuite/gcc.target/aarch64/sve/clz_1.c| 3 +-- gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c | 3 +-- .../gcc.target/riscv/rvv/autovec/unop/popcount.c| 2 +- gcc/tree-vect-stmts.cc | 13 - 6 files changed, 6 insertions(+), 21 deletions(-) diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 61d5a9e4772..17c0f4c3805 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, unsigned int noutputs, emit_move_insn (lhs_rtx, ops[0].value); else { - gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))); + gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + || VECTOR_INTEGER_TYPE_P (TREE_TYPE (lhs))); convert_move (lhs_rtx, ops[0].value, 0); } } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c index bdc9856faaf..940d08bbc7b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c @@ -18,5 +18,4 @@ clrsb_64 (unsigned int *restrict dst, uint64_t *restrict src, int size) } /* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */ -/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 2 } } */ -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c index 0c7a4e6d768..58b8ff406d2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c @@ -18,5 +18,4 @@ clz_64 (unsigned int *restrict dst, uint64_t *restrict src, int size) } /* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */ -/* { dg-final { 
scan-assembler-times {\tclz\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 2 } } */ -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c b/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c index dfb6f4ac7a5..0eba898307c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c @@ -18,5 +18,4 @@ popcount_64 (unsigned int *restrict dst, uint64_t *restrict src, int size) } /* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */ -/* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 2 } } */ -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c index 585a522aa81..e6e3c70f927 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c @@ -1461,4 +1461,4 @@ main () RUN_ALL () } -/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 229 "vect" } } */ +/* { dg-final { scan-tr
[PATCH v1] RISC-V: Fix one range-loop-construct warning of avlprop
From: Pan Li

This patch would like to fix one warning of avlprop as below.

../../gcc/config/riscv/riscv-avlprop.cc: In member function 'virtual unsigned int pass_avlprop::execute(function*)':
../../gcc/config/riscv/riscv-avlprop.cc:346:23: error: loop variable 'candidate' creates a copy from type 'const std::pair' [-Werror=range-loop-construct]
  346 |   for (const auto candidate : m_candidates)
      |                   ^
../../gcc/config/riscv/riscv-avlprop.cc:346:23: note: use reference type to prevent copying
  346 |   for (const auto candidate : m_candidates)
      |                   ^
      |                   &

gcc/ChangeLog:

	* config/riscv/riscv-avlprop.cc (pass_avlprop::execute): Use
	reference type to prevent copying.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-avlprop.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-avlprop.cc b/gcc/config/riscv/riscv-avlprop.cc
index 2c79ec81806..c59eb7f6fa3 100644
--- a/gcc/config/riscv/riscv-avlprop.cc
+++ b/gcc/config/riscv/riscv-avlprop.cc
@@ -343,7 +343,7 @@ pass_avlprop::execute (function *fn)
     {
       fprintf (dump_file, "\nNumber of potential AVL propagations: %d\n",
                m_candidates.length ());
-      for (const auto candidate : m_candidates)
+      for (const auto &candidate : m_candidates)
         {
           fprintf (dump_file, "\nAVL propagation type: %s\n",
                    avlprop_type_to_str (candidate.first));
-- 
2.34.1
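The cost that -Wrange-loop-construct points at in the avlprop loop can be made visible with a copy-counting type. The sketch below is illustrative only: `Tracked`, `sum_by_value` and `sum_by_reference` are hypothetical names unrelated to riscv-avlprop.cc. The by-value loop is written in its desugared iterator form, which performs the same per-element copy as `for (const auto item : items)` but keeps the example itself free of the warning.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical payload that counts how often it is copy-constructed.
struct Tracked
{
  static int copies;
  int value;
  explicit Tracked (int v) : value (v) {}
  Tracked (const Tracked &other) : value (other.value) { ++copies; }
};
int Tracked::copies = 0;

using Items = std::vector<std::pair<int, Tracked>>;

// Equivalent to `for (const auto item : items)`: one pair copy per element.
int
sum_by_value (const Items &items)
{
  int sum = 0;
  for (auto it = items.begin (); it != items.end (); ++it)
    {
      const auto item = *it;
      sum += item.second.value;
    }
  return sum;
}

// The fixed form: binds a reference, so no element is copied.
int
sum_by_reference (const Items &items)
{
  int sum = 0;
  for (const auto &item : items)
    sum += item.second.value;
  return sum;
}
```

Both functions compute the same sum; only the by-value variant bumps `Tracked::copies`, once per element.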
[PATCH v3] VECT: Refine the type size restriction of call vectorizer
From: Pan Li

Update in v3:

* Add func to predicate type size is legal or not for vectorizer call.

Update in v2:

* Fix one ICE of type assertion.
* Adjust some test cases for aarch64 sve and riscv vector.

Original log:

vectorizable_call has one restriction on the size of the data types: the
input and output vector types must have the same size. Aka DF to DI is
allowed but SF to DI isn't. You may see the message below when trying to
vectorize a function call like lrintf.

  void
  test_lrintf (long *out, float *in, unsigned count)
  {
    for (unsigned i = 0; i < count; i++)
      out[i] = __builtin_lrintf (in[i]);
  }

  lrintf.c:5:26: missed: couldn't vectorize loop
  lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern like lrintmn2 cannot work for
different data type sizes like SF => DI. This patch refines this data
type size check and conditionally unblocks standard names like lrintmn2.

The type size of vectype_out must be exactly the same as the type size
of vectype_in when vectype_out does not participate in the optab
selection. There is no such restriction when vectype_out is part of the
optab query.

The following tests pass with this patch.

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression tests.
* Ensure the lrintf standard name works on risc-v.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_type_size_legal_p): New func
	impl to predicate the type size is legal or not.
	(vectorizable_call): Leverage vectorizable_type_size_legal_p.

Signed-off-by: Pan Li
---
 gcc/tree-vect-stmts.cc | 51 +++---
 1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a9200767f67..24b3448d961 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1430,6 +1430,35 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
   return IFN_LAST;
 }
 
+/* Return TRUE when the type size is legal for the call vectorizer,
+   or FALSE.
+   The type size of both the vectype_in and vectype_out should be
+   exactly the same when vectype_out isn't participating the optab.
+   While there is no restriction for type size when vectype_out
+   is part of the optab query.
+ */
+static bool
+vectorizable_type_size_legal_p (internal_fn ifn, tree vectype_out,
+                                tree vectype_in)
+{
+  bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
+
+  if (ifn == IFN_LAST || !direct_internal_fn_p (ifn))
+    return same_size_p;
+
+  const direct_internal_fn_info &difn_info = direct_internal_fn (ifn);
+
+  if (!difn_info.vectorizable)
+    return same_size_p;
+
+  /* According to vectorizable_internal_function, the type0/1 < 0 indicates
+     the vectype_out participating the optable selection.  Aka the type size
+     check can be skipped here.  */
+  if (difn_info.type0 < 0 || difn_info.type1 < 0)
+    return true;
+
+  return same_size_p;
+}
 
 static tree permute_vec_elements (vec_info *, tree, tree, tree, stmt_vec_info,
                                   gimple_stmt_iterator *);
@@ -3361,19 +3390,6 @@ vectorizable_call (vec_info *vinfo,
       return false;
     }
 
-  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
-     just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
-     are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
-     by a pack of the two vectors into an SI vector.  We would need
-     separate code to handle direct VnDI->VnSI IFN_CTZs.  */
-  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
-    {
-      if (dump_enabled_p ())
-        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                         "mismatched vector sizes %T and %T\n",
-                         vectype_in, vectype_out);
-      return false;
-    }
 
   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
       != VECTOR_BOOLEAN_TYPE_P (vectype_in))
@@ -3431,6 +3447,15 @@ vectorizable_call (vec_info *vinfo,
         ifn = vectorizable_internal_function (cfn, callee, vectype_out,
                                               vectype_in);
 
+  if (!vectorizable_type_size_legal_p (ifn, vectype_out, vectype_in))
+    {
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                         "mismatched vector sizes %T and %T\n",
+                         vectype_in, vectype_out);
+      return false;
+    }
+
   /* If that fails, try asking for a target-specific built-in function.  */
   if (ifn == IFN_LAST)
     {
-- 
2.34.1
[PATCH v4] VECT: Refine the type size restriction of call vectorizer
From: Pan Li

Update in v4:

* Append the check to vectorizable_internal_function.

Update in v3:

* Add func to predicate type size is legal or not for vectorizer call.

Update in v2:

* Fix one ICE of type assertion.
* Adjust some test cases for aarch64 sve and riscv vector.

Original log:

vectorizable_call has one restriction on the size of the data types: the
input and output vector types must have the same size. Aka DF to DI is
allowed but SF to DI isn't. You may see the message below when trying to
vectorize a function call like lrintf.

  void
  test_lrintf (long *out, float *in, unsigned count)
  {
    for (unsigned i = 0; i < count; i++)
      out[i] = __builtin_lrintf (in[i]);
  }

  lrintf.c:5:26: missed: couldn't vectorize loop
  lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern like lrintmn2 cannot work for
different data type sizes like SF => DI. This patch refines this data
type size check and conditionally unblocks standard names like lrintmn2.

The type size of vectype_out must be exactly the same as the type size
of vectype_in when vectype_out does not participate in the optab
selection. There is no such restriction when vectype_out is part of the
optab query.

The following tests pass with this patch.

* The risc-v regression tests.
* Ensure the lrintf standard name works on risc-v.

The following tests are ongoing.

* The x86 bootstrap and regression test.
* The aarch64 regression test.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_internal_function): Add type
	size check for when vectype_out doesn't participate in the optab
	query.
	(vectorizable_call): Remove the type size check.

Signed-off-by: Pan Li
---
 gcc/tree-vect-stmts.cc | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a9200767f67..799b4ab10c7 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1420,8 +1420,17 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
       const direct_internal_fn_info &info = direct_internal_fn (ifn);
       if (info.vectorizable)
         {
+          bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
           tree type0 = (info.type0 < 0 ? vectype_out : vectype_in);
           tree type1 = (info.type1 < 0 ? vectype_out : vectype_in);
+
+          /* The type size of both the vectype_in and vectype_out should be
+             exactly the same when vectype_out isn't participating the optab.
+             While there is no restriction for type size when vectype_out
+             is part of the optab query.  */
+          if (type0 != vectype_out && type1 != vectype_out && !same_size_p)
+            return IFN_LAST;
+
           if (direct_internal_fn_supported_p (ifn, tree_pair (type0, type1),
                                               OPTIMIZE_FOR_SPEED))
             return ifn;
@@ -3361,19 +3370,6 @@ vectorizable_call (vec_info *vinfo,
       return false;
     }
 
-  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
-     just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
-     are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
-     by a pack of the two vectors into an SI vector.  We would need
-     separate code to handle direct VnDI->VnSI IFN_CTZs.  */
-  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
-    {
-      if (dump_enabled_p ())
-        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                         "mismatched vector sizes %T and %T\n",
-                         vectype_in, vectype_out);
-      return false;
-    }
 
   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
       != VECTOR_BOOLEAN_TYPE_P (vectype_in))
-- 
2.34.1
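The size rule added to vectorizable_internal_function in v4 can be restated as a small standalone predicate. This is a simplified sketch under stated assumptions, not the GCC implementation: `FnInfo` is a hypothetical stand-in for direct_internal_fn_info, vector types are reduced to their sizes in bits, and the `{-1, 0}` and `{0, 0}` index pairs used to exercise it are illustrative values rather than entries taken from internal-fn.def.

```cpp
#include <cassert>

// Hypothetical stand-in for direct_internal_fn_info: a negative index means
// "this optab type is the output vector type", otherwise the input type.
struct FnInfo
{
  int type0;
  int type1;
  bool vectorizable;
};

// Sketch of the v4 condition: the call is rejected only when neither optab
// type comes from the output vector type and the two vector sizes differ.
bool
type_size_ok (const FnInfo &info, unsigned in_bits, unsigned out_bits)
{
  assert (in_bits > 0 && out_bits > 0);
  if (!info.vectorizable)
    return false;

  bool same_size_p = in_bits == out_bits;
  bool out_in_query_p = info.type0 < 0 || info.type1 < 0;
  return out_in_query_p || same_size_p;
}
```

With a convert-like pair such as `{-1, 0}` a size mismatch is accepted, because the optab query itself sees both types; with `{0, 0}` the old same-size requirement still applies.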
[PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720]
From: Pan Li

extract_low_bits only tries scalar modes when the bitsizes of mode and
src_mode are not equal. When get_stored_val in DSE passes a vector mode,
it always fails and returns NULL_RTX.

This patch allows vector modes in extract_low_bits if and only if the
size of mode is less than or equal to the size of src_mode.

Given the example code below with
--param=riscv-autovec-preference=fixed-vlmax.

  vuint8m1_t test ()
  {
    uint8_t arr[32] = {
      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
    };

    return __riscv_vle8_v_u8m1(arr, 32);
  }

Before this patch:

  test:
    lui     a5,%hi(.LANCHOR0)
    addi    sp,sp,-32
    addi    a5,a5,%lo(.LANCHOR0)
    li      a3,32
    vl2re64.v v2,0(a5)
    vsetvli zero,a3,e8,m1,ta,ma
    vs2r.v  v2,0(sp)   <== Unnecessary store to stack
    vle8.v  v1,0(sp)   <== Ditto
    vs1r.v  v1,0(a0)
    addi    sp,sp,32
    jr      ra

After this patch:

  test:
    lui     a5,%hi(.LANCHOR0)
    addi    a5,a5,%lo(.LANCHOR0)
    li      a4,32
    addi    sp,sp,-32
    vsetvli zero,a4,e8,m1,ta,ma
    vle8.v  v1,0(a5)
    vs1r.v  v1,0(a0)
    addi    sp,sp,32
    jr      ra

The following tests pass with this patch:

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression test.

	PR target/111720

gcc/ChangeLog:

	* expmed.cc (extract_low_bits): Allow vector mode if the mode size
	is less than or equal to src_mode.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.
Signed-off-by: Pan Li --- gcc/expmed.cc | 44 --- .../gcc.target/riscv/rvv/base/pr111720-0.c| 18 .../gcc.target/riscv/rvv/base/pr111720-1.c| 18 .../gcc.target/riscv/rvv/base/pr111720-10.c | 18 .../gcc.target/riscv/rvv/base/pr111720-2.c| 18 .../gcc.target/riscv/rvv/base/pr111720-3.c| 18 .../gcc.target/riscv/rvv/base/pr111720-4.c| 18 .../gcc.target/riscv/rvv/base/pr111720-5.c| 18 .../gcc.target/riscv/rvv/base/pr111720-6.c| 18 .../gcc.target/riscv/rvv/base/pr111720-7.c| 21 + .../gcc.target/riscv/rvv/base/pr111720-8.c| 18 .../gcc.target/riscv/rvv/base/pr111720-9.c| 15 +++ 12 files changed, 227 insertions(+), 15 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c diff --git a/gcc/expmed.cc b/gcc/expmed.cc index b294eabb08d..5db83fe638c 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -2403,8 +2403,6 @@ extract_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode, rtx extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src) { - scalar_int_mode int_mode, src_int_mode; - if (mode == src_mode) return src; @@ -2437,22 +2435,38 @@ extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src) return x; } - if (!int_mode_for_mode (src_mode).exists (&src_int_mode) - || !int_mode_for_mode (mode).exists (&int_mode)) 
-return NULL_RTX; + if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode)) +{ + if (maybe_gt (GET_MODE_BITSIZE (mode), GET_MODE_BITSIZE (src_mode)) + || !targetm.modes_tieable_p (mode, src_mode)) + return NULL_RTX; - if (!targetm.modes_tieable_p (src_int_mode, src_mode)) -return NULL_RTX; - if (!targetm.modes_tieable_p (int_mode, mode)) -return NULL_RTX; + /* For vector mode, only the bitsize (mode) <= bitsize (src_mode) and +tieable is allowed here. */ + src = gen_lowpart (mode, src); +} + else +{
[PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator
From: Pan Li

The previous rounding APIs prefixed with i/l/ll only work when the
source and destination modes have the same size, for example as below,
and we arranged the iterator similar to fcvt.

* SF => SI
* DF => DI

Now that the middle-end has relaxed this limitation, these APIs can also
be vectorized with different type sizes, aka:

* HF => SI, HF => DI
* SF => DI, SF => SI
* DF => SI, DF => DI

The existing iterator cannot take care of this simply, so this patch
re-arranges the iterator into two items.

* V_VLS_F_CONVERT_SI: handle (HF, SF, DF) => SI
* V_VLS_F_CONVERT_DI: handle (HF, SF, DF) => DI

The related mode_attr are adjusted as well to reconcile the new
iterators.

gcc/ChangeLog:

	* config/riscv/autovec.md (lrint2): Remove.
	(lround2): Ditto.
	(lceil2): Ditto.
	(lfloor2): Ditto.
	(lrint2): New pattern for cvt from FP to SI.
	(lround2): Ditto.
	(lceil2): Ditto.
	(lfloor2): Ditto.
	(lrint2): New pattern for cvt from FP to DI.
	(lround2): Ditto.
	(lceil2): Ditto.
	(lfloor2): Ditto.
	* config/riscv/vector-iterators.md: Renew iterators for both the
	SI and DI.
Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 72 +++--- gcc/config/riscv/vector-iterators.md | 199 --- 2 files changed, 237 insertions(+), 34 deletions(-) diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index f5e3e347ace..81acb1a815b 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2395,42 +2395,82 @@ (define_expand "roundeven2" } ) -(define_expand "lrint2" - [(match_operand:0 "register_operand") - (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] +(define_expand "lrint2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")] "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" { -riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode); +riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode); DONE; } ) -(define_expand "lround2" - [(match_operand:0 "register_operand") - (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] +(define_expand "lrint2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")] "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" { -riscv_vector::expand_vec_lround (operands[0], operands[1], mode, mode); +riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode); DONE; } ) -(define_expand "lceil2" - [(match_operand:0 "register_operand") - (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] +(define_expand "lround2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")] "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" { -riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, mode); +riscv_vector::expand_vec_lround (operands[0], operands[1], mode, mode); DONE; } ) -(define_expand "lfloor2" - [(match_operand:0 "register_operand") - (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] +(define_expand "lround2" + [(match_operand: 0 
"register_operand") + (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")] "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" { -riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, mode); +riscv_vector::expand_vec_lround (operands[0], operands[1], mode, mode); +DONE; + } +) + +(define_expand "lceil2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" + { +riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, mode); +DONE; + } +) + +(define_expand "lceil2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" + { +riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, mode); +DONE; + } +) + +(define_expand "lfloor2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" + { +riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, mode); +DONE; + } +) + +(define_expand "lfloor2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" + { +riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, mode); DONE; } ) diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index d9b5dec5edb..f2d9f60b631 100644 --- a/gcc/config/riscv/vector-iter
[PATCH v2] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator
From: Pan Li

Update in v2:

* Add mode size equal check to disable different mode size when expand,
  because the underlying codegen is not implemented yet.

Original log:

The previous rounding APIs prefixed with i/l/ll only work when the
source and destination modes have the same size, for example as below,
and we arranged the iterator similar to fcvt.

* SF => SI
* DF => DI

Now that the middle-end has relaxed this limitation, these APIs can also
be vectorized with different type sizes, aka:

* HF => SI, HF => DI
* SF => DI, SF => SI
* DF => SI, DF => DI

The existing iterator cannot take care of this simply, so this patch
re-arranges the iterator into two items.

* V_VLS_F_CONVERT_SI: handle (HF, SF, DF) => SI
* V_VLS_F_CONVERT_DI: handle (HF, SF, DF) => DI

The related mode_attr are adjusted as well to reconcile the new
iterators.

gcc/ChangeLog:

	* config/riscv/autovec.md (lrint2): Remove.
	(lround2): Ditto.
	(lceil2): Ditto.
	(lfloor2): Ditto.
	(lrint2): New pattern for cvt from FP to SI.
	(lround2): Ditto.
	(lceil2): Ditto.
	(lfloor2): Ditto.
	(lrint2): New pattern for cvt from FP to DI.
	(lround2): Ditto.
	(lceil2): Ditto.
	(lfloor2): Ditto.
	* config/riscv/vector-iterators.md: Renew iterators for both the
	SI and DI.

Signed-off-by: Pan Li
---
 gcc/config/riscv/autovec.md          | 90 +---
 gcc/config/riscv/vector-iterators.md | 199 ---
 2 files changed, 251 insertions(+), 38 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index f5e3e347ace..cc4c9596bbf 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2395,42 +2395,92 @@ (define_expand "roundeven2"
   }
 )
 
-(define_expand "lrint2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+;; Add mode_size equal check as we opened the modes for different sizes.
+;; The check will be removed soon after related codegen implemented +(define_expand "lrint2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math +&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (mode))" { -riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode); +riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode); DONE; } ) -(define_expand "lround2" - [(match_operand:0 "register_operand") - (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] - "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" +(define_expand "lrint2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math +&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (mode))" { -riscv_vector::expand_vec_lround (operands[0], operands[1], mode, mode); +riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode); DONE; } ) -(define_expand "lceil2" - [(match_operand:0 "register_operand") - (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] - "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" +(define_expand "lround2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math +&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (mode))" { -riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, mode); +riscv_vector::expand_vec_lround (operands[0], operands[1], mode, mode); DONE; } ) -(define_expand "lfloor2" - [(match_operand:0 "register_operand") - (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")] - "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" +(define_expand "lround2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")] + "TARGET_VECTOR 
&& !flag_trapping_math && !flag_rounding_math +&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (mode))" + { +riscv_vector::expand_vec_lround (operands[0], operands[1], mode, mode); +DONE; + } +) + +(define_expand "lceil2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math +&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (mode))" + { +riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, mode); +DONE; + } +) + +(define_expand "lceil2" + [(match_operand: 0 "register_operand") + (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")] + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math +&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (mode))
[PATCH v1] RISC-V: Remove HF modes of FP to INT rounding autovec
From: Pan Li

The [i|l|ll][rint|round|ceil|floor] internal functions are defined with
DEF_INTERNAL_FLT_FN instead of DEF_INTERNAL_FLT_FLOATN_FN. As a result,
the *f16 (N=16 of FLOATN) variants of these functions are not available
when vectorizable_call tries to get the ifn from the given cfn. Aka:

  BUILT_IN_LRINTF16 => IFN_LAST (should be IFN_LRINT here)
  BUILT_IN_RINTF16 => IFN_RINT

It is better to remove the FP16 related modes until the additional
middle-end support is ready. This patch removes the FP16 modes and adds
some comments.

gcc/ChangeLog:

	* config/riscv/vector-iterators.md: Remove HF modes.

Signed-off-by: Pan Li
---
 gcc/config/riscv/vector-iterators.md | 59 +---
 1 file changed, 2 insertions(+), 57 deletions(-)

diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index f2d9f60b631..e80eaedc4b3 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -3221,20 +3221,15 @@ (define_mode_attr vnnconvert [
 ;; V_F2SI_CONVERT: (HF, SF, DF) => SI
 ;; V_F2DI_CONVERT: (HF, SF, DF) => DI
 ;;
+;; HF requires additional support from internal function, aka
+;; gcc/internal-fn.def, remove HF shortly until the middle-end is ready.
(define_mode_attr V_F2SI_CONVERT [ - (RVVM4HF "RVVM8SI") (RVVM2HF "RVVM4SI") (RVVM1HF "RVVM2SI") - (RVVMF2HF "RVVM1SI") (RVVMF4HF "RVVMF2SI") - (RVVM8SF "RVVM8SI") (RVVM4SF "RVVM4SI") (RVVM2SF "RVVM2SI") (RVVM1SF "RVVM1SI") (RVVMF2SF "RVVMF2SI") (RVVM8DF "RVVM4SI") (RVVM4DF "RVVM2SI") (RVVM2DF "RVVM1SI") (RVVM1DF "RVVMF2SI") - (V1HF "V1SI") (V2HF "V2SI") (V4HF "V4SI") (V8HF "V8SI") (V16HF "V16SI") - (V32HF "V32SI") (V64HF "V64SI") (V128HF "V128SI") (V256HF "V256SI") - (V512HF "V512SI") (V1024HF "V1024SI") - (V1SF "V1SI") (V2SF "V2SI") (V4SF "V4SI") (V8SF "V8SI") (V16SF "V16SI") (V32SF "V32SI") (V64SF "V64SI") (V128SF "V128SI") (V256SF "V256SI") (V512SF "V512SI") (V1024SF "V1024SI") @@ -3245,19 +3240,12 @@ (define_mode_attr V_F2SI_CONVERT [ ]) (define_mode_attr v_f2si_convert [ - (RVVM4HF "rvvm8si") (RVVM2HF "rvvm4si") (RVVM1HF "rvvm2si") - (RVVMF2HF "rvvm1si") (RVVMF4HF "rvvmf2si") - (RVVM8SF "rvvm8si") (RVVM4SF "rvvm4si") (RVVM2SF "rvvm2si") (RVVM1SF "rvvm1si") (RVVMF2SF "rvvmf2si") (RVVM8DF "rvvm4si") (RVVM4DF "rvvm2si") (RVVM2DF "rvvm1si") (RVVM1DF "rvvmf2si") - (V1HF "v1si") (V2HF "v2si") (V4HF "v4si") (V8HF "v8si") (V16HF "v16si") - (V32HF "v32si") (V64HF "v64si") (V128HF "v128si") (V256HF "v256si") - (V512HF "v512si") (V1024HF "v1024si") - (V1SF "v1si") (V2SF "v2si") (V4SF "v4si") (V8SF "v8si") (V16SF "v16si") (V32SF "v32si") (V64SF "v64si") (V128SF "v128si") (V256SF "v256si") (V512SF "v512si") (V1024SF "v1024si") @@ -3268,9 +3256,6 @@ (define_mode_attr v_f2si_convert [ ]) (define_mode_iterator V_VLS_F_CONVERT_SI [ - (RVVM4HF "TARGET_ZVFH") (RVVM2HF "TARGET_ZVFH") (RVVM1HF "TARGET_ZVFH") - (RVVMF2HF "TARGET_ZVFH") (RVVMF4HF "TARGET_ZVFH && TARGET_MIN_VLEN > 32") - (RVVM8SF "TARGET_VECTOR_ELEN_FP_32") (RVVM4SF "TARGET_VECTOR_ELEN_FP_32") (RVVM2SF "TARGET_VECTOR_ELEN_FP_32") (RVVM1SF "TARGET_VECTOR_ELEN_FP_32") (RVVMF2SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32") @@ -3280,18 +3265,6 @@ (define_mode_iterator V_VLS_F_CONVERT_SI [ (RVVM2DF 
"TARGET_VECTOR_ELEN_FP_64") (RVVM1DF "TARGET_VECTOR_ELEN_FP_64") - (V1HF "riscv_vector::vls_mode_valid_p (V1HFmode) && TARGET_ZVFH") - (V2HF "riscv_vector::vls_mode_valid_p (V2HFmode) && TARGET_ZVFH") - (V4HF "riscv_vector::vls_mode_valid_p (V4HFmode) && TARGET_ZVFH") - (V8HF "riscv_vector::vls_mode_valid_p (V8HFmode) && TARGET_ZVFH") - (V16HF "riscv_vector::vls_mode_valid_p (V16HFmode) && TARGET_ZVFH") - (V32HF "riscv_vector::vls_mode_valid_p (V32HFmode) && TARGET_ZVFH && TARGET_MIN_VLEN >= 64") - (V64HF "riscv_vector::vls_mode_valid_p (V64HFmode) && TARGET_ZVFH && TARGET_MIN_VLEN >= 128") - (V128HF "riscv_vector::vls_mode_valid_p (V128HFmode) && TARGET_ZVFH && TARGET_MIN_VLEN >= 256") - (V256HF "riscv_vector::vls_mode_valid_p (V256HFmode) && TARGET_ZVFH && TARGET_MIN_VLEN >= 512") - (V512HF "riscv_vector::vls_mode_valid_p (V512HFmode) && TARGET_ZVFH && TARGET_MIN_VLEN >= 1024") - (V1024HF "riscv_vector::vls_mode_valid_p (V1024HFmode) && TARGET_ZVFH && TARGET_MIN_VLEN >= 2048") - (V1SF "riscv_vector::vls_mode_valid_p (V1SFmode) && TARGET_VECTOR_ELEN_FP_32") (V2SF "riscv_vector::vls_mode_valid_p (V2SFmode) && TARGET_VECTOR_ELEN_FP_32") (V4SF "riscv_vector::vls_mode_valid_p (V4SFmode) && TARGET_VECTOR_ELEN_FP_32") @@ -3317,19 +3290,12 @@ (define_mode_iterator V_VLS_F_CONVERT_SI [ ]) (define_mode_attr V_F2DI_CONVERT [ - (RVVM2HF "RVVM8DI") (RVVM1HF "RVVM4DI") (RVVMF2HF "RVVM2DI") - (RVVMF4HF "RVVM1DI") - (RVVM4SF "RVVM8DI") (RVVM2SF "RVVM4DI") (RVVM1SF "RVVM2DI") (RVVMF2SF "RVVM1DI") (RVVM8DF "RVVM8DI") (RVVM4DF "RVVM4DI") (RVVM2DF "RVVM2DI") (RVVM1DF "RVVM1DI") - (V1HF "V1DI") (V2HF "V2DI") (V
[PATCH v1] RISC-V: Support FP rint to i/l/ll diff size autovec
From: Pan Li

This patch supports auto-vectorization of the FP APIs below, where the
source and destination types differ in size.

+---------+----------+----------+
| API     | RV64     | RV32     |
+---------+----------+----------+
| irint   | DF => SI | DF => SI |
| irintf  | -        | -        |
| lrint   | -        | DF => SI |
| lrintf  | SF => DI | -        |
| llrint  | -        | -        |
| llrintf | SF => DI | SF => DI |
+---------+----------+----------+

Given below code:

void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf (in[i]);
}

Before this patch:

test_lrintf:
  beq       a2,zero,.L8
  slli      a5,a2,32
  srli      a2,a5,30
  add       a4,a1,a2
.L3:
  flw       fa5,0(a1)
  addi      a1,a1,4
  addi      a0,a0,8
  fcvt.l.s  a5,fa5,dyn
  sd        a5,-8(a0)
  bne       a1,a4,.L3

After this patch:

test_lrintf:
  beq       a2,zero,.L8
  slli      a2,a2,32
  srli      a2,a2,32
.L3:
  vsetvli   a5,a2,e32,mf2,ta,ma
  vle32.v   v2,0(a1)
  slli      a3,a5,2
  slli      a4,a5,3
  vfwcvt.x.f.v v1,v2
  sub       a2,a2,a5
  vse64.v   v1,0(a0)
  add       a1,a1,a3
  add       a0,a0,a4
  bne       a2,zero,.L3

Unfortunately, the HF mode is not included because it requires
additional middle-end support in internal-fn.def.

gcc/ChangeLog:

	* config/riscv/autovec.md: Remove the size check of lrint.
	* config/riscv/riscv-v.cc (emit_vec_narrow_cvt_x_f): New help
	emit func impl.
	(emit_vec_widden_cvt_x_f): New help emit func impl.
	(emit_vec_rounding_to_integer): New func impl to emit the
	rounding from FP to integer.
	(expand_vec_lrint): Leverage emit_vec_rounding_to_integer.
	* config/riscv/vector.md: Take V_VLSF for vfncvt.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c:
	* gcc.target/riscv/rvv/autovec/unop/math-irint-1.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-irintf-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-llrintf-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-llrintf-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lrintf-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrintf-rv64-run-0.c: New test. * gcc.target/riscv/rvv/autovec/vls/math-irint-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/math-llrintf-0.c: New test. * gcc.target/riscv/rvv/autovec/vls/math-lrint-rv32-0.c: New test. * gcc.target/riscv/rvv/autovec/vls/math-lrintf-rv64-0.c: New test. Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 6 +- gcc/config/riscv/riscv-v.cc | 46 +- gcc/config/riscv/vector.md| 2 +- .../riscv/rvv/autovec/unop/math-irint-1.c | 13 +++ .../riscv/rvv/autovec/unop/math-irint-run-0.c | 92 +-- .../rvv/autovec/unop/math-irintf-run-0.c | 63 + .../riscv/rvv/autovec/unop/math-llrintf-0.c | 13 +++ .../rvv/autovec/unop/math-llrintf-run-0.c | 63 + .../rvv/autovec/unop/math-lrint-rv32-0.c | 13 +++ .../rvv/autovec/unop/math-lrint-rv32-run-0.c | 63 + .../rvv/autovec/unop/math-lrintf-rv64-0.c | 13 +++ .../rvv/autovec/unop/math-lrintf-rv64-run-0.c | 63 + .../riscv/rvv/autovec/vls/math-irint-1.c | 30 ++ .../riscv/rvv/autovec/vls/math-llrintf-0.c| 30 ++ .../riscv/rvv/autovec/vls/math-lrint-rv32-0.c | 30 ++ .../rvv/autovec/vls/math-lrintf-rv64-0.c | 30 ++ 16 files changed, 514 insertions(+), 56 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irintf-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrintf-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrintf-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrintf-rv64-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrintf-rv64-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-irint-1.c create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llrintf-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lrint-rv32-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lrintf-rv64-0.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index cc4c9596bbf..f1f0523d1de 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/a
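The API table in the commit message of this patch follows from the C data model: `long` is 64 bits on RV64 (LP64) and 32 bits on RV32 (ILP32), while `int` is 32 bits and `long long` is 64 bits on both. A minimal sketch of that reasoning is below. It is not code from the patch; `classify_cvt`, `long_bits` and the `cvt_kind` enum are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Hypothetical helper, not part of the patch: classify an FP => integer
   rounding conversion by comparing element widths in bits.  */
enum cvt_kind { CVT_SAME_SIZE, CVT_WIDEN, CVT_NARROW };

static enum cvt_kind
classify_cvt (unsigned fp_bits, unsigned int_bits)
{
  assert (fp_bits == 32 || fp_bits == 64);
  assert (int_bits == 32 || int_bits == 64);

  if (int_bits > fp_bits)
    return CVT_WIDEN;    /* e.g. SF => DI, a widening convert.  */
  if (int_bits < fp_bits)
    return CVT_NARROW;   /* e.g. DF => SI, a narrowing convert.  */
  return CVT_SAME_SIZE;  /* Shown as "-" in the table.  */
}

/* Width of 'long' for a given XLEN: ILP32 on RV32, LP64 on RV64.  */
static unsigned
long_bits (unsigned xlen)
{
  assert (xlen == 32 || xlen == 64);
  return xlen;
}
```

For example, `classify_cvt (32, long_bits (64))` is the lrintf row on RV64 (SF => DI, widening), while `classify_cvt (64, long_bits (32))` is the lrint row on RV32 (DF => SI, narrowing). The same-size rows are the ones this patch leaves untouched.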
[PATCH v1] RISC-V: Adjust FP rint round tests for RV32
From: Pan Li

The FP rint test cases for RV32 need some additional adjustment of
types and data.  This patch fixes that, which was missed by mistake in
the FP rint support patch for RV32 only.

Please note that math-llrintf-run-0.c will trigger one ICE in the
vsetvl pass on RV32 only.

./riscv32-unknown-elf-gcc -march=rv32gcv -mabi=ilp32d \
  -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math \
  gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrintf-run-0.c \
  -o test.elf -lm

This results in an ICE similar to the one below; a bugzilla entry will
be filed for it.

config/riscv/riscv-v.cc:4314
   65 | }
      | ^
0x1fa5223 riscv_vector::validate_change_or_fail(rtx_def*, rtx_def**, rtx_def*, bool)
  /home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-v.cc:4314
0x1fb1aa2 pre_vsetvl::remove_avl_operand()
  /home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3342
0x1fb18c1 pre_vsetvl::cleaup()
  /home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3308
0x1fb216d pass_vsetvl::lazy_vsetvl()
  /home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3480
0x1fb2214 pass_vsetvl::execute(function*)
  /home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3504

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c: Adjust
	test cases.
	* gcc.target/riscv/rvv/autovec/unop/math-llrintf-run-0.c: Ditto.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-run-0.c: Ditto.
Signed-off-by: Pan Li --- .../riscv/rvv/autovec/unop/math-irint-run-0.c | 94 +- .../rvv/autovec/unop/math-llrintf-run-0.c | 98 ++- .../rvv/autovec/unop/math-lrint-rv32-run-0.c | 88 - 3 files changed, 141 insertions(+), 139 deletions(-) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c index 43bc0849695..aae1d95c2b6 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c @@ -5,59 +5,59 @@ #define ARRAY_SIZE 128 -float in[ARRAY_SIZE]; -long out[ARRAY_SIZE]; -long ref[ARRAY_SIZE]; +double in[ARRAY_SIZE]; +int out[ARRAY_SIZE]; +int ref[ARRAY_SIZE]; -TEST_UNARY_CALL_CVT (float, long, __builtin_lrintf) -TEST_ASSERT (long) +TEST_UNARY_CALL_CVT (double, int, __builtin_irint) +TEST_ASSERT (int) -TEST_INIT_CVT (float, 1.2, long, __builtin_lrintf (1.2), 1) -TEST_INIT_CVT (float, -1.2, long, __builtin_lrintf (-1.2), 2) -TEST_INIT_CVT (float, 0.5, long, __builtin_lrintf (0.5), 3) -TEST_INIT_CVT (float, -0.5, long, __builtin_lrintf (-0.5), 4) -TEST_INIT_CVT (float, 0.1, long, __builtin_lrintf (0.1), 5) -TEST_INIT_CVT (float, -0.1, long, __builtin_lrintf (-0.1), 6) -TEST_INIT_CVT (float, 3.0, long, __builtin_lrintf (3.0), 7) -TEST_INIT_CVT (float, -3.0, long, __builtin_lrintf (-3.0), 8) -TEST_INIT_CVT (float, 4503599627370495.5, long, __builtin_lrintf (4503599627370495.5), 9) -TEST_INIT_CVT (float, 4503599627370497.0, long, __builtin_lrintf (4503599627370497.0), 10) -TEST_INIT_CVT (float, -4503599627370495.5, long, __builtin_lrintf (-4503599627370495.5), 11) -TEST_INIT_CVT (float, -4503599627370496.0, long, __builtin_lrintf (-4503599627370496.0), 12) -TEST_INIT_CVT (float, 0.0, long, __builtin_lrintf (-0.0), 13) -TEST_INIT_CVT (float, -0.0, long, __builtin_lrintf (-0.0), 14) -TEST_INIT_CVT (float, 9223372036854774784.0, long, __builtin_lrintf (9223372036854774784.0), 15) -TEST_INIT_CVT 
(float, 9223372036854775808.0, long, __builtin_lrintf (9223372036854775808.0), 16) -TEST_INIT_CVT (float, -9223372036854775808.0, long, __builtin_lrintf (-9223372036854775808.0), 17) -TEST_INIT_CVT (float, -9223372036854777856.0, long, __builtin_lrintf (-9223372036854777856.0), 18) -TEST_INIT_CVT (float, __builtin_inf (), long, __builtin_lrintf (__builtin_inf ()), 19) -TEST_INIT_CVT (float, -__builtin_inf (), long, __builtin_lrintf (-__builtin_inf ()), 20) -TEST_INIT_CVT (float, __builtin_nan (""), long, 0x7fff, 21) +TEST_INIT_CVT (double, 1.2, int, __builtin_irint (1.2), 1) +TEST_INIT_CVT (double, -1.2, int, __builtin_irint (-1.2), 2) +TEST_INIT_CVT (double, 0.5, int, __builtin_irint (0.5), 3) +TEST_INIT_CVT (double, -0.5, int, __builtin_irint (-0.5), 4) +TEST_INIT_CVT (double, 0.1, int, __builtin_irint (0.1), 5) +TEST_INIT_CVT (double, -0.1, int, __builtin_irint (-0.1), 6) +TEST_INIT_CVT (double, 3.0, int, __builtin_irint (3.0), 7) +TEST_INIT_CVT (double, -3.0, int, __builtin_irint (-3.0), 8) +TEST_INIT_CVT (double, 4503599627370495.5, int, __builtin_irint (4503599627370495.5), 9) +TEST_INIT_CVT (double, 4503599627370497.0, int, __builtin_irint (4503599627370497.0), 10) +TEST_INIT_CVT (double, -4503599
[PATCH v1] RISC-V: Support FP round to i/l/ll diff size autovec
From: Pan Li

This patch supports auto-vectorization of the FP APIs below, where the
source and destination types differ in size.

+----------+----------+----------+
| API      | RV64     | RV32     |
+----------+----------+----------+
| iround   | DF => SI | DF => SI |
| iroundf  | -        | -        |
| lround   | -        | DF => SI |
| lroundf  | SF => DI | -        |
| llround  | -        | -        |
| llroundf | SF => DI | SF => DI |
+----------+----------+----------+

Given below code:

void
test_lroundf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lroundf (in[i]);
}

Before this patch:

.L3:
  flw       fa5,0(a1)
  addi      a1,a1,4
  addi      a0,a0,8
  fcvt.l.s  a5,fa5,rmm
  sd        a5,-8(a0)
  bne       a4,a1,.L3

After this patch:

  fsrmi     4 // RMM rounding mode
  vsetivli  zero,16,e32,m4,ta,ma
.L4:
  vle32.v   v4,0(a5)
  addi      a5,a5,64
  vfwcvt.x.f.v v8,v4
  vse64.v   v8,0(a4)
  addi      a4,a4,128
  bne       a3,a5,.L4
  andi      a5,a2,15
  andi      a4,a2,-16
  beq       a5,zero,.L16

Unfortunately, the HF mode is not included because it requires
additional middle-end support in internal-fn.def.

gcc/ChangeLog:

	* config/riscv/autovec.md: Remove the size check of lround.
	* config/riscv/riscv-v.cc (expand_vec_lround): Leverage
	emit_vec_rounding_to_integer for round.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/math-iround-1.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-iround-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-llroundf-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-llroundf-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lround-rv32-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lround-rv32-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lroundf-rv64-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lroundf-rv64-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-iround-1.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-llroundf-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-lround-rv32-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-lroundf-rv64-0.c: New test.
Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 6 +- gcc/config/riscv/riscv-v.cc | 8 +- .../riscv/rvv/autovec/unop/math-iround-1.c| 18 .../rvv/autovec/unop/math-iround-run-1.c | 83 ++ .../riscv/rvv/autovec/unop/math-llroundf-0.c | 19 + .../rvv/autovec/unop/math-llroundf-run-0.c| 84 +++ .../rvv/autovec/unop/math-lround-rv32-0.c | 18 .../rvv/autovec/unop/math-lround-rv32-run-0.c | 83 ++ .../rvv/autovec/unop/math-lroundf-rv64-0.c| 18 .../autovec/unop/math-lroundf-rv64-run-0.c| 84 +++ .../riscv/rvv/autovec/vls/math-iround-1.c | 27 ++ .../riscv/rvv/autovec/vls/math-llroundf-0.c | 27 ++ .../rvv/autovec/vls/math-lround-rv32-0.c | 27 ++ .../rvv/autovec/vls/math-lroundf-rv64-0.c | 27 ++ 14 files changed, 520 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llroundf-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llroundf-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-rv32-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-rv32-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf-rv64-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf-rv64-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-iround-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llroundf-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lround-rv32-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lroundf-rv64-0.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index f1f0523d1de..d1804d82552 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2420,8 +2420,7 @@ 
(define_expand "lrint2" (define_expand "lround2" [(match_operand: 0 "register_operand") (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")] - "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math -&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (mode))" + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" { riscv_vector::expand_vec_lround (operands[0], operands[1], mode, mode); DONE; @@ -24
[PATCH v1] RISC-V: Cleanup unused code in riscv_v_adjust_bytesize [NFC]
From: Pan Li

Clean up the mode_size related code, which is not used anymore.

The tests below pass with this patch.

* The RVV full regression test.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_v_adjust_bytesize): Cleanup
	unused mode_size related code.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv.cc | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 56cd8d2c23f..691d967de29 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1527,10 +1527,6 @@ riscv_v_adjust_bytesize (machine_mode mode, int scale)
     return BYTES_PER_RISCV_VECTOR;
 
   poly_int64 nunits = GET_MODE_NUNITS (mode);
-  poly_int64 mode_size = GET_MODE_SIZE (mode);
-
-  if (maybe_eq (mode_size, (uint16_t) -1))
-    mode_size = riscv_vector_chunks * scale;
 
   if (nunits.coeffs[0] > 8)
     return exact_div (nunits, 8);
-- 
2.34.1
[PATCH v1] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV
From: Pan Li

This patch introduces one new gcc attribute for RVV.  The attribute is
used to define fixed-length variants of the existing sizeless RVV
types.

The attribute is valid if and only if -mrvv-vector-bits=zvl is given.
Its only argument must be an integer constant, whose value is
determined by the LMUL of the type and the vector register bits given
by zvl*b.  For example:

typedef vint32m2_t fixed_vint32m2_t __attribute__((riscv_rvv_vector_bits(128)));

The above type definition is valid with -march=rv64gc_zve64d_zvl64b
(aka 2(m2) * 64 = 128 for vint32m2_t), and reports an error similar to
the one below with -march=rv64gcv_zvl128b.

"error: invalid RVV vector size '128', expected size is '256' based on
LMUL of type and '-mrvv-vector-bits=zvl'"

For the vint*m*_t types, the operations below are allowed.

* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -

For the vfloat*m*_t types, the operations below are allowed.

* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, -

For the vbool*_t types, only the operations below are allowed; CMP and
ALU are excluded because they are not well defined on vbool*_t
currently.

* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.

The vint*x*m*_t tuple types are not supported in this patch, which is
compatible with clang.

This patch passed the testsuites below.

* The riscv full regression tests.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute):
	New static func to take care of the RVV types decorated by
	the attributes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test.
	* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test.
	* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv.cc | 88 +- .../riscv/rvv/base/riscv_rvv_vector_bits-1.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-10.c | 53 + .../riscv/rvv/base/riscv_rvv_vector_bits-11.c | 76 .../riscv/rvv/base/riscv_rvv_vector_bits-12.c | 14 +++ .../riscv/rvv/base/riscv_rvv_vector_bits-2.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-3.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-4.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-5.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-6.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-7.c | 76 .../riscv/rvv/base/riscv_rvv_vector_bits-8.c | 75 .../riscv/rvv/base/riscv_rvv_vector_bits-9.c | 76 .../riscv/rvv/base/riscv_rvv_vector_bits.h| 108 ++ 14 files changed, 600 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 56cd8d2c23f..fdbaf1633ac 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/r
[PATCH v2] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV
From: Pan Li

Update in v2:
* Cleanup some unused code.
* Fix some typos in the commit log.

Original log:

This patch introduces one new gcc attribute for RVV.  The attribute is
used to define fixed-length variants of the existing sizeless RVV
types.

The attribute is valid if and only if -mrvv-vector-bits=zvl is given.
Its only argument must be an integer constant, whose value is
determined by the LMUL of the type and the vector register bits given
by zvl*b.  For example:

typedef vint32m2_t fixed_vint32m2_t __attribute__((riscv_rvv_vector_bits(128)));

The above type definition is valid with -march=rv64gc_zve64d_zvl64b
(aka 2(m2) * 64 = 128 for vint32m2_t), and reports an error similar to
the one below with -march=rv64gcv_zvl128b.

"error: invalid RVV vector size '128', expected size is '256' based on
LMUL of type and '-mrvv-vector-bits=zvl'"

For the vint*m*_t types, the operations below are allowed.

* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -

For the vfloat*m*_t types, the operations below are allowed.

* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, -

For the vbool*_t types, only the operations below are allowed; CMP and
ALU are excluded because they are not well defined on vbool*_t
currently.

* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.

The vint*x*m*_t tuple types are not supported in this patch, which is
compatible with clang.

This patch passed the testsuites below.

* The riscv full regression tests.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute):
	New static func to take care of the RVV types decorated by
	the attributes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv.cc | 87 +- .../riscv/rvv/base/riscv_rvv_vector_bits-1.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-10.c | 53 + .../riscv/rvv/base/riscv_rvv_vector_bits-11.c | 76 .../riscv/rvv/base/riscv_rvv_vector_bits-12.c | 14 +++ .../riscv/rvv/base/riscv_rvv_vector_bits-2.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-3.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-4.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-5.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-6.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-7.c | 76 .../riscv/rvv/base/riscv_rvv_vector_bits-8.c | 75 .../riscv/rvv/base/riscv_rvv_vector_bits-9.c | 76 .../riscv/rvv/base/riscv_rvv_vector_bits.h| 108 ++ 14 files changed, 599 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c create mode 
100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc i
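The size rule quoted in the commit log ("2(m2) * 64 = 128", and the expected size '256' with zvl128b) can be written out as plain arithmetic. The sketch below is only an illustration of that rule, not the code of riscv_handle_rvv_vector_bits_attribute; the helper names are hypothetical, and the handling of fractional LMUL through a numerator/denominator pair is an assumption that the commit log does not spell out. The vbool*_t types are not modelled here.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code: the size in bits that the
   riscv_rvv_vector_bits attribute is expected to carry for a type with
   LMUL = lmul_num / lmul_den when zvl*b gives zvl_bits.  */
static unsigned
expected_rvv_vector_bits (unsigned zvl_bits, unsigned lmul_num,
                          unsigned lmul_den)
{
  assert (lmul_num != 0 && lmul_den != 0);
  return zvl_bits * lmul_num / lmul_den;
}

/* Nonzero when 'requested' matches the expected size, i.e. when the
   attribute argument would be accepted under this sketch.  */
static int
rvv_vector_bits_ok (unsigned requested, unsigned zvl_bits,
                    unsigned lmul_num, unsigned lmul_den)
{
  return requested == expected_rvv_vector_bits (zvl_bits, lmul_num,
                                                lmul_den);
}
```

With these helpers, `rvv_vector_bits_ok (128, 64, 2, 1)` holds for vint32m2_t under zvl64b, while `rvv_vector_bits_ok (128, 128, 2, 1)` does not, which corresponds to the "expected size is '256'" error quoted in the log.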
[PATCH v1] VECT: Bugfix ICE for vectorizable_store when both len and mask
From: Pan Li

This patch fixes one ICE in vectorizable_store that occurs when both
the loop_masks and the loop_lens are recorded.  The ICE looks like
below with "-march=rv64gcv -O3".

during GIMPLE pass: vect
test.c: In function ‘d’:
test.c:6:6: internal compiler error: in vectorizable_store, at
tree-vect-stmts.cc:8691
    6 | void d() {
      | ^
0x37a6f2f vectorizable_store
  .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*,
_slp_tree*, _slp_instance*, vec*)
  .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
0x1db5dca vect_analyze_loop_operations
  .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
0x1db885b vect_analyze_loop_2
  .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
0x1dba029 vect_analyze_loop_1
  .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
  .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
0x1e389d1 try_vectorize_loop_1
  .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
0x1e38f3d try_vectorize_loop
  .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
0x1e39230 execute
  .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298

The masks and the lens cannot be enabled simultaneously when the loop
is using partial vectors.  Thus, we need to ensure that one is disabled
before recording the other in check_load_store_for_partial_vectors.
For example, when we try to record a loop len, we need to check whether
the loop mask is disabled.

The testsuites below pass with this patch:

* The x86 bootstrap tests.
* The x86 full regression tests.
* The aarch64 full regression tests.
* The riscv full regression tests.

	PR target/114195

gcc/ChangeLog:

	* tree-vect-stmts.cc (check_load_store_for_partial_vectors): Add
	loop mask/len check before recording as they are mutually
	exclusive.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr114195-1.c: New test.
Signed-off-by: Pan Li --- .../gcc.target/riscv/rvv/base/pr114195-1.c| 15 +++ gcc/tree-vect-stmts.cc| 26 ++- 2 files changed, 35 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c new file mode 100644 index 000..b0c9d5b81b8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c @@ -0,0 +1,15 @@ +/* Test that we do not have ice when compile */ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */ + +long a, b; +extern short c[]; + +void d() { + for (int e = 0; e < 35; e += 2) { +a = ({ a < 0 ? a : 0; }); +b = ({ b < 0 ? b : 0; }); + +c[e] = 0; + } +} diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 14a3ffb5f02..624947ed271 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -1502,6 +1502,8 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, gather_scatter_info *gs_info, tree scalar_mask) { + gcc_assert (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)); + /* Invariant loads need no special support. */ if (memory_access_type == VMAT_INVARIANT) return; @@ -1521,9 +1523,17 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, internal_fn ifn = (is_load ? vect_load_lanes_supported (vectype, group_size, true) : vect_store_lanes_supported (vectype, group_size, true)); - if (ifn == IFN_MASK_LEN_LOAD_LANES || ifn == IFN_MASK_LEN_STORE_LANES) + + /* When the loop_vinfo using partial vector, we cannot enable both +the fully mask and length simultaneously. Thus, make sure the +other one is disabled when record one of them. +The same as other place for both the vect_record_loop_len and +vect_record_loop_mask. 
*/ + if ((ifn == IFN_MASK_LEN_LOAD_LANES || ifn == IFN_MASK_LEN_STORE_LANES) + && !LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)) vect_record_loop_len (loop_vinfo, lens, nvectors, vectype, 1); - else if (ifn == IFN_MASK_LOAD_LANES || ifn == IFN_MASK_STORE_LANES) + else if ((ifn == IFN_MASK_LOAD_LANES || ifn == IFN_MASK_STORE_LANES) + && !LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo)) vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype, scalar_mask); else @@ -1549,12 +1559,14 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, if (internal_gather_scatter_fn_supported_p (len_ifn, vectype, gs_info->memory_type, gs_info->offset_vectype, -
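The guard added in this v1 patch can be pictured with a small stand-alone model. This is a sketch under stated assumptions, not GCC code: `struct toy_loop_vinfo` and the `toy_*` functions are hypothetical stand-ins for loop_vec_info, vect_record_loop_len and vect_record_loop_mask, and the toy treats "fully masked" and "fully with length" simply as "at least one mask or len has been recorded".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for loop_vec_info; the fields are hypothetical.  */
struct toy_loop_vinfo
{
  unsigned lens_recorded;
  unsigned masks_recorded;
};

/* In this toy model a loop counts as fully masked (or fully with
   length) as soon as one mask (or len) has been recorded.  */
static bool
toy_fully_masked_p (const struct toy_loop_vinfo *v)
{
  return v->masks_recorded != 0;
}

static bool
toy_fully_with_length_p (const struct toy_loop_vinfo *v)
{
  return v->lens_recorded != 0;
}

/* Record a loop len only when masks are not already in use.  */
static bool
toy_record_len (struct toy_loop_vinfo *v)
{
  if (toy_fully_masked_p (v))
    return false;
  v->lens_recorded++;
  return true;
}

/* Record a loop mask only when lens are not already in use.  */
static bool
toy_record_mask (struct toy_loop_vinfo *v)
{
  if (toy_fully_with_length_p (v))
    return false;
  v->masks_recorded++;
  return true;
}

/* The property that the assert in vectorizable_store relies on.  */
static void
toy_check_exclusive (const struct toy_loop_vinfo *v)
{
  assert (!(toy_fully_masked_p (v) && toy_fully_with_length_p (v)));
}
```

Once a len has been recorded, a later `toy_record_mask` call is refused, and the other way round, so `toy_check_exclusive` never fires regardless of the order in which statements are visited.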
[PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and mask are enabled
From: Pan Li

This patch fixes one ICE in vectorizable_store when both the loop_masks
and loop_lens are enabled. The ICE looks like below when building with
"-march=rv64gcv -O3".

during GIMPLE pass: vect
test.c: In function ‘d’:
test.c:6:6: internal compiler error: in vectorizable_store, at tree-vect-stmts.cc:8691
    6 | void d() {
      |      ^
0x37a6f2f vectorizable_store
	.../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*, _slp_tree*, _slp_instance*, vec*)
	.../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
0x1db5dca vect_analyze_loop_operations
	.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
0x1db885b vect_analyze_loop_2
	.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
0x1dba029 vect_analyze_loop_1
	.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
	.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
0x1e389d1 try_vectorize_loop_1
	.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
0x1e38f3d try_vectorize_loop
	.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
0x1e39230 execute
	.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298

There are two ways to reach the vectorizable LD/ST routines: one is
analysis and the other is transform. Having both the lens and the masks
enabled is invalid during transform but valid during analysis. Given
that transform does not require cost_vec, we enable the assert only
when cost_vec is NULL.

The below test suites passed for this patch:
* The x86 bootstrap tests.
* The x86 full regression tests.
* The aarch64 full regression tests.
* The riscv full regression tests.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_store): Enable the assert
	during transform process.
	(vectorizable_load): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr114195-1.c: New test.
Signed-off-by: Pan Li
---
 .../gcc.target/riscv/rvv/base/pr114195-1.c | 15 +++
 gcc/tree-vect-stmts.cc | 18 ++
 2 files changed, 29 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
new file mode 100644
index 000..a67b847112b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
@@ -0,0 +1,15 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */
+
+long a, b;
+extern short c[];
+
+void d() {
+  for (int e = 0; e < 35; e += 2) {
+    a = ({ a < 0 ? a : 0; });
+    b = ({ b < 0 ? b : 0; });
+
+    c[e] = 0;
+  }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 14a3ffb5f02..e8617439a48 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8697,8 +8697,13 @@ vectorizable_store (vec_info *vinfo,
        ? &LOOP_VINFO_LENS (loop_vinfo)
        : NULL);
 
-  /* Shouldn't go with length-based approach if fully masked.  */
-  gcc_assert (!loop_lens || !loop_masks);
+  /* The vect_transform_stmt and vect_analyze_stmt will go here but there
+     are some differences here.  We cannot enable both the lens and masks
+     during transform but it is allowed during analysis.
+     Shouldn't go with length-based approach if fully masked.  */
+  if (cost_vec == NULL)
+    /* The cost_vec is NULL during transform.  */
+    gcc_assert ((!loop_lens || !loop_masks));
 
   /* Targets with store-lane instructions must not require explicit
      realignment.  vect_supportable_dr_alignment always returns either
@@ -10577,8 +10582,13 @@ vectorizable_load (vec_info *vinfo,
        ? &LOOP_VINFO_LENS (loop_vinfo)
        : NULL);
 
-  /* Shouldn't go with length-based approach if fully masked.  */
-  gcc_assert (!loop_lens || !loop_masks);
+  /* The vect_transform_stmt and vect_analyze_stmt will go here but there
+     are some differences here.  We cannot enable both the lens and masks
+     during transform but it is allowed during analysis.
+     Shouldn't go with length-based approach if fully masked.  */
+  if (cost_vec == NULL)
+    /* The cost_vec is NULL during transform.  */
+    gcc_assert ((!loop_lens || !loop_masks));
 
   /* Targets with store-lane instructions must not require explicit
      realignment.  vect_supportable_dr_alignment always returns either
--
2.34.1
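The analysis/transform distinction in the patch above can be illustrated with a small standalone sketch. This is not GCC code: `struct cost_vec_model`, `partial_vector_state_ok` and `check_partial_vector_state` are hypothetical names, and the only behaviour taken from the patch is that the "not both lens and masks" invariant is checked when `cost_vec` is NULL (i.e. during transform) and skipped otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vectorizer's cost vector.  Only whether
   the pointer is NULL matters for this sketch.  */
struct cost_vec_model
{
  int unused;
};

/* Return true when the (lens, masks) combination is acceptable.
   During analysis (cost_vec != NULL) both may still be recorded.
   During transform (cost_vec == NULL) at most one may be in use.  */
bool
partial_vector_state_ok (const struct cost_vec_model *cost_vec,
			 bool have_loop_lens, bool have_loop_masks)
{
  if (cost_vec == NULL)
    return !have_loop_lens || !have_loop_masks;

  return true;
}

/* Assert-style wrapper, loosely shaped like the gcc_assert in the patch.  */
void
check_partial_vector_state (const struct cost_vec_model *cost_vec,
			    bool have_loop_lens, bool have_loop_masks)
{
  assert (partial_vector_state_ok (cost_vec, have_loop_lens,
				   have_loop_masks));
}
```

With this model, the analysis-time call that triggered the ICE (both flags set, non-NULL cost vector) passes, while the same state with a NULL cost vector would still be rejected.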
[PATCH v3] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV
From: Pan Li

Update in v3:
* Add pre-defined __riscv_v_fixed_vlen when zvl.

Update in v2:
* Cleanup some unused code.
* Fix some typos in the commit log.

Original log:

This patch introduces one new gcc attribute for RVV. This attribute is
used to define fixed-length variants of existing sizeless RVV types.

This attribute is valid if and only if -mrvv-vector-bits=zvl is given.
Its only argument should be an integer constant, and its value is
determined by the LMUL and the vector register bits in zvl*b. For
example:

typedef vint32m2_t fixed_vint32m2_t __attribute__((riscv_rvv_vector_bits(128)));

The above typedef is valid with -march=rv64gc_zve64d_zvl64b
(aka 2(m2) * 64 = 128 for vint32m2_t), and will report an error with
-march=rv64gcv_zvl128b similar to below.

"error: invalid RVV vector size '128', expected size is '256' based on
LMUL of type and '-mrvv-vector-bits=zvl'"

Meanwhile, a pre-defined macro __riscv_v_fixed_vlen is introduced to
represent the fixed vlen in an RVV vector register.

For the vint*m*_t types, the below operations are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -

For the vfloat*m*_t types, the below operations are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, -

For the vbool*_t types, only the below operations are allowed; CMP and
ALU are excluded. The CMP and ALU operations on vbool*_t are not well
defined currently.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.

The vint*x*m*_t tuple types are not supported in this patch, which is
compatible with clang.

This patch passed the below testsuites.
* The riscv full regression tests.

gcc/ChangeLog:

	* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add pre-define
	macro __riscv_v_fixed_vlen when zvl.
* config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute): New static func to take care of the RVV types decorated by the attributes. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-13.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-14.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-15.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-16.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-17.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test. 
Signed-off-by: Pan Li --- gcc/config/riscv/riscv-c.cc | 3 + gcc/config/riscv/riscv.cc | 87 +- .../riscv/rvv/base/riscv_rvv_vector_bits-1.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-10.c | 53 + .../riscv/rvv/base/riscv_rvv_vector_bits-11.c | 76 .../riscv/rvv/base/riscv_rvv_vector_bits-12.c | 14 +++ .../riscv/rvv/base/riscv_rvv_vector_bits-13.c | 10 ++ .../riscv/rvv/base/riscv_rvv_vector_bits-14.c | 10 ++ .../riscv/rvv/base/riscv_rvv_vector_bits-15.c | 10 ++ .../riscv/rvv/base/riscv_rvv_vector_bits-16.c | 11 ++ .../riscv/rvv/base/riscv_rvv_vector_bits-17.c | 10 ++ .../riscv/rvv/base/riscv_rvv_vector_bits-2.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-3.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-4.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-5.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-6.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-7.c | 76 .../riscv/rvv/base/riscv_rvv_vector_bits-8.c | 75 .../riscv/rvv/base/riscv_rvv_vector_bits-9.c | 76 .../riscv/rvv/base/riscv_rvv_vector_bits.h| 108 ++ 20 files changed, 653 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c create mode 1006
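The size rule described in the commit log (2(m2) * 64 = 128) can be written down as a tiny standalone check. This is only a sketch under stated assumptions: `expected_rvv_vector_bits` and `rvv_vector_bits_arg_ok` are hypothetical helpers, not functions from the patch, and the sketch covers integer LMUL (m1, m2, m4, m8) only. Fractional LMUL and the vbool*_t types are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Expected attribute argument for a type with integer LMUL when the
   minimal vector length comes from zvl*b.  Hypothetical helper for
   illustration; fractional LMUL and vbool types are out of scope.  */
unsigned
expected_rvv_vector_bits (unsigned lmul, unsigned zvl_bits)
{
  assert (lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);

  return lmul * zvl_bits;
}

/* True when the argument given to riscv_rvv_vector_bits matches the
   size implied by the LMUL of the type and the zvl*b extension.  */
bool
rvv_vector_bits_arg_ok (unsigned requested_bits, unsigned lmul,
			unsigned zvl_bits)
{
  return requested_bits == expected_rvv_vector_bits (lmul, zvl_bits);
}
```

For the vint32m2_t example from the log, the argument 128 is accepted with zvl64b and rejected with zvl128b, where the expected size is 256.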
[PATCH v1] RISC-V: Fix some code style issue(s) in riscv-c.cc [NFC]
From: Pan Li

Notice some code style issue(s) when adding __riscv_v_fixed_vlen,
including:

* Meaningless empty line.
* Line longer than 80 chars.
* Indent with 3 space(s).
* Argument misalignment.

gcc/ChangeLog:

	* config/riscv/riscv-c.cc (riscv_ext_version_value): Fix code
	style greater than 80 chars.
	(riscv_cpu_cpp_builtins): Fix useless empty line, indent
	with 3 space(s) and argument unalignment.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-c.cc | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 3755ec0b8ef..7029ba88186 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -37,7 +37,8 @@ along with GCC; see the file COPYING3.  If not see
 static int
 riscv_ext_version_value (unsigned major, unsigned minor)
 {
-  return (major * RISCV_MAJOR_VERSION_BASE) + (minor * RISCV_MINOR_VERSION_BASE);
+  return (major * RISCV_MAJOR_VERSION_BASE)
+    + (minor * RISCV_MINOR_VERSION_BASE);
 }
 
 /* Implement TARGET_CPU_CPP_BUILTINS.  */
@@ -110,7 +111,6 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
     case CM_MEDANY:
       builtin_define ("__riscv_cmodel_medany");
       break;
-
     }
 
   if (riscv_user_wants_strict_align)
@@ -142,9 +142,9 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
                                      riscv_ext_version_value (0, 12));
     }
 
-   if (TARGET_XTHEADVECTOR)
-     builtin_define_with_int_value ("__riscv_th_v_intrinsic",
-                                     riscv_ext_version_value (0, 11));
+  if (TARGET_XTHEADVECTOR)
+    builtin_define_with_int_value ("__riscv_th_v_intrinsic",
+                                   riscv_ext_version_value (0, 11));
 
   /* Define architecture extension test macros.  */
   builtin_define_with_int_value ("__riscv_arch_test", 1);
--
2.34.1
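For readers not familiar with riscv_ext_version_value, the reformatted expression can be tried in isolation with the sketch below. The macro names come from the diff, but the numeric values (1000000 and 1000) and the function name `ext_version_value_model` are assumptions made for illustration only; the patch itself does not show them.

```c
#include <assert.h>

/* Assumed values for illustration; the diff above only shows the
   macro names, not their definitions.  */
#define RISCV_MAJOR_VERSION_BASE 1000000
#define RISCV_MINOR_VERSION_BASE 1000

/* The sketch relies on a minor step being smaller than a major step.  */
static_assert (RISCV_MINOR_VERSION_BASE < RISCV_MAJOR_VERSION_BASE,
	       "minor base must be below major base");

/* Standalone model of the helper touched by the patch, using the
   line-wrapped form of the return expression.  */
int
ext_version_value_model (unsigned major, unsigned minor)
{
  return (major * RISCV_MAJOR_VERSION_BASE)
	 + (minor * RISCV_MINOR_VERSION_BASE);
}
```

Under these assumed bases, the (0, 12) and (0, 11) arguments seen in the diff would encode to 12000 and 11000 respectively.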
[PATCH v1] RISC-V: Bugfix ICE for __attribute__((target("arch=+v")))
From: Pan Li

This patch fixes one ICE for __attribute__((target("arch=+v"))) and
similar extension(s). Given the sample code below:

void
__attribute__((target("arch=+v")))
test_2 (int *a, int *b, int *out, unsigned count)
{
  unsigned i;
  for (i = 0; i < count; i++)
    out[i] = a[i] + b[i];
}

It will ICE when built with -march=rv64gc -O3.

test.c: In function ‘test_2’:
test.c:4:1: internal compiler error: Floating point exception
    4 | {
      | ^
0x1a5891b crash_signal
	.../__RISC-V_BUILD__/../gcc/toplev.cc:319
0x7f0a7884251f ???
	./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x1f51ba4 riscv_hard_regno_nregs
	.../__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:8143
0x1967bb9 init_reg_modes_target()
	.../__RISC-V_BUILD__/../gcc/reginfo.cc:471
0x13fc029 init_emit_regs()
	.../__RISC-V_BUILD__/../gcc/emit-rtl.cc:6237
0x1a5b83d target_reinit()
	.../__RISC-V_BUILD__/../gcc/toplev.cc:1936
0x35e374d save_target_globals()
	.../__RISC-V_BUILD__/../gcc/target-globals.cc:92
0x35e381f save_target_globals_default_opts()
	.../__RISC-V_BUILD__/../gcc/target-globals.cc:122
0x1f544cc riscv_save_restore_target_globals(tree_node*)
	.../__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:9138
0x1f55c36 riscv_set_current_function
...

There are two reasons for this ICE.

1. The implied extension(s) of v are not handled, so TARGET_MIN_VLEN
   is not reinitialized and remains 0. Then size / TARGET_MIN_VLEN
   divides by zero.
2. The machine modes of the vector types will vary after the v
   extension is introduced.

This patch passed the below testsuite:
1. The riscv full regression test.

	PR target/114352

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc
	(riscv_subset_list::parse_single_ext): Add implied, combine and
	conflict check after parse single extension.
	* config/riscv/riscv.cc (riscv_set_current_function): Reinit the
	machine mode before when set cur function.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr114352-1.c: New test.
* gcc.target/riscv/rvv/base/pr114352-2.c: New test. Signed-off-by: Pan Li --- gcc/common/config/riscv/riscv-common.cc | 33 --- gcc/config/riscv/riscv.cc | 4 ++ .../gcc.target/riscv/rvv/base/pr114352-1.c| 58 +++ .../gcc.target/riscv/rvv/base/pr114352-2.c| 27 + 4 files changed, 115 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-2.c diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc index 48efef40dfd..d32bf147eca 100644 --- a/gcc/common/config/riscv/riscv-common.cc +++ b/gcc/common/config/riscv/riscv-common.cc @@ -1375,20 +1375,39 @@ riscv_subset_list::parse_single_multiletter_ext (const char *p, const char * riscv_subset_list::parse_single_ext (const char *p, bool exact_single_p) { + const char *end_of_ext; + switch (p[0]) { case 'x': - return parse_single_multiletter_ext (p, "x", "non-standard extension", - exact_single_p); + end_of_ext = parse_single_multiletter_ext (p, "x", +"non-standard extension", +exact_single_p); + break; case 'z': - return parse_single_multiletter_ext (p, "z", "sub-extension", - exact_single_p); + end_of_ext = parse_single_multiletter_ext (p, "z", "sub-extension", +exact_single_p); + break; case 's': - return parse_single_multiletter_ext (p, "s", "supervisor extension", - exact_single_p); + end_of_ext = parse_single_multiletter_ext (p, "s", "supervisor extension", +exact_single_p); + break; default: - return parse_single_std_ext (p, exact_single_p); + end_of_ext = parse_single_std_ext (p, exact_single_p); + break; } + + /* Make sure the implied or combined extension is included after add + a new std extension to subset list. For exmaple as below, + + void __attribute__((target("arch=+v"))) func () with -march=rv64gc. + + The implied zvl128b and zve64d of the std v should be included. 
*/ + handle_implied_ext (p); + handle_combine_ext (); + check_conflict_ext (); + + return end_of_ext; } /* Parsing arch string to subset list, return NULL if parsing failed. */ diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 680c4a728e9..89acb94af10 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -9474,6 +9474,10 @@ riscv_set_current_function (tree decl) cl_target_option_restore (&global_op
[PATCH v1] RISC-V: Bugfix function target attribute pollution
From: Pan Li

This patch depends on the below ICE fix.

https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647915.html

The function target attribute should apply on a per-function basis.
For example, we have 4 functions as below:

void test_1 () {}

void __attribute__((target("arch=+v"))) test_2 () {}

void __attribute__((target("arch=+zfh"))) test_3 () {}

void test_4 () {}

The scope of the target attribute should not extend beyond the function
body. That is, test_3 cannot have the 'v' extension, and test_4 can have
neither the 'v' nor the 'zfh' extension.

Unfortunately, for now test_4 is able to leverage the 'v' and the 'zfh'
extensions, which is incorrect. This patch fixes the sticking attribute
by introducing the command-line subset list (cmdline_subset_list). In
parse_arch, we always clone from the cmdline_subset_list instead of the
current_subset_list.

Meanwhile, we correct the printed arch information, like below.

.option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zbb1p0

The riscv_declare_function_name hook always runs after the
riscv_process_target_attr hook. Thus, we introduce one hash_map to
record the 1:1 mapping from fn_decl to its subset_list in advance.
Later, riscv_declare_function_name is able to get the right arch
information.

The below tests passed for this patch:
* The riscv full regression test.

	PR target/114352

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (struct riscv_func_target_info):
	New struct for func decl and target name.
	(struct riscv_func_target_hasher): New hasher for hash table mapping
	from the fn_decl to fn_target_name.
	(riscv_func_decl_hash): New func to compute the hash for fn_decl.
	(riscv_func_target_hasher::hash): New func to impl hash interface.
	(riscv_func_target_hasher::equal): New func to impl equal interface.
	(riscv_cmdline_subset_list): New static var for cmdline subset list.
	(riscv_func_target_table_lazy_init): New func to lazy init the func
	target hash table.
(riscv_func_target_get): New func to get target name from hash table. (riscv_func_target_put): New func to put target name into hash table. (riscv_func_target_remove_and_destory): New func to remove target info from the hash table and destory it. (riscv_parse_arch_string): Set the static var cmdline_subset_list. * config/riscv/riscv-subset.h (riscv_cmdline_subset_list): New static var for cmdline subset list. (riscv_func_target_get): New func decl. (riscv_func_target_put): Ditto. (riscv_func_target_remove_and_destory): Ditto. * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Take cmdline_subset_list instead of current_subset_list when clone. (riscv_process_target_attr): Record the func target info to hash table. (riscv_option_valid_attribute_p): Add new arg tree fndel. * config/riscv/riscv.cc (riscv_declare_function_name): Consume the func target info and print the arch message. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr114352-3.c: New test. Signed-off-by: Pan Li --- gcc/common/config/riscv/riscv-common.cc | 105 +++- gcc/config/riscv/riscv-subset.h | 4 + gcc/config/riscv/riscv-target-attr.cc | 18 ++- gcc/config/riscv/riscv.cc | 7 +- .../gcc.target/riscv/rvv/base/pr114352-3.c| 113 ++ 5 files changed, 240 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-3.c diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc index d32bf147eca..76ec9bf846c 100644 --- a/gcc/common/config/riscv/riscv-common.cc +++ b/gcc/common/config/riscv/riscv-common.cc @@ -425,11 +425,108 @@ bool riscv_subset_list::parse_failed = false; static riscv_subset_list *current_subset_list = NULL; +static riscv_subset_list *cmdline_subset_list = NULL; + +struct riscv_func_target_info +{ + tree fn_decl; + std::string fn_target_name; + + riscv_func_target_info (const tree &decl, const std::string &target_name) +: fn_decl (decl), fn_target_name (target_name) + { + } +}; + +struct 
riscv_func_target_hasher : nofree_ptr_hash +{ + typedef tree compare_type; + + static hashval_t hash (value_type); + static bool equal (value_type, const compare_type &); +}; + +static hash_table *func_target_table = NULL; + +static inline hashval_t riscv_func_decl_hash (tree fn_decl) +{ + inchash::hash h; + + h.add_ptr (fn_decl); + + return h.end (); +} + +inline hashval_t +riscv_func_target_hasher::hash (value_type value) +{ + return riscv_func_decl_hash (value->fn_decl); +} + +inline bool +riscv_func_target_hasher::equal (value_type value, const compare_type &key) +{ + return value->fn_decl == key; +} + const riscv_subset_list *riscv_current_subset_list () { return c
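The "pollution" described in the commit log of this patch, and the fix of cloning from the command-line state, can be modelled with plain bit sets. Everything in this sketch is hypothetical (`ext_set`, `EXT_V`, `EXT_ZFH`, `enter_function_sticky`, `enter_function_fixed`); it only mirrors the idea of cmdline_subset_list versus current_subset_list and is not how riscv_subset_list is implemented.

```c
#include <assert.h>

/* Hypothetical extension bits, for illustration only.  */
enum
{
  EXT_V = 1u << 0,
  EXT_ZFH = 1u << 1
};

typedef unsigned ext_set;

static ext_set cmdline_exts; /* Plays the role of cmdline_subset_list.  */
static ext_set current_exts; /* Plays the role of current_subset_list.  */

/* Record what -march provided.  */
void
set_cmdline_exts (ext_set exts)
{
  cmdline_exts = exts;
  current_exts = exts;
}

/* Old behaviour: the attribute is applied on top of whatever the
   previous function left behind, so extensions stick.  */
ext_set
enter_function_sticky (ext_set attr_exts)
{
  current_exts |= attr_exts;
  return current_exts;
}

/* Fixed behaviour: always start from the command-line set.  */
ext_set
enter_function_fixed (ext_set attr_exts)
{
  current_exts = cmdline_exts | attr_exts;
  return current_exts;
}

/* Walk test_1 .. test_4 from the commit log with the fixed behaviour.  */
void
demo_per_function_scope (void)
{
  set_cmdline_exts (0);
  assert (enter_function_fixed (0) == 0);             /* test_1 */
  assert (enter_function_fixed (EXT_V) == EXT_V);     /* test_2 */
  assert (enter_function_fixed (EXT_ZFH) == EXT_ZFH); /* test_3 */
  assert (enter_function_fixed (0) == 0);             /* test_4 */
}
```

With the sticky variant, the same sequence leaves test_4 with both bits set, which corresponds to the incorrect behaviour the patch describes.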
[PATCH v2] RISC-V: Bugfix ICE for __attribute__((target("arch=+v")))
From: Pan Li

This patch fixes one ICE for __attribute__((target("arch=+v"))) and
similar extension(s). Given the sample code below:

void
__attribute__((target("arch=+v")))
test_2 (int *a, int *b, int *out, unsigned count)
{
  unsigned i;
  for (i = 0; i < count; i++)
    out[i] = a[i] + b[i];
}

It will ICE when built with -march=rv64gc -O3.

test.c: In function ‘test_2’:
test.c:4:1: internal compiler error: Floating point exception
    4 | {
      | ^
0x1a5891b crash_signal
	.../__RISC-V_BUILD__/../gcc/toplev.cc:319
0x7f0a7884251f ???
	./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x1f51ba4 riscv_hard_regno_nregs
	.../__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:8143
0x1967bb9 init_reg_modes_target()
	.../__RISC-V_BUILD__/../gcc/reginfo.cc:471
0x13fc029 init_emit_regs()
	.../__RISC-V_BUILD__/../gcc/emit-rtl.cc:6237
0x1a5b83d target_reinit()
	.../__RISC-V_BUILD__/../gcc/toplev.cc:1936
0x35e374d save_target_globals()
	.../__RISC-V_BUILD__/../gcc/target-globals.cc:92
0x35e381f save_target_globals_default_opts()
	.../__RISC-V_BUILD__/../gcc/target-globals.cc:122
0x1f544cc riscv_save_restore_target_globals(tree_node*)
	.../__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:9138
0x1f55c36 riscv_set_current_function
...

There are two reasons for this ICE.

1. The implied extension(s) of v are not handled, so TARGET_MIN_VLEN
   is not reinitialized and remains 0. Then size / TARGET_MIN_VLEN
   divides by zero.
2. The machine modes of the vector types will vary after the v
   extension is introduced.

This patch passed the below testsuite:
1. The riscv full regression test.

	PR target/114352

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (riscv_subset_list::parse):
	Replace implied, combine and check to func finalize.
	(riscv_subset_list::finalize): New func impl to take care of
	implied, combine ext and related checks.
	* config/riscv/riscv-subset.h: Add func decl for finalize.
* config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Finalize the ext before return succeed. * config/riscv/riscv.cc (riscv_set_current_function): Reinit the machine mode before when set cur function. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr114352-1.c: New test. * gcc.target/riscv/rvv/base/pr114352-2.c: New test. Signed-off-by: Pan Li --- gcc/common/config/riscv/riscv-common.cc | 31 ++ gcc/config/riscv/riscv-subset.h | 2 + gcc/config/riscv/riscv-target-attr.cc | 2 + gcc/config/riscv/riscv.cc | 4 ++ .../gcc.target/riscv/rvv/base/pr114352-1.c| 58 +++ .../gcc.target/riscv/rvv/base/pr114352-2.c| 27 + 6 files changed, 114 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-2.c diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc index 440127a2af0..15d44245b3c 100644 --- a/gcc/common/config/riscv/riscv-common.cc +++ b/gcc/common/config/riscv/riscv-common.cc @@ -1428,16 +1428,7 @@ riscv_subset_list::parse (const char *arch, location_t loc) if (p == NULL) goto fail; - for (itr = subset_list->m_head; itr != NULL; itr = itr->next) -{ - subset_list->handle_implied_ext (itr->name.c_str ()); -} - - /* Make sure all implied extensions are included. */ - gcc_assert (subset_list->check_implied_ext ()); - - subset_list->handle_combine_ext (); - subset_list->check_conflict_ext (); + subset_list->finalize (); return subset_list; @@ -1467,6 +1458,26 @@ riscv_subset_list::set_loc (location_t loc) m_loc = loc; } +/* Make sure the implied or combined extension is included after add + a new std extension to subset list or likewise. For exmaple as below, + + void __attribute__((target("arch=+v"))) func () with -march=rv64gc. + + The implied zvl128b and zve64d of the std v should be included. 
*/ +void +riscv_subset_list::finalize () +{ + riscv_subset_t *subset; + + for (subset = m_head; subset != NULL; subset = subset->next) +handle_implied_ext (subset->name.c_str ()); + + gcc_assert (check_implied_ext ()); + + handle_combine_ext (); + check_conflict_ext (); +} + /* Return the current arch string. */ std::string diff --git a/gcc/config/riscv/riscv-subset.h b/gcc/config/riscv/riscv-subset.h index ae849e2a302..ec979040e8c 100644 --- a/gcc/config/riscv/riscv-subset.h +++ b/gcc/config/riscv/riscv-subset.h @@ -105,6 +105,8 @@ public: int match_score (riscv_subset_list *) const; void set_loc (location_t); + + void finalize (); }; extern const riscv_subset_list *riscv_current_subset_list (void); diff --git a/gcc/config/riscv/riscv-target-attr.cc b/gcc/config/riscv/riscv-target-attr.cc index
[PATCH v4] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV
From: Pan Li

This patch introduces one new gcc attribute for RVV. This attribute is
used to define fixed-length variants of existing sizeless RVV types.

This attribute is valid if and only if -mrvv-vector-bits=zvl is given.
Its only argument should be an integer constant, and its value is
determined by the LMUL and the vector register bits in zvl*b. For
example:

typedef vint32m2_t fixed_vint32m2_t __attribute__((riscv_rvv_vector_bits(128)));

The above typedef is valid with -march=rv64gc_zve64d_zvl64b
(aka 2(m2) * 64 = 128 for vint32m2_t), and will report an error with
-march=rv64gcv_zvl128b similar to below.

"error: invalid RVV vector size '128', expected size is '256' based on
LMUL of type and '-mrvv-vector-bits=zvl'"

Meanwhile, a pre-defined macro __riscv_v_fixed_vlen is introduced to
represent the fixed vlen in an RVV vector register.

For the vint*m*_t types, the below operations are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -

The CMP will return vint*m*_t, the same as aarch64 sve. For example:

typedef vint32m1_t fixed_vint32m1_t __attribute__((riscv_rvv_vector_bits(128)));

fixed_vint32m1_t less_than (fixed_vint32m1_t a, fixed_vint32m1_t b)
{
  return a < b;
}

For the vfloat*m*_t types, the below operations are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, -

The CMP will return vfloat*m*_t, the same as aarch64 sve. For example:

typedef vfloat32m1_t fixed_vfloat32m1_t __attribute__((riscv_rvv_vector_bits(128)));

fixed_vfloat32m1_t less_than (fixed_vfloat32m1_t a, fixed_vfloat32m1_t b)
{
  return a < b;
}

For the vbool*_t types, only the below operations are allowed; CMP and
ALU are excluded. The CMP and ALU operations on vbool*_t are not well
defined currently.
* The sizeof.
* The global variable(s).
* The element of union and struct. * The cast to other equalities. For the vint*x*m*_t tuple types are not suppored in this patch which is compatible with clang. This patch passed the below testsuites. * The riscv fully regression tests. gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add pre-define macro __riscv_v_fixed_vlen when zvl. * config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute): New static func to take care of the RVV types decorated by the attributes. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-13.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-14.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-15.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-16.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-17.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-18.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test. 
Signed-off-by: Pan Li --- gcc/config/riscv/riscv-c.cc | 3 + gcc/config/riscv/riscv.cc | 87 +- .../riscv/rvv/base/riscv_rvv_vector_bits-1.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-10.c | 53 + .../riscv/rvv/base/riscv_rvv_vector_bits-11.c | 76 .../riscv/rvv/base/riscv_rvv_vector_bits-12.c | 14 +++ .../riscv/rvv/base/riscv_rvv_vector_bits-13.c | 10 ++ .../riscv/rvv/base/riscv_rvv_vector_bits-14.c | 10 ++ .../riscv/rvv/base/riscv_rvv_vector_bits-15.c | 10 ++ .../riscv/rvv/base/riscv_rvv_vector_bits-16.c | 11 ++ .../riscv/rvv/base/riscv_rvv_vector_bits-17.c | 10 ++ .../riscv/rvv/base/riscv_rvv_vector_bits-18.c | 45 .../riscv/rvv/base/riscv_rvv_vector_bits-2.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-3.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-4.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-5.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-6.c | 6 + .../riscv/rvv/base/riscv_rvv_vector_bits-7
[PATCH v1] RISC-V: Allow RVV intrinsic when function target("arch=+v")
From: Pan Li

This patch allows the RVV intrinsics when a function is attributed with
target("arch=+v") and built with rv64gc. For example:

vint32m1_t
__attribute__((target("arch=+v")))
test_1 (vint32m1_t a, vint32m1_t b, size_t vl)
{
  return __riscv_vadd_vv_i32m1 (a, b, vl);
}

When built with -march=rv64gc -mabi=lp64d -O3, we will have asm like
below:

test_1:
  .option push
  .option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_\
    zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0
  vsetvli zero,a0,e32,m1,ta,ma
  vadd.vv v8,v8,v9
  ret

riscv_vector.h must be included when using the intrinsic type(s) and
API(s), and the scope of this attribute should not exceed the function
body. Meanwhile, to make the RVV types and API(s) available for this
attribute, including riscv_vector.h no longer reports an error if v is
not present in march.

The below tests passed for this patch:
* The riscv full regression test.

gcc/ChangeLog:

	* config/riscv/riscv-c.cc (riscv_pragma_intrinsic): Remove error
	when V is disabled and init the RVV types and intrinsic APIs.
	* config/riscv/riscv-vector-builtins.cc (expand_builtin): Report
	error if V ext is disabled.
	* config/riscv/riscv.cc (riscv_return_value_is_vector_type_p):
	Ditto.
	(riscv_arguments_is_vector_type_p): Ditto.
	(riscv_vector_cc_function_p): Ditto.
	* config/riscv/riscv_vector.h: Remove error if V is disabled.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pragma-1.c: Remove.
	* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-1.c: New test.
	* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-2.c: New test.
	* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c: New test.
	* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-4.c: New test.
	* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-5.c: New test.
	* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-6.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv-c.cc | 18 +++ gcc/config/riscv/riscv-vector-builtins.cc | 5 gcc/config/riscv/riscv.cc | 30 --- gcc/config/riscv/riscv_vector.h | 4 --- .../gcc.target/riscv/rvv/base/pragma-1.c | 4 --- .../target_attribute_v_with_intrinsic-1.c | 5 .../target_attribute_v_with_intrinsic-2.c | 18 +++ .../target_attribute_v_with_intrinsic-3.c | 13 .../target_attribute_v_with_intrinsic-4.c | 10 +++ .../target_attribute_v_with_intrinsic-5.c | 12 .../target_attribute_v_with_intrinsic-6.c | 12 .../target_attribute_v_with_intrinsic-7.c | 9 ++ .../target_attribute_v_with_intrinsic-8.c | 23 ++ 13 files changed, 145 insertions(+), 18 deletions(-) delete mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pragma-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc index edb866d51e4..01314037461 100644 --- a/gcc/config/riscv/riscv-c.cc +++ b/gcc/config/riscv/riscv-c.cc @@ -201,14 +201,20 @@ riscv_pragma_intrinsic (cpp_reader *) if (strcmp (name, "vector") == 0 || strcmp (name, "xtheadvector") == 0) { - if 
(!TARGET_VECTOR) + if (TARGET_VECTOR) + riscv_vector::handle_pragma_vector (); + else /* Indicates riscv_vector.h is included but v is missing in arch */ { - error ("%<#pragma riscv intrinsic%> option %qs needs 'V' or " -"'XTHEADVECTOR' extension enabled", -name); - return; + /* To make the the rvv types and intrinsic API available for the +target("arch=+v") attribute, we need to temporally enable the +TARGET_VECTOR, and disable it after all initialized. */ + target_flags |= MASK_VECTOR; + + riscv_vector::init_builtins ();
[PATCH v1] RISC-V: Allow RVV intrinsic for more function target
From: Pan Li

Previously, we allowed target("arch=+v") for a function with an rv64gc build. This patch supports more arch options as below:

* zve32x
* zve32f
* zve64x
* zve64f
* zve64d
* zvfhmin
* zvfh

For example, given the sample code below:

vfloat32m1_t
__attribute__((target("arch=+zve64f")))
test_9 (vfloat32m1_t a, vfloat32m1_t b, size_t vl)
{
  return __riscv_vfadd_vv_f32m1 (a, b, vl);
}

It will generate the asm code below when built with -O3 -march=rv64gc:

test_9:
  vsetvli zero,a0,e32,m1,ta,ma
  vfadd.vv v8,v8,v9
  ret

Meanwhile, this patch introduces more error handling for the target attribute. Using arch=+zve32x with vfloat32m1_t produces the error message "'vfloat32m1_t' requires the zve32f, zve64f or zve64d ISA extension", and using arch=+zve32f with vfloat16m1_t produces the error message "'vfloat16m1_t' requires the zvfhmin or zvfh ISA extension".

The below tests are passed for this patch:
* The riscv full regression test.

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_pragma_intrinsic): Add INT and FP vector element flags, invoke override option and mode adjust.
* config/riscv/riscv-protos.h (riscv_option_override): New extern func decl.
* config/riscv/riscv-vector-builtins.cc (expand_builtin): Return target rtx after error_at.
* config/riscv/riscv.cc (riscv_vector_int_type_p): New predicate func to tell whether a tree type is integer or not.
(riscv_vector_float_type_p): New predicate func to tell whether a tree type is float or not.
(riscv_vector_element_bitsize): New func to get the element bitsize of a vector tree type.
(riscv_validate_vector_type): New func to validate that the tree type is valid for the given flags.
(riscv_return_value_is_vector_type_p): Leverage the func riscv_validate_vector_type to do the tree type validation.
(riscv_arguments_is_vector_type_p): Ditto.
(riscv_override_options_internal): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-10.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-11.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-12.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-13.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-14.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-15.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-16.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-17.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-18.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-19.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-20.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-21.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-22.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-23.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-24.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-25.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-26.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-27.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-28.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-29.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-9.c: New test. 
Signed-off-by: Pan Li --- gcc/config/riscv/riscv-c.cc | 30 +- gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv-vector-builtins.cc | 7 +- gcc/config/riscv/riscv.cc | 101 -- .../target_attribute_v_with_intrinsic-10.c| 12 +++ .../target_attribute_v_with_intrinsic-11.c| 26 + .../target_attribute_v_with_intrinsic-12.c| 33 ++ .../target_attribute_v_with_intrinsic-13.c| 33 ++ .../target_attribute_v_with_intrinsic-14.c| 40 +++ .../target_attribute_v_with_intrinsic-15.c| 47 .../target_attribute_v_with_intrinsic-16.c| 12 +++ .../target_attribute_v_with_intrinsic-17.c| 13 +++ .../target_attribute_v_with_intrinsic-18.c| 13 +++ .../target_attribute_v_with_intrinsic-19.c| 13 +++ .../target_attribute_v_with_intrinsic-20.c| 13 +++ .../target_attribute_v_with_intrinsic-21.c| 13 +++ .../target_attribute_v_with_intrinsic-22.c| 13 +++ .../target_attribute_v_with_intrinsic-23.c| 13 +++ .../target_attribute_v_with_intrinsic-24.c
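The two validation rules quoted in the commit message above can be modelled outside the compiler. The following is only a minimal sketch in plain C, not the GCC implementation: the names `ext_flags`, `EXT_*`, `elem_kind`, `missing_ext_for` and `check_quoted_examples` are hypothetical and do not exist in GCC, where the real check is done by riscv_validate_vector_type on tree types. Only the float32 and float16 rules stated in the message are encoded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit flags, one per extension named in the patch.  */
enum ext_flags
{
  EXT_ZVE32X = 1 << 0,
  EXT_ZVE32F = 1 << 1,
  EXT_ZVE64X = 1 << 2,
  EXT_ZVE64F = 1 << 3,
  EXT_ZVE64D = 1 << 4,
  EXT_ZVFHMIN = 1 << 5,
  EXT_ZVFH = 1 << 6
};

/* Hypothetical element kinds; only the two cases from the message.  */
enum elem_kind { ELEM_FLOAT16, ELEM_FLOAT32 };

/* Return NULL when KIND is usable with FLAGS, otherwise the list of
   extensions that the quoted error message asks for.  */
const char *
missing_ext_for (enum elem_kind kind, unsigned flags)
{
  switch (kind)
    {
    case ELEM_FLOAT32:
      if (flags & (EXT_ZVE32F | EXT_ZVE64F | EXT_ZVE64D))
        return NULL;
      return "zve32f, zve64f or zve64d";
    case ELEM_FLOAT16:
      if (flags & (EXT_ZVFHMIN | EXT_ZVFH))
        return NULL;
      return "zvfhmin or zvfh";
    }
  return NULL;
}

/* The two failing combinations from the commit message, plus the
   passing arch=+zve64f case used by test_9.  */
void
check_quoted_examples (void)
{
  assert (strcmp (missing_ext_for (ELEM_FLOAT32, EXT_ZVE32X),
                  "zve32f, zve64f or zve64d") == 0);
  assert (strcmp (missing_ext_for (ELEM_FLOAT16, EXT_ZVE32F),
                  "zvfhmin or zvfh") == 0);
  assert (missing_ext_for (ELEM_FLOAT32, EXT_ZVE64F) == NULL);
}
```

The sketch checks each flag explicitly rather than relying on implied extensions, which keeps it aligned with the wording of the error messages.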
[PATCH] RISC-V: Fix misspelled term builtin in error message
From: Pan Li This patch would like to fix below misspelled term in error message. ../../gcc/config/riscv/riscv-vector-builtins.cc:4592:16: error: misspelled term 'builtin function' in format; use 'built-in function' instead [-Werror=format-diag] 4592 | "builtin function %qE requires the V ISA extension", exp); The below tests are passed for this patch. * The riscv regression test on rvv.exp and riscv.exp. gcc/ChangeLog: * config/riscv/riscv-vector-builtins.cc (expand_builtin): Take the term built-in over builtin. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: Adjust test dg-error. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: Ditto. Signed-off-by: Pan Li --- gcc/config/riscv/riscv-vector-builtins.cc | 2 +- .../riscv/rvv/base/target_attribute_v_with_intrinsic-7.c| 2 +- .../riscv/rvv/base/target_attribute_v_with_intrinsic-8.c| 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc index e07373d8b57..db9246eed2d 100644 --- a/gcc/config/riscv/riscv-vector-builtins.cc +++ b/gcc/config/riscv/riscv-vector-builtins.cc @@ -4589,7 +4589,7 @@ expand_builtin (unsigned int code, tree exp, rtx target) if (!TARGET_VECTOR) error_at (EXPR_LOCATION (exp), - "builtin function %qE requires the V ISA extension", exp); + "built-in function %qE requires the V ISA extension", exp); return function_expander (rfn.instance, rfn.decl, exp, target).expand (); } diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c index 520b2e59fae..a4cd67f4f95 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c @@ -5,5 +5,5 @@ size_t test_1 (size_t vl) { - return __riscv_vsetvl_e8m4 (vl); /* { dg-error {builtin 
function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */ + return __riscv_vsetvl_e8m4 (vl); /* { dg-error {built-in function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */ } diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c index 9032d9d0b43..06ed9a9eddc 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c @@ -19,5 +19,5 @@ test_2 () size_t test_3 (size_t vl) { - return __riscv_vsetvl_e8m4 (vl); /* { dg-error {builtin function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */ + return __riscv_vsetvl_e8m4 (vl); /* { dg-error {built-in function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */ } -- 2.34.1
[PATCH] RISC-V: Fix one unused variable in riscv_subset_list::parse
From: Pan Li

This patch fixes one unused variable as below:

../../gcc/common/config/riscv/riscv-common.cc: In static member function 'static riscv_subset_list* riscv_subset_list::parse(const char*, location_t)':
../../gcc/common/config/riscv/riscv-common.cc:1501:19: error: unused variable 'itr' [-Werror=unused-variable]
 1501 |   riscv_subset_t *itr;

The code that consumed the variable was removed previously, but the declaration itself was left behind. Thus, we have an unused variable here.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_subset_list::parse): Remove unused var decl.

Signed-off-by: Pan Li
---
 gcc/common/config/riscv/riscv-common.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 7095f303cbb..43b7549e3ec 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1498,7 +1498,6 @@ riscv_subset_list::parse (const char *arch, location_t loc)
     return NULL;
 
   riscv_subset_list *subset_list = new riscv_subset_list (arch, loc);
-  riscv_subset_t *itr;
   const char *p = arch;
   p = subset_list->parse_base_ext (p);
   if (p == NULL)
-- 
2.34.1
[PATCH v1] Internal-fn: Introduce new internal function SAT_ADD
From: Pan Li

This patch adds the middle-end representation for the saturating add, i.e. the result of the add is set to the max value of the type on overflow. It matches a pattern similar to the one below.

SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))

Taking uint8_t as an example, we will have:

* SAT_ADD (1, 254) => 255.
* SAT_ADD (1, 255) => 255.
* SAT_ADD (2, 255) => 255.
* SAT_ADD (255, 255) => 255.

The patch also implements SAT_ADD in the riscv backend as a sample for both scalar and vector. Given the example below:

uint64_t sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
}

Before this patch:

uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  long unsigned int _1;
  _Bool _2;
  long unsigned int _3;
  long unsigned int _4;
  uint64_t _7;
  long unsigned int _10;
  __complex__ long unsigned int _11;

  ;; basic block 2, loop depth 0
  ;; pred: ENTRY
  _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
  _1 = REALPART_EXPR <_11>;
  _10 = IMAGPART_EXPR <_11>;
  _2 = _10 != 0;
  _3 = (long unsigned int) _2;
  _4 = -_3;
  _7 = _1 | _4;
  return _7;
  ;; succ: EXIT
}

After this patch:

uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  uint64_t _7;

  ;; basic block 2, loop depth 0
  ;; pred: ENTRY
  _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
  return _7;
  ;; succ: EXIT
}

For vectorization, we leverage the existing vect pattern recog to find the pattern similar to the scalar one and let the vectorizer perform the rest for the standard name usadd<mode>3 in vector mode. The riscv vector backend has the insn "Vector Single-Width Saturating Add and Subtract", which can be leveraged when expanding usadd<mode>3 in vector mode. For example:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
    out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
}

Before this patch:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]);
  ivtmp_58 = _80 * 8;
  vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0);
  vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0);
  vect__7.11_66 = vect__4.7_61 + vect__6.10_65;
  mask__8.12_67 = vect__4.7_61 > vect__7.11_66;
  vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, vect__7.11_66);
  .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72);
  vectp_x.5_60 = vectp_x.5_59 + ivtmp_58;
  vectp_y.8_64 = vectp_y.8_63 + ivtmp_58;
  vectp_out.16_75 = vectp_out.16_74 + ivtmp_58;
  ivtmp_79 = ivtmp_78 - _80;
  ...
}

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v0,0(a1)
  vle64.v v1,0(a2)
  slli a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vadd.vv v1,v0,v1
  vmsgtu.vv v0,v0,v1
  vmerge.vim v1,v1,-1,v0
  vse64.v v1,0(a0)
  ...

After this patch:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]);
  ivtmp_46 = _62 * 8;
  vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0);
  vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0);
  vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53);
  .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54);
  ...
}

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v1,0(a1)
  vle64.v v2,0(a2)
  slli a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vsaddu.vv v1,v1,v2
  vse64.v v1,0(a0)
  ...

To limit the patch size for review, only the unsigned version of usadd<mode>3 is involved here. The signed version will be covered in the follow-up patch(es).

The below test suites are passed for this patch.
* The riscv full regression tests.
* The aarch64 full regression tests.
* The x86 full regression tests.

PR target/51492
PR target/112600

gcc/ChangeLog:

* config/riscv/autovec.md (usadd<mode>3): New pattern expand for unsigned SAT_ADD vector.
* config/riscv/riscv-protos.h (riscv_expand_usadd): New func decl to expand usadd<mode>3 pattern.
(expand_vec_usadd): Ditto but for vector.
* config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to emit the vsadd insn.
(expand_vec_usadd): New func impl to expand usadd<mode>3 for vector.
* config/riscv/riscv.cc (riscv_expand_usadd): New func impl to expand usadd<mode>3 for scalar.
* config/riscv/riscv.md (usadd<mode>3): New pattern expand for unsigned SAT_ADD scalar.
* config/riscv/vector.md: Allow VLS mode for vsaddu.
* internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD.
* internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD.
* match.pd: Add unsigned SAT_ADD match and simplify.
* optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd.
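The branchless form matched by this patch can be tried on its own in plain C. The following is a minimal sketch for uint8_t, with the intermediate casts written out because C promotes uint8_t operands to int; the names sat_add_u8 and sat_add_u8_examples are made up for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Branchless unsigned saturating add for uint8_t, following
   SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)).  */
uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);     /* Wraps modulo 256.  */
  uint8_t mask = (uint8_t) -(sum < x); /* 0xff on overflow, else 0.  */
  return (uint8_t) (sum | mask);
}

/* The four uint8_t examples listed in the commit message.  */
void
sat_add_u8_examples (void)
{
  assert (sat_add_u8 (1, 254) == 255);
  assert (sat_add_u8 (1, 255) == 255);
  assert (sat_add_u8 (2, 255) == 255);
  assert (sat_add_u8 (255, 255) == 255);
}
```

The wrapped sum is smaller than x exactly when the addition overflowed, so the comparison yields the overflow bit and negating it produces an all-ones mask that forces the result to the max value.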
[PATCH v2] Internal-fn: Introduce new internal function SAT_ADD
From: Pan Li

Update in v2:
* Fix one failure for x86 bootstrap.

Original log:

This patch adds the middle-end representation for the saturating add, i.e. the result of the add is set to the max value of the type on overflow. It matches a pattern similar to the one below.

SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))

Taking uint8_t as an example, we will have:

* SAT_ADD (1, 254) => 255.
* SAT_ADD (1, 255) => 255.
* SAT_ADD (2, 255) => 255.
* SAT_ADD (255, 255) => 255.

The patch also implements SAT_ADD in the riscv backend as a sample for both scalar and vector. Given the example below:

uint64_t sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
}

Before this patch:

uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  long unsigned int _1;
  _Bool _2;
  long unsigned int _3;
  long unsigned int _4;
  uint64_t _7;
  long unsigned int _10;
  __complex__ long unsigned int _11;

  ;; basic block 2, loop depth 0
  ;; pred: ENTRY
  _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
  _1 = REALPART_EXPR <_11>;
  _10 = IMAGPART_EXPR <_11>;
  _2 = _10 != 0;
  _3 = (long unsigned int) _2;
  _4 = -_3;
  _7 = _1 | _4;
  return _7;
  ;; succ: EXIT
}

After this patch:

uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  uint64_t _7;

  ;; basic block 2, loop depth 0
  ;; pred: ENTRY
  _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
  return _7;
  ;; succ: EXIT
}

For vectorization, we leverage the existing vect pattern recog to find the pattern similar to the scalar one and let the vectorizer perform the rest for the standard name usadd<mode>3 in vector mode. The riscv vector backend has the insn "Vector Single-Width Saturating Add and Subtract", which can be leveraged when expanding usadd<mode>3 in vector mode. For example:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
    out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
}

Before this patch:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]);
  ivtmp_58 = _80 * 8;
  vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0);
  vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0);
  vect__7.11_66 = vect__4.7_61 + vect__6.10_65;
  mask__8.12_67 = vect__4.7_61 > vect__7.11_66;
  vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, vect__7.11_66);
  .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72);
  vectp_x.5_60 = vectp_x.5_59 + ivtmp_58;
  vectp_y.8_64 = vectp_y.8_63 + ivtmp_58;
  vectp_out.16_75 = vectp_out.16_74 + ivtmp_58;
  ivtmp_79 = ivtmp_78 - _80;
  ...
}

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v0,0(a1)
  vle64.v v1,0(a2)
  slli a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vadd.vv v1,v0,v1
  vmsgtu.vv v0,v0,v1
  vmerge.vim v1,v1,-1,v0
  vse64.v v1,0(a0)
  ...

After this patch:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]);
  ivtmp_46 = _62 * 8;
  vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0);
  vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0);
  vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53);
  .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54);
  ...
}

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v1,0(a1)
  vle64.v v2,0(a2)
  slli a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vsaddu.vv v1,v1,v2
  vse64.v v1,0(a0)
  ...

To limit the patch size for review, only the unsigned version of usadd<mode>3 is involved here. The signed version will be covered in the follow-up patch(es).

The below test suites are passed for this patch.
* The riscv full regression tests.
* The aarch64 full regression tests.
* The x86 bootstrap tests.
* The x86 full regression tests.

PR target/51492
PR target/112600

gcc/ChangeLog:

* config/riscv/autovec.md (usadd<mode>3): New pattern expand for unsigned SAT_ADD vector.
* config/riscv/riscv-protos.h (riscv_expand_usadd): New func decl to expand usadd<mode>3 pattern.
(expand_vec_usadd): Ditto but for vector.
* config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to emit the vsadd insn.
(expand_vec_usadd): New func impl to expand usadd<mode>3 for vector.
* config/riscv/riscv.cc (riscv_expand_usadd): New func impl to expand usadd<mode>3 for scalar.
* config/riscv/riscv.md (usadd<mode>3): New pattern expand for unsigned SAT_ADD scalar.
* config/riscv/vector.md: Allow VLS mode for vsaddu.
* internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD.
* internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD.
* match.pd: Add unsigned SAT_ADD matc
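As a sanity check of the 64-bit pattern in the log above, the branchless form can be compared against a reference built on __builtin_add_overflow, the GCC/Clang builtin that corresponds to the .ADD_OVERFLOW call seen in the "before" dump. This is only a sketch: sat_add_u64_ref is a made-up helper name, and the loop is plain C that the compiler may or may not vectorize depending on target and flags.

```c
#include <assert.h>
#include <stdint.h>

/* The pattern from the commit message.  */
uint64_t
sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (-(uint64_t) ((uint64_t) (x + y) < x));
}

/* Hypothetical reference: saturate explicitly when the builtin
   reports an overflow.  */
uint64_t
sat_add_u64_ref (uint64_t x, uint64_t y)
{
  uint64_t sum;
  return __builtin_add_overflow (x, y, &sum) ? UINT64_MAX : sum;
}

/* Element-wise form, matching the loop in the commit message.  */
void
vec_sat_add_u64 (uint64_t *out, const uint64_t *x, const uint64_t *y,
                 unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = sat_add_u64 (x[i], y[i]);
}
```

Both helpers should agree for every input; the branchless one is the shape that the new match.pd rule is described as recognizing.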
[PATCH v1] RISC-V: Refine the error msg for RVV intrinsic required ext
From: Pan Li

The RVV intrinsic APIs require various extensions, which may come from either march or the target attribute. Currently the error message is similar to the one below:

built-in function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension

However, it is not accurate as we have many additional sub extensions besides the v extension. For example, zvbb, zvbk, zvbc ... etc. This patch refines the error message with a friendly hint for the required extension. For example as below:

vuint64m1_t
__attribute__((target("arch=+v")))
test_1 (vuint64m1_t op_1, vuint64m1_t op_2, size_t vl)
{
  return __riscv_vclmul_vv_u64m1 (op_1, op_2, vl);
}

When compiled with march=rv64gc and target arch=+v, we will have the error message below:

error: built-in function '__riscv_vclmul_vv_u64m1(op_1, op_2, vl)' requires the 'zvbc' ISA extension

Then the end-user can easily see that the *zvbc* extension is missing for the intrinsic API.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-shapes.cc (build_one): Pass required_ext arg when invoking the add function.
(build_th_loadstore): Ditto.
(struct vcreate_def): Ditto.
(struct read_vl_def): Ditto.
(struct vlenb_def): Ditto.
* config/riscv/riscv-vector-builtins.cc (function_builder::add_function): Introduce new arg required_ext to fill in the register func.
(function_builder::add_unique_function): Ditto.
(function_builder::add_overloaded_function): Ditto.
(expand_builtin): Leverage required_extensions_specified to check if the required extension is provided.
* config/riscv/riscv-vector-builtins.h (reqired_ext_to_isa_name): New func impl to convert the required_ext enum to the extension name.
(required_extensions_specified): New func impl to predicate if the required extension is satisfied.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: Adjust the error message for v extension.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: Ditto.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-1.c: New test. * gcc.target/riscv/rvv/base/intrinsic_required_ext-10.c: New test. * gcc.target/riscv/rvv/base/intrinsic_required_ext-2.c: New test. * gcc.target/riscv/rvv/base/intrinsic_required_ext-3.c: New test. * gcc.target/riscv/rvv/base/intrinsic_required_ext-4.c: New test. * gcc.target/riscv/rvv/base/intrinsic_required_ext-5.c: New test. * gcc.target/riscv/rvv/base/intrinsic_required_ext-6.c: New test. * gcc.target/riscv/rvv/base/intrinsic_required_ext-7.c: New test. * gcc.target/riscv/rvv/base/intrinsic_required_ext-8.c: New test. * gcc.target/riscv/rvv/base/intrinsic_required_ext-9.c: New test. Signed-off-by: Pan Li --- .../riscv/riscv-vector-builtins-shapes.cc | 18 +++-- gcc/config/riscv/riscv-vector-builtins.cc | 23 -- gcc/config/riscv/riscv-vector-builtins.h | 75 ++- .../riscv/rvv/base/intrinsic_required_ext-1.c | 10 +++ .../rvv/base/intrinsic_required_ext-10.c | 11 +++ .../riscv/rvv/base/intrinsic_required_ext-2.c | 11 +++ .../riscv/rvv/base/intrinsic_required_ext-3.c | 11 +++ .../riscv/rvv/base/intrinsic_required_ext-4.c | 11 +++ .../riscv/rvv/base/intrinsic_required_ext-5.c | 11 +++ .../riscv/rvv/base/intrinsic_required_ext-6.c | 11 +++ .../riscv/rvv/base/intrinsic_required_ext-7.c | 11 +++ .../riscv/rvv/base/intrinsic_required_ext-8.c | 11 +++ .../riscv/rvv/base/intrinsic_required_ext-9.c | 11 +++ .../target_attribute_v_with_intrinsic-7.c | 2 +- .../target_attribute_v_with_intrinsic-8.c | 2 +- 15 files changed, 210 insertions(+), 19 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-10.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-4.c create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-7.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-9.c diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc b/gcc/config/riscv/riscv-vector-builtins-shapes.cc index c5ffcc1f2c4..7f983e82370 100644 --- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc +++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc @@ -72,9 +72,10 @@ build_one (function_builder
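The idea of mapping a required-extension enum to an ISA name and putting it into the diagnostic can be sketched in standalone C. This is an illustration only, under stated assumptions: the enum model_required_ext and the helpers model_ext_to_isa_name, model_format_error and model_check_example are hypothetical stand-ins, not the GCC functions listed in the ChangeLog, and only three extensions are modelled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of the required-extension enum.  */
enum model_required_ext
{
  MODEL_EXT_V,
  MODEL_EXT_ZVBB,
  MODEL_EXT_ZVBC
};

/* Map the enum to the extension name used in the message.  */
const char *
model_ext_to_isa_name (enum model_required_ext ext)
{
  switch (ext)
    {
    case MODEL_EXT_V:
      return "v";
    case MODEL_EXT_ZVBB:
      return "zvbb";
    case MODEL_EXT_ZVBC:
      return "zvbc";
    }
  return "unknown";
}

/* Format the refined diagnostic into BUF; returns what snprintf
   returns, i.e. the length the full message needs.  */
int
model_format_error (char *buf, size_t len, const char *call,
                    enum model_required_ext ext)
{
  return snprintf (buf, len,
                   "built-in function '%s' requires the '%s' ISA extension",
                   call, model_ext_to_isa_name (ext));
}

/* Reproduce the message quoted in the commit message.  */
void
model_check_example (void)
{
  char buf[128];
  model_format_error (buf, sizeof buf,
                      "__riscv_vclmul_vv_u64m1(op_1, op_2, vl)",
                      MODEL_EXT_ZVBC);
  assert (strcmp (buf,
                  "built-in function '__riscv_vclmul_vv_u64m1(op_1, op_2, vl)'"
                  " requires the 'zvbc' ISA extension") == 0);
}
```

Keeping the name lookup separate from the formatting mirrors the split described in the ChangeLog between the name conversion and the check in expand_builtin.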
[PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch
From: Pan Li

This patch fixes an ICE in mode switching for the example code below.

during RTL pass: mode_sw
test.c: In function ‘vbool16_t j(vuint64m4_t)’:
test.c:15:1: internal compiler error: in create_pre_exit, at mode-switching.cc:451
   15 | }
      | ^
0x3978f12 create_pre_exit
  __RISCV_BUILD__/../gcc/mode-switching.cc:451
0x3979e9e optimize_mode_switching
  __RISCV_BUILD__/../gcc/mode-switching.cc:849
0x397b9bc execute
  __RISCV_BUILD__/../gcc/mode-switching.cc:1324

extern size_t get_vl ();

vbool16_t
test (vuint64m4_t a)
{
  unsigned long b;
  return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ());
}

create_pre_exit expects to find a return value copy. If there is none, the assert requires a known reason for its absence, and no such reason applies to the above sample code when the vector calling convention is enabled by default. This patch overrides TARGET_FUNCTION_VALUE_REGNO_P to cover the vector register, so that copy_num is taken from hard_regno_nregs, i.e. there is a return value copy.

As a side effect of allowing vector registers in TARGET_FUNCTION_VALUE_REGNO_P, TARGET_GET_RAW_RESULT_MODE will see a vector mode, which is sizeless and cannot be converted to fixed_size_mode. Thus override the hook TARGET_GET_RAW_RESULT_MODE and return VOIDmode when the mode of the regno is not a fixed_size_mode.

The below tests are passed for this patch.
* The full riscv regression tests.
* The reproducing test in bugzilla PR114639.

PR target/114639

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_function_value_regno_p): New func impl for hook TARGET_FUNCTION_VALUE_REGNO_P.
(riscv_get_raw_result_mode): New func impl for hook TARGET_GET_RAW_RESULT_MODE.
(TARGET_FUNCTION_VALUE_REGNO_P): Impl the hook.
(TARGET_GET_RAW_RESULT_MODE): Ditto.
* config/riscv/riscv.h (V_RETURN): New macro for vector return.
(GP_RETURN_FIRST): New macro for the first GPR in return.
(GP_RETURN_LAST): New macro for the last GPR in return.
(FP_RETURN_FIRST): Ditto but for FPR.
(FP_RETURN_LAST): Ditto.
(FUNCTION_VALUE_REGNO_P): Remove as deprecated and replace by TARGET_FUNCTION_VALUE_REGNO_P. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/pr114639-1.C: New test. * gcc.target/riscv/rvv/base/pr114639-1.c: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv.cc | 34 +++ gcc/config/riscv/riscv.h | 8 +++-- .../g++.target/riscv/rvv/base/pr114639-1.C| 25 ++ .../gcc.target/riscv/rvv/base/pr114639-1.c| 14 4 files changed, 79 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/pr114639-1.C create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114639-1.c diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 00defa69fd8..91f017dd52a 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -10997,6 +10997,34 @@ riscv_vector_mode_supported_any_target_p (machine_mode) return true; } +/* Implements hook TARGET_FUNCTION_VALUE_REGNO_P. */ + +static bool +riscv_function_value_regno_p (const unsigned regno) +{ + if (GP_RETURN_FIRST <= regno && regno <= GP_RETURN_LAST) +return true; + + if (FP_RETURN_FIRST <= regno && regno <= FP_RETURN_LAST) +return true; + + if (regno == V_RETURN) +return true; + + return false; +} + +/* Implements hook TARGET_GET_RAW_RESULT_MODE. */ + +static fixed_size_mode +riscv_get_raw_result_mode (int regno) +{ + if (!is_a (reg_raw_mode[regno])) +return as_a (VOIDmode); + + return default_get_reg_raw_mode (regno); +} + /* Initialize the GCC target structure. 
*/ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" @@ -11343,6 +11371,12 @@ riscv_vector_mode_supported_any_target_p (machine_mode) #undef TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P #define TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P riscv_vector_mode_supported_any_target_p +#undef TARGET_FUNCTION_VALUE_REGNO_P +#define TARGET_FUNCTION_VALUE_REGNO_P riscv_function_value_regno_p + +#undef TARGET_GET_RAW_RESULT_MODE +#define TARGET_GET_RAW_RESULT_MODE riscv_get_raw_result_mode + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-riscv.h" diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h index 269b8c1f076..7797e67317a 100644 --- a/gcc/config/riscv/riscv.h +++ b/gcc/config/riscv/riscv.h @@ -683,6 +683,12 @@ enum reg_class #define GP_RETURN GP_ARG_FIRST #define FP_RETURN (UNITS_PER_FP_ARG == 0 ? GP_RETURN : FP_ARG_FIRST) +#define V_RETURN V_REG_FIRST + +#define GP_RETURN_FIRST GP_ARG_FIRST +#define GP_RETURN_LAST GP_ARG_FIRST + 1 +#define FP_RETURN_FIRST FP_RETURN +#define FP_RETURN_LAST FP_RETURN + 1 #define MAX_ARGS_IN_REGISTERS \ (riscv_abi == ABI_ILP32E || riscv_abi == ABI_LP64E \ @@ -714,8 +720,6 @@ enum reg_class #define FUNCTION_VALUE(VALTYPE, FUNC) \ riscv_function_value (VALTY
[PATCH v1] RISC-V: Remove -Wno-psabi for test build option [NFC]
From: Pan Li

Some test cases still pass the -Wno-psabi option, which is deprecated now. Remove it from all riscv test cases.

The below tests are passed for this patch.
* The riscv rvv regression test.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/pr109244.C: Remove deprecated -Wno-psabi option.
* g++.target/riscv/rvv/base/pr109535.C: Ditto.
* gcc.target/riscv/rvv/autovec/fixed-vlmax-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-5.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-6.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-7.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-5.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-6.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-7.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-5.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-6.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-7.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Ditto. 
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c: Ditto. Signed-off-by: Pan Li --- gcc/testsuite/g++.target/riscv/rvv/base/pr109244.C | 2 +- gcc/testsuite/g++.target/riscv/rvv/base/pr109535.C | 2 +- gcc/testsuite/gcc.target/riscv/rvv/autovec/fixed-vlmax-1.c | 2 +- .../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-1.c | 2 +- .../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-2.
[PATCH v1] RISC-V: Bugfix for RVV overloaded intrinsic ICE when empty args
From: Pan Li

There is a corner case, shown in the example below:

  void test (void)
  {
    __riscv_vfredosum_tu ();
  }

It triggers an ICE because of how overloaded functions are implemented in GCC. According to the RVV intrinsic doc, there is no overloaded function with empty args. Unfortunately, we register the empty-args function as overloaded to avoid a conflict. As a result, one registered function still remains after NULL_TREE is returned to the middle-end, which finally results in an ICE when expanding. For example:

1. First, we register void __riscv_vfredmax () as the overloaded function.
2. Then resolve_overloaded_builtin (this func) returns NULL_TREE.
3. The function registered in step 1 bypasses the args check because the args are empty.
4. Finally, we fall into expand_builtin with empty args and hit the ICE.

This patch reports an error when an overloaded function is called with empty args. For example:

  test.c: In function 'foo':
  test.c:8:3: error: no matching function call to '__riscv_vfredosum_tu' with empty args
      8 |   __riscv_vfredosum_tu();
        |   ^~~~

The tests below passed for this patch.

* The riscv regression tests.

	PR target/113766

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (resolve_overloaded_builtin): Adjust the signature of func.
	* config/riscv/riscv-c.cc (riscv_resolve_overloaded_builtin): Ditto.
	* config/riscv/riscv-vector-builtins.cc (resolve_overloaded_builtin): Make overloaded func with empty args error.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr113766-1.c: New test.
	* gcc.target/riscv/rvv/base/pr113766-2.c: New test.

Signed-off-by: Pan Li --- gcc/config/riscv/riscv-c.cc | 3 +- gcc/config/riscv/riscv-protos.h | 2 +- gcc/config/riscv/riscv-vector-builtins.cc | 23 - .../gcc.target/riscv/rvv/base/pr113766-1.c| 85 +++ .../gcc.target/riscv/rvv/base/pr113766-2.c| 48 +++ 5 files changed, 155 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-2.c diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc index 2e306057347..94c3871c760 100644 --- a/gcc/config/riscv/riscv-c.cc +++ b/gcc/config/riscv/riscv-c.cc @@ -250,7 +250,8 @@ riscv_resolve_overloaded_builtin (unsigned int uncast_location, tree fndecl, case RISCV_BUILTIN_GENERAL: break; case RISCV_BUILTIN_VECTOR: - new_fndecl = riscv_vector::resolve_overloaded_builtin (subcode, arglist); + new_fndecl = riscv_vector::resolve_overloaded_builtin (loc, subcode, +fndecl, arglist); break; default: gcc_unreachable (); diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index b3f0bdb9924..ae1685850ac 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -560,7 +560,7 @@ gimple *gimple_fold_builtin (unsigned int, gimple_stmt_iterator *, gcall *); rtx expand_builtin (unsigned int, tree, rtx); bool check_builtin_call (location_t, vec, unsigned int, tree, unsigned int, tree *); -tree resolve_overloaded_builtin (unsigned int, vec *); +tree resolve_overloaded_builtin (location_t, unsigned int, tree, vec *); bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT); bool legitimize_move (rtx, rtx *); void emit_vlmax_vsetvl (machine_mode, rtx); diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc index 403e1021fd1..efcdc8f1767 100644 --- a/gcc/config/riscv/riscv-vector-builtins.cc +++ b/gcc/config/riscv/riscv-vector-builtins.cc @@ -4606,7 +4606,8 @@ check_builtin_call (location_t location, vec, unsigned int 
code, } tree -resolve_overloaded_builtin (unsigned int code, vec *arglist) +resolve_overloaded_builtin (location_t loc, unsigned int code, tree fndecl, + vec *arglist) { if (code >= vec_safe_length (registered_functions)) return NULL_TREE; @@ -4616,12 +4617,26 @@ resolve_overloaded_builtin (unsigned int code, vec *arglist) if (!rfun || !rfun->overloaded_p) return NULL_TREE; + /* According to the rvv intrinisc doc, we have no such overloaded function + with empty args. Unfortunately, we register the empty args function as + overloaded for avoiding conflict. Thus, there will actual one register + function after return NULL_TREE back to the middle-end, and finally result + in ICE when expanding. For example: + + 1. First we registered void __riscv_vfredmax () as the overloaded function. + 2. Then resolve_overloaded_builtin (this func) return NULL_TREE. + 3. The functions register in step 1 bypass the args check as empty args. + 4. Finally, fall into expand_builtin with empty args and meet ICE. + + Here we report error whe
[PATCH v1] RISC-V: Bugfix for RVV overloaded intrinsic ICE in function checker
From: Pan Li

There is another corner case, shown in the example below:

  void test (void)
  {
    __riscv_vaadd ();
  }

We report an error when an overloaded function is called with empty args. For example:

  test.c: In function 'foo':
  test.c:8:3: error: no matching function call to '__riscv_vaadd' with empty args
      8 |   __riscv_vaadd ();
        |   ^~~~

Unfortunately, another ICE similar to the one below follows that message. The underlying function checker is built with zero args, which breaks some of its assumptions, for example that the count of args is not less than 2.

  ice.c: In function ‘foo’:
  ice.c:8:3: internal compiler error: in require_immediate, at config/riscv/riscv-vector-builtins.cc:4252
      8 |   __riscv_vaadd ();
        |   ^
  0x20b36ac riscv_vector::function_checker::require_immediate(unsigned int, long, long) const
          .../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins.cc:4252
  0x20b890c riscv_vector::alu_def::check(riscv_vector::function_checker&) const
          .../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins-shapes.cc:387
  0x20b38d7 riscv_vector::function_checker::check()
          .../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins.cc:4315
  0x20b4876 riscv_vector::check_builtin_call(unsigned int, vec,
          .../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins.cc:4605
  0x2069393 riscv_check_builtin_call
          .../__RISC-V_BUILD__/../gcc/config/riscv/riscv-c.cc:227

The tests below passed for this patch.

* The riscv regression tests.

	PR target/113766

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-shapes.cc (struct alu_def): Make sure the c.arg_num is >= 2 before checking.
	(struct build_frm_base): Ditto.
	(struct narrow_alu_def): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr113766-1.c: Add new cases.

Signed-off-by: Pan Li --- .../riscv/riscv-vector-builtins-shapes.cc | 17 + .../gcc.target/riscv/rvv/base/pr113766-1.c | 16 2 files changed, 29 insertions(+), 4 deletions(-) diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc b/gcc/config/riscv/riscv-vector-builtins-shapes.cc index 8e90b17a94b..c5ffcc1f2c4 100644 --- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc +++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc @@ -383,7 +383,10 @@ struct alu_def : public build_base /* Check whether rounding mode argument is a valid immediate. */ if (c.base->has_rounding_mode_operand_p ()) { - if (!c.any_type_float_p ()) + /* Some invalid overload intrinsic like below will have zero for + c.arg_num (). Thus, make sure arg_num is big enough here. + __riscv_vaadd () will make c.arg_num () == 0. */ + if (!c.any_type_float_p () && c.arg_num () >= 2) return c.require_immediate (c.arg_num () - 2, VXRM_RNU, VXRM_ROD); /* TODO: We will support floating-point intrinsic modeling rounding mode in the future. */ @@ -411,8 +414,11 @@ struct build_frm_base : public build_base { gcc_assert (c.any_type_float_p ()); -/* Check whether rounding mode argument is a valid immediate. */ -if (c.base->has_rounding_mode_operand_p ()) +/* Check whether rounding mode argument is a valid immediate. + Some invalid overload intrinsic like below will have zero for + c.arg_num (). Thus, make sure arg_num is big enough here. + __riscv_vaadd () will make c.arg_num () == 0. */ +if (c.base->has_rounding_mode_operand_p () && c.arg_num () >= 2) { unsigned int frm_num = c.arg_num () - 2; @@ -679,7 +685,10 @@ struct narrow_alu_def : public build_base /* Check whether rounding mode argument is a valid immediate. */ if (c.base->has_rounding_mode_operand_p ()) { - if (!c.any_type_float_p ()) + /* Some invalid overload intrinsic like below will have zero for + c.arg_num (). Thus, make sure arg_num is big enough here. + __riscv_vaadd () will make c.arg_num () == 0. 
*/ + if (!c.any_type_float_p () && c.arg_num () >= 2) return c.require_immediate (c.arg_num () - 2, VXRM_RNU, VXRM_ROD); /* TODO: We will support floating-point intrinsic modeling rounding mode in the future. */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c index bd4943b0b7e..fd674a8895c 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c @@ -82,4 +82,20 @@ test () __riscv_vfredosum (); /* { dg-error {no matching function call to '__riscv_vfredosum' with empty args} } */ __riscv_vfredosum_tu (); /* { dg-error {no matching function call to '__riscv_vfredosum_tu' with empty args} } */ + + __riscv_vaadd (); /* { dg-error {no matching function call to '__riscv_vaadd'
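
To illustrate why the c.arg_num () >= 2 guard in the patch above matters, here is a minimal standalone sketch. It assumes that arg_num () returns an unsigned int, as the `unsigned int frm_num = c.arg_num () - 2;` line in the diff suggests. The helper names rounding_mode_index and has_rounding_mode_index are hypothetical and are not part of GCC.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for "c.arg_num () - 2": the index of the rounding
   mode operand, computed without any guard.  */
static unsigned int
rounding_mode_index (unsigned int arg_num)
{
  /* Unsigned arithmetic wraps around when arg_num < 2, so an empty
     argument list yields a huge index instead of a negative one.  */
  return arg_num - 2;
}

/* Hypothetical guard mirroring the "c.arg_num () >= 2" check that the
   patch adds before the operand index is computed.  */
static bool
has_rounding_mode_index (unsigned int arg_num)
{
  return arg_num >= 2;
}
```

With zero arguments the unguarded index wraps to UINT_MAX - 1, which is presumably how the function checker ends up inspecting an operand that does not exist; with the guard in place the check is skipped instead.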
[PATCH v1] RISC-V: Fix misspelled term args in error_at message
From: Pan Li

When building with "-Werror=format-diag", the misspelled term "args" is reported as below. This patch fixes it by using the term "arguments" instead.

  ../../gcc/config/riscv/riscv-vector-builtins.cc: In function 'tree_node* riscv_vector::resolve_overloaded_builtin(location_t, unsigned int, tree, vec*)':
  ../../gcc/config/riscv/riscv-vector-builtins.cc:4633:65: error: misspelled term 'args' in format; use 'arguments' instead [-Werror=format-diag]
   4633 |   error_at (loc, "no matching function call to %qE with empty args", fndecl);

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins.cc (resolve_overloaded_builtin): Replace args with arguments for the misspelled term.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr113766-1.c: Adjust the test cases.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-vector-builtins.cc     |   3 +-
 .../gcc.target/riscv/rvv/base/pr113766-1.c    | 126 +-
 2 files changed, 65 insertions(+), 64 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index efcdc8f1767..c5881a501d1 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4630,7 +4630,8 @@ resolve_overloaded_builtin (location_t loc, unsigned int code, tree fndecl,
      Here we report error when overloaded function with empty args.
*/ if (rfun->overloaded_p && arglist->length () == 0) -error_at (loc, "no matching function call to %qE with empty args", fndecl); +error_at (loc, "no matching function call to %qE with empty arguments", + fndecl); hashval_t hash = rfun->overloaded_hash (*arglist); registered_function *rfn diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c index fd674a8895c..9e911e31117 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c @@ -6,96 +6,96 @@ void test () { - __riscv_vand (); /* { dg-error {no matching function call to '__riscv_vand' with empty args} } */ - __riscv_vand_tu (); /* { dg-error {no matching function call to '__riscv_vand_tu' with empty args} } */ - __riscv_vand_tumu (); /* { dg-error {no matching function call to '__riscv_vand_tumu' with empty args} } */ + __riscv_vand (); /* { dg-error {no matching function call to '__riscv_vand' with empty arguments} } */ + __riscv_vand_tu (); /* { dg-error {no matching function call to '__riscv_vand_tu' with empty arguments} } */ + __riscv_vand_tumu (); /* { dg-error {no matching function call to '__riscv_vand_tumu' with empty arguments} } */ - __riscv_vcompress (); /* { dg-error {no matching function call to '__riscv_vcompress' with empty args} } */ - __riscv_vcompress_tu (); /* { dg-error {no matching function call to '__riscv_vcompress_tu' with empty args} } */ + __riscv_vcompress (); /* { dg-error {no matching function call to '__riscv_vcompress' with empty arguments} } */ + __riscv_vcompress_tu (); /* { dg-error {no matching function call to '__riscv_vcompress_tu' with empty arguments} } */ - __riscv_vcpop (); /* { dg-error {no matching function call to '__riscv_vcpop' with empty args} } */ + __riscv_vcpop (); /* { dg-error {no matching function call to '__riscv_vcpop' with empty arguments} } */ - __riscv_vdiv (); /* { dg-error {no matching function call to '__riscv_vdiv' 
with empty args} } */ - __riscv_vdiv_tu (); /* { dg-error {no matching function call to '__riscv_vdiv_tu' with empty args} } */ - __riscv_vdiv_tumu (); /* { dg-error {no matching function call to '__riscv_vdiv_tumu' with empty args} } */ + __riscv_vdiv (); /* { dg-error {no matching function call to '__riscv_vdiv' with empty arguments} } */ + __riscv_vdiv_tu (); /* { dg-error {no matching function call to '__riscv_vdiv_tu' with empty arguments} } */ + __riscv_vdiv_tumu (); /* { dg-error {no matching function call to '__riscv_vdiv_tumu' with empty arguments} } */ - __riscv_vfabs (); /* { dg-error {no matching function call to '__riscv_vfabs' with empty args} } */ - __riscv_vfabs_tu (); /* { dg-error {no matching function call to '__riscv_vfabs_tu' with empty args} } */ - __riscv_vfabs_tumu ();/* { dg-error {no matching function call to '__riscv_vfabs_tumu' with empty args} } */ + __riscv_vfabs (); /* { dg-error {no matching function call to '__riscv_vfabs' with empty arguments} } */ + __riscv_vfabs_tu (); /* { dg-error {no matching function call to '__riscv_vfabs_tu' with empty arguments} } */ + __riscv_vfabs_tumu ();/* { dg-error {no matching function call to '__riscv_vfabs_tumu' with empty arguments} } */ - __riscv_vfadd ();
[PATCH v1] Internal-fn: Add new internal function SAT_ADDU
From: Pan Li

This patch adds the middle-end representation for the unsigned saturation add, that is, the result of the add is set to the maximum value of the type on overflow. It matches a pattern similar to the one below.

  SAT_ADDU (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))

Taking uint8_t as an example, we will have:

* SAT_ADDU (1, 254) => 255.
* SAT_ADDU (1, 255) => 255.
* SAT_ADDU (2, 255) => 255.
* SAT_ADDU (255, 255) => 255.

The patch also implements SAT_ADDU in the riscv backend as a sample. Given the example below:

  uint64_t sat_add_u64 (uint64_t x, uint64_t y)
  {
    return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
  }

Before this patch:

  uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
  {
    long unsigned int _1;
    _Bool _2;
    long unsigned int _3;
    long unsigned int _4;
    uint64_t _7;
    long unsigned int _10;
    __complex__ long unsigned int _11;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
    _1 = REALPART_EXPR <_11>;
    _10 = IMAGPART_EXPR <_11>;
    _2 = _10 != 0;
    _3 = (long unsigned int) _2;
    _4 = -_3;
    _7 = _1 | _4;
    return _7;
  ;;    succ:       EXIT
  }

After this patch:

  uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
  {
    uint64_t _7;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _7 = .SAT_ADDU (x_5(D), y_6(D)); [tail call]
    return _7;
  ;;    succ:       EXIT
  }

So after this patch the middle-end has a representation like .SAT_ADDU.

	PR target/51492
	PR target/112600

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (riscv_expand_saturation_addu): New func decl for the SAT_ADDU expand.
	* config/riscv/riscv.cc (riscv_expand_saturation_addu): New func impl for the SAT_ADDU expand.
	* config/riscv/riscv.md (sat_addu_3): New pattern to impl the standard name SAT_ADDU.
	* doc/md.texi: Add doc for SAT_ADDU.
	* internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADDU.
	* internal-fn.def (SAT_ADDU): Add SAT_ADDU.
	* match.pd: Add simplify pattern patch for SAT_ADDU.
	* optabs.def (OPTAB_D): Add sat_addu_optab.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/sat_addu-1.c: New test.
* gcc.target/riscv/sat_addu-2.c: New test. * gcc.target/riscv/sat_addu-3.c: New test. * gcc.target/riscv/sat_addu-4.c: New test. * gcc.target/riscv/sat_addu-run-1.c: New test. * gcc.target/riscv/sat_addu-run-2.c: New test. * gcc.target/riscv/sat_addu-run-3.c: New test. * gcc.target/riscv/sat_addu-run-4.c: New test. * gcc.target/riscv/sat_arith.h: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv.cc | 46 + gcc/config/riscv/riscv.md | 11 + gcc/doc/md.texi | 11 + gcc/internal-fn.cc| 1 + gcc/internal-fn.def | 1 + gcc/match.pd | 22 + gcc/optabs.def| 2 + gcc/testsuite/gcc.target/riscv/sat_addu-1.c | 18 +++ gcc/testsuite/gcc.target/riscv/sat_addu-2.c | 20 gcc/testsuite/gcc.target/riscv/sat_addu-3.c | 17 +++ gcc/testsuite/gcc.target/riscv/sat_addu-4.c | 16 ++ .../gcc.target/riscv/sat_addu-run-1.c | 42 .../gcc.target/riscv/sat_addu-run-2.c | 42 .../gcc.target/riscv/sat_addu-run-3.c | 42 .../gcc.target/riscv/sat_addu-run-4.c | 49 +++ gcc/testsuite/gcc.target/riscv/sat_arith.h| 15 ++ 17 files changed, 356 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_arith.h diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index ae1685850ac..f201b2384f9 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -132,6 +132,7 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); extern bool riscv_zcmp_valid_stack_adj_bytes_p 
(HOST_WIDE_INT, int); extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); +extern void riscv_expand_saturation_addu (rtx, rtx, rtx); #ifdef RTX_CODE extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index
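
The SAT_ADDU pattern from the commit message above can be tried out in plain C. The following is a minimal sketch rather than code from the patch: sat_add_u64 follows the example in the commit message, while sat_add_u8 is a hypothetical helper added here so that the uint8_t values listed in the commit message can be checked directly.

```c
#include <stdint.h>

/* Branchless unsigned saturating add for uint8_t, following the pattern
   (x + y) | (-(TYPE)((TYPE)(x + y) < x)) with TYPE = uint8_t.  */
static inline uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  /* The cast truncates the promoted int sum, so it wraps modulo 256.  */
  uint8_t sum = (uint8_t) (x + y);

  /* (sum < x) is 1 exactly when the add overflowed; negating it and
     casting back gives 0xff on overflow and 0 otherwise.  */
  uint8_t mask = (uint8_t) -(uint8_t) (sum < x);

  return sum | mask;
}

/* The uint64_t form, as written in the commit message.  */
static inline uint64_t
sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (-(uint64_t) ((uint64_t) (x + y) < x));
}
```

The mask trick works because an unsigned add overflowed exactly when the wrapped sum is smaller than one of the operands, so no branch is needed.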
[PATCH v1] RISC-V: Upgrade RVV intrinsic version to 0.12
From: Pan Li Upgrade the version of RVV intrinsic from 0.11 to 0.12. PR target/114017 gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Upgrade the version to 0.12. gcc/testsuite/ChangeLog: * gcc.target/riscv/predef-__riscv_v_intrinsic.c: Update the version to 0.12. * gcc.target/riscv/rvv/base/pr114017-1.c: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv-c.cc | 2 +- .../riscv/predef-__riscv_v_intrinsic.c| 2 +- .../gcc.target/riscv/rvv/base/pr114017-1.c| 19 +++ 3 files changed, 21 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc index 3ef06dcfd2d..3755ec0b8ef 100644 --- a/gcc/config/riscv/riscv-c.cc +++ b/gcc/config/riscv/riscv-c.cc @@ -139,7 +139,7 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile) { builtin_define ("__riscv_vector"); builtin_define_with_int_value ("__riscv_v_intrinsic", -riscv_ext_version_value (0, 11)); +riscv_ext_version_value (0, 12)); } if (TARGET_XTHEADVECTOR) diff --git a/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c b/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c index dbbedf54f87..07f1f159a8f 100644 --- a/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c +++ b/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c @@ -3,7 +3,7 @@ int main () { -#if __riscv_v_intrinsic != 11000 +#if __riscv_v_intrinsic != 12000 #error "__riscv_v_intrinsic" #endif diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c new file mode 100644 index 000..8eee7c68f71 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +vuint8mf2_t +test (vuint16m1_t val, size_t shift, size_t vl) +{ +#if __riscv_v_intrinsic == 11000 + #warning "RVV Intrinsics v0.11" + return 
__riscv_vnclipu (val, shift, vl); +#endif + +#if __riscv_v_intrinsic == 12000 + #warning "RVV Intrinsics v0.12" /* { dg-warning "RVV Intrinsics v0.12" } */ + return __riscv_vnclipu (val, shift, 0, vl); +#endif +} + -- 2.34.1
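
As a rough illustration of how the macro value relates to the version number, the sketch below models the encoding. The function name ext_version_value is hypothetical, and the scaling factors (1000000 per major version, 1000 per minor version) are an assumption that is consistent with the values 11000 and 12000 checked in the tests above; it is not a copy of riscv_ext_version_value. HAVE_RVV_INTRINSIC_0_12 is likewise a made-up feature macro showing how user code might select the v0.12 form of an intrinsic call.

```c
/* Hypothetical model of the version encoding: 0.11 -> 11000 and
   0.12 -> 12000, assuming major * 1000000 + minor * 1000.  */
static int
ext_version_value (unsigned major, unsigned minor)
{
  return (int) (major * 1000000u + minor * 1000u);
}

/* Hypothetical feature macro: user code can test __riscv_v_intrinsic to
   pick the v0.12 signature, which the test above shows takes an extra
   rounding mode argument for __riscv_vnclipu.  On targets without RVV
   the macro is undefined and the check falls back to 0.  */
#if defined (__riscv_v_intrinsic) && __riscv_v_intrinsic >= 12000
#define HAVE_RVV_INTRINSIC_0_12 1
#else
#define HAVE_RVV_INTRINSIC_0_12 0
#endif
```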
[PATCH v1] RISC-V: Introduce gcc option mrvv-vector-bits for RVV
From: Pan Li

This patch introduces a new GCC option for RVV that specifies the size in bits of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are:

* 64
* 128
* 256
* 512
* 1024
* 2048
* 4096
* 8192
* 16384
* 32768
* 65536
* scalable
* zvl

1. scalable is the default value; it takes min_vlen for the riscv_vector_chunks.
2. zvl picks up the zvl*b from the march option. For example, mrvv-vector-bits will be 1024 when march=rv64gcv_zvl1024b.
3. Otherwise, the option takes the given value, and an error is reported if the value is none of the valid ones above.

This option may influence code generation during auto-vectorization. For example,

  void test_rvv_vector_bits (int *a, int *b, int *out)
  {
    for (int i = 0; i < 8; i++)
      out[i] = a[i] + b[i];
  }

It generates code similar to the below when built with
-march=rv64gcv_zvl128b -mabi=lp64 -mrvv-vector-bits=zvl

  test_rvv_vector_bits:
    ...
    vsetivli zero,4,e32,m1,ta,ma
    vle32.v v1,0(a0)
    vle32.v v2,0(a1)
    vadd.vv v1,v1,v2
    vse32.v v1,0(a2)
    ...
    vle32.v v1,0(a0)
    vle32.v v2,0(a1)
    vadd.vv v1,v1,v2
    vse32.v v1,0(a2)

And it becomes simpler, similar to the below, when built with
-march=rv64gcv_zvl128b -mabi=lp64 -mrvv-vector-bits=256

  test_rvv_vector_bits:
    ...
    vsetivli zero,8,e32,m2,ta,ma
    vle32.v v2,0(a0)
    vle32.v v4,0(a1)
    vadd.vv v2,v2,v4
    vse32.v v2,0(a2)

Passed the rvv regression tests.

gcc/ChangeLog:

	* config/riscv/riscv-opts.h (enum rvv_vector_bits_enum): New enum for different RVV vector bits.
	* config/riscv/riscv.cc (riscv_convert_vector_bits): New func to get the RVV vector bits, with given min_vlen.
	(riscv_convert_vector_chunks): Combine the mrvv-vector-bits option with min_vlen to RVV vector chunks.
	(riscv_override_options_internal): Update comments and rename the vector chunks.
	* config/riscv/riscv.opt: Add option mrvv-vector-bits.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/rvv-vector-bits-1.c: New test.
	* gcc.target/riscv/rvv/base/rvv-vector-bits-2.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-3.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-4.c: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv-opts.h | 16 ++ gcc/config/riscv/riscv.cc | 49 --- gcc/config/riscv/riscv.opt| 47 ++ .../riscv/rvv/base/rvv-vector-bits-1.c| 6 +++ .../riscv/rvv/base/rvv-vector-bits-2.c| 20 .../riscv/rvv/base/rvv-vector-bits-3.c| 25 ++ .../riscv/rvv/base/rvv-vector-bits-4.c| 6 +++ 7 files changed, 163 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-4.c diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h index 4edddbadc37..b2141190731 100644 --- a/gcc/config/riscv/riscv-opts.h +++ b/gcc/config/riscv/riscv-opts.h @@ -129,6 +129,22 @@ enum vsetvl_strategy_enum { VSETVL_OPT_NO_FUSION, }; +enum rvv_vector_bits_enum { + RVV_VECTOR_BITS_SCALABLE, + RVV_VECTOR_BITS_ZVL, + RVV_VECTOR_BITS_64 = 64, + RVV_VECTOR_BITS_128 = 128, + RVV_VECTOR_BITS_256 = 256, + RVV_VECTOR_BITS_512 = 512, + RVV_VECTOR_BITS_1024 = 1024, + RVV_VECTOR_BITS_2048 = 2048, + RVV_VECTOR_BITS_4096 = 4096, + RVV_VECTOR_BITS_8192 = 8192, + RVV_VECTOR_BITS_16384 = 16384, + RVV_VECTOR_BITS_32768 = 32768, + RVV_VECTOR_BITS_65536 = 65536, +}; + #define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && TARGET_64BIT)) /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 5e984ee2a55..366d7ece383 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -8801,13 +8801,50 @@ riscv_init_machine_status (void) return ggc_cleared_alloc (); } -/* Return the VLEN value associated with -march. 
+static int +riscv_convert_vector_bits (int min_vlen) +{ + int rvv_bits = 0; + + switch (rvv_vector_bits) +{ + case RVV_VECTOR_BITS_SCALABLE: + case RVV_VECTOR_BITS_ZVL: + rvv_bits = min_vlen; + break; + case RVV_VECTOR_BITS_64: + case RVV_VECTOR_BITS_128: + case RVV_VECTOR_BITS_256: + case RVV_VECTOR_BITS_512: + case RVV_VECTOR_BITS_1024: + case RVV_VECTOR_BITS_2048: + case RVV_VECTOR_BITS_4096: + case RVV_VECTOR_BITS_8192: + case RVV_VECTOR_BITS_16384: + case RVV_VECTOR_BITS_32768: + case RVV_VECTOR_BITS_65536: + rvv_bits = rvv_vector_bits; +
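
The option-to-bits mapping described in the commit message above can be sketched as a standalone model. This is not the GCC implementation: the enumerators are copied from the patch, but convert_vector_bits takes the option as a parameter instead of reading a global, its default branch is an assumption, and ints_per_vector is a hypothetical helper.

```c
/* Enumerator values as listed in the patch to riscv-opts.h.  */
enum rvv_vector_bits_enum {
  RVV_VECTOR_BITS_SCALABLE,
  RVV_VECTOR_BITS_ZVL,
  RVV_VECTOR_BITS_64 = 64,
  RVV_VECTOR_BITS_128 = 128,
  RVV_VECTOR_BITS_256 = 256,
  RVV_VECTOR_BITS_512 = 512,
  RVV_VECTOR_BITS_1024 = 1024,
  RVV_VECTOR_BITS_2048 = 2048,
  RVV_VECTOR_BITS_4096 = 4096,
  RVV_VECTOR_BITS_8192 = 8192,
  RVV_VECTOR_BITS_16384 = 16384,
  RVV_VECTOR_BITS_32768 = 32768,
  RVV_VECTOR_BITS_65536 = 65536,
};

/* Model of the mapping: scalable and zvl follow min_vlen (the zvl*b value
   from -march), while a fixed size is used as given.  Treating every other
   enumerator as a fixed size is an assumption of this sketch.  */
static int
convert_vector_bits (enum rvv_vector_bits_enum opt, int min_vlen)
{
  switch (opt)
    {
    case RVV_VECTOR_BITS_SCALABLE:
    case RVV_VECTOR_BITS_ZVL:
      return min_vlen;
    default:
      return (int) opt;
    }
}

/* Hypothetical helper: number of 32-bit ints covered by that many bits.  */
static int
ints_per_vector (int vector_bits)
{
  return vector_bits / 32;
}
```

For -march=rv64gcv_zvl128b, zvl yields 128 bits (4 ints) and 256 yields 256 bits (8 ints), which lines up with the vl values 4 and 8 in the vsetivli instructions quoted in the commit message.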
[PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS
From: Pan Li

Hi Richard & Tamar,

I tried DEF_INTERNAL_INT_EXT_FN as you suggested, mapping us_plus$a3 to the RTL representation (us_plus:m x y) in optabs.def and then expanding it via expand_US_PLUS in internal-fn.cc. I am not sure whether my understanding of DEF_INTERNAL_INT_EXT_FN is correct.

I am also not sure whether we still need DEF_INTERNAL_SIGNED_OPTAB_FN here, given that the RTL representation already has (ss_plus:m x y) and (us_plus:m x y).

Note this patch is a draft for validation; no tests are involved here.

gcc/ChangeLog:

	* builtins.def (BUILT_IN_US_PLUS): Add builtin def.
	(BUILT_IN_US_PLUSIMAX): Ditto.
	(BUILT_IN_US_PLUSL): Ditto.
	(BUILT_IN_US_PLUSLL): Ditto.
	(BUILT_IN_US_PLUSG): Ditto.
	* config/riscv/riscv-protos.h (riscv_expand_us_plus): Add new func decl for expanding us_plus.
	* config/riscv/riscv.cc (riscv_expand_us_plus): Add new func impl for expanding us_plus.
	* config/riscv/riscv.md (us_plus3): Add new pattern impl us_plus3.
	* internal-fn.cc (expand_US_PLUS): Add new func impl to expand US_PLUS.
	* internal-fn.def (US_PLUS): Add new INT_EXT_FN.
	* internal-fn.h (expand_US_PLUS): Add new func decl.
	* match.pd: Add new simplify pattern for us_plus.
	* optabs.def (OPTAB_NL): Add new OPTAB_NL to US_PLUS rtl.

Signed-off-by: Pan Li --- gcc/builtins.def| 7 + gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv.cc | 46 + gcc/config/riscv/riscv.md | 11 gcc/internal-fn.cc | 26 +++ gcc/internal-fn.def | 3 +++ gcc/internal-fn.h | 1 + gcc/match.pd| 17 gcc/optabs.def | 2 ++ 9 files changed, 114 insertions(+) diff --git a/gcc/builtins.def b/gcc/builtins.def index f6f3e104f6a..0777b912cfa 100644 --- a/gcc/builtins.def +++ b/gcc/builtins.def @@ -1055,6 +1055,13 @@ DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTIMAX, "popcountimax", BT_FN_INT_UINTMAX DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTL, "popcountl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTLL, "popcountll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTG, "popcountg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF) + +DEF_GCC_BUILTIN(BUILT_IN_US_PLUS, "us_plus", BT_FN_INT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN(BUILT_IN_US_PLUSIMAX, "us_plusimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN(BUILT_IN_US_PLUSL, "us_plusl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN(BUILT_IN_US_PLUSLL, "us_plusll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN(BUILT_IN_US_PLUSG, "us_plusg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF) + DEF_EXT_LIB_BUILTIN(BUILT_IN_POSIX_MEMALIGN, "posix_memalign", BT_FN_INT_PTRPTR_SIZE_SIZE, ATTR_NOTHROW_NONNULL_LEAF) DEF_GCC_BUILTIN(BUILT_IN_PREFETCH, "prefetch", BT_FN_VOID_CONST_PTR_VAR, ATTR_NOVOPS_LEAF_LIST) DEF_LIB_BUILTIN(BUILT_IN_REALLOC, "realloc", BT_FN_PTR_PTR_SIZE, ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 80efdf2b7e5..ba6086f1f25 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -132,6 +132,7 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); extern bool riscv_zcmp_valid_stack_adj_bytes_p 
(HOST_WIDE_INT, int); extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); +extern void riscv_expand_us_plus (rtx, rtx, rtx); #ifdef RTX_CODE extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 4100abc9dd1..23f08974f07 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -10657,6 +10657,52 @@ riscv_vector_mode_supported_any_target_p (machine_mode) return true; } +/* Emit insn for the saturation addu, aka (x + y) | - ((x + y) < x). */ +void +riscv_expand_us_plus (rtx dest, rtx x, rtx y) +{ + machine_mode mode = GET_MODE (dest); + rtx pmode_sum = gen_reg_rtx (Pmode); + rtx pmode_lt = gen_reg_rtx (Pmode); + rtx pmode_x = gen_lowpart (Pmode, x); + rtx pmode_y = gen_lowpart (Pmode, y); + rtx pmode_dest = gen_reg_rtx (Pmode); + + /* Step-1: sum = x + y */ + if (mode == SImode && mode != Pmode) +{ /* Take addw to avoid the sum truncate. */ + rtx simode_sum = gen_reg_rtx (SImode); + riscv_emit_binary (PLUS, simode_sum, x, y); + emit_move_insn (pmode_sum, gen_lowpart (Pmode, simode_sum)); +} + else +riscv_emit_binary (PLUS, pmode_sum, pmode_x, pmode_y); + + /* Step-1.1: truncate sum for HI and QI as we have no insn for add QI/HI.
[PATCH v1] RTL: Bugfix ICE after allow vector type in DSE
From: Pan Li

We previously allowed vector types in get_stored_val when the read is less than or equal to the store. Unfortunately, we missed adjusting the validate_subreg part accordingly. For vector types, we don't need to require that the mode size is not less than the vector register size.

For example, gen_lowpart from E_V2SFmode to E_V4QImode returns NULL_RTX (and of course an ICE follows) because the mode size is less than the vector register size. That also explains why gen_lowpart from E_V8SFmode to E_V16QImode is valid here.

This patch removes the restriction for vector modes, to get rid of the ICE that occurs in gen_lowpart when validate_subreg fails.

The tests below passed for this patch:

* The X86 bootstrap test.
* The full riscv regression tests.

gcc/ChangeLog:

	* emit-rtl.cc (validate_subreg): Bypass register size check if the mode is vector.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/ssa-fre-44.c: Add ftree-vectorize to trigger the ICE.
	* gcc.target/riscv/rvv/base/bug-6.c: New test.

Signed-off-by: Pan Li
---
 gcc/emit-rtl.cc                               |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c    |  2 +-
 .../gcc.target/riscv/rvv/base/bug-6.c         | 22 +++
 3 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 1856fa4884f..45c6301b487 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -934,7 +934,8 @@ validate_subreg (machine_mode omode, machine_mode imode,
     ;
   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
      is the culprit here, and not the backends.  */
-  else if (known_ge (osize, regsize) && known_ge (isize, osize))
+  else if (known_ge (isize, osize) && (known_ge (osize, regsize)
+    || (VECTOR_MODE_P (imode) || VECTOR_MODE_P (omode))))
     ;
   /* Allow component subregs of complex and vector.  Though given the below
      extraction rules, it's not always clear what that means.
*/ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c index f79b4c142ae..624a00a4f32 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O -fdump-tree-fre1" } */ +/* { dg-options "-O -fdump-tree-fre1 -O3 -ftree-vectorize" } */ struct A { float x, y; }; struct B { struct A u; }; diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c new file mode 100644 index 000..5bb00b8f587 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c @@ -0,0 +1,22 @@ +/* Test that we do not have ice when compile */ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */ + +struct A { float x, y; }; +struct B { struct A u; }; + +extern void bar (struct A *); + +float +f3 (struct B *x, int y) +{ + struct A p = {1.0f, 2.0f}; + struct A *q = &x[y].u; + + __builtin_memcpy (&q->x, &p.x, sizeof (float)); + __builtin_memcpy (&q->y, &p.y, sizeof (float)); + + bar (&p); + + return x[y].u.x + x[y].u.y; +} -- 2.34.1
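The size reasoning in the commit message above can be modelled with a small standalone sketch. This is not the GCC code: `toy_mode`, `old_check`, `new_check`, `check_toy_subreg`, and the 16-byte register size (a 128-bit vector register) are assumptions made for illustration only, and the real validate_subreg applies many more rules than this single condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a machine mode: byte size plus a vector flag.  */
struct toy_mode { unsigned size; bool is_vector; };

/* Assumed register size in bytes (a 128-bit vector register).  */
enum { TOY_REGSIZE = 16 };

/* Condition before the patch: the outer size must reach the register size.  */
static bool old_check (struct toy_mode outer, struct toy_mode inner)
{
  return outer.size >= TOY_REGSIZE && inner.size >= outer.size;
}

/* Condition after the patch: vector modes bypass the register size test.  */
static bool new_check (struct toy_mode outer, struct toy_mode inner)
{
  return inner.size >= outer.size
	 && (outer.size >= TOY_REGSIZE || inner.is_vector || outer.is_vector);
}

/* The two conversions named in the commit message.  */
void check_toy_subreg (void)
{
  struct toy_mode v2sf = { 8, true }, v4qi = { 4, true };
  struct toy_mode v8sf = { 32, true }, v16qi = { 16, true };

  assert (!old_check (v4qi, v2sf));  /* V2SF -> V4QI was rejected.  */
  assert (new_check (v4qi, v2sf));   /* ... and is accepted afterwards.  */
  assert (old_check (v16qi, v8sf));  /* V8SF -> V16QI was already fine.  */
  assert (new_check (v16qi, v8sf));
}
```

Under these assumptions the only pair whose answer changes is the one smaller than the register, which matches the description of the ICE.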
[PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val
From: Pan Li We previously allowed vector types in get_stored_val when the read size is less than or equal to the store size. Unfortunately, we missed adjusting the validate_subreg part accordingly. When the vector type's size is less than the vector register size, it is considered invalid by validate_subreg. Considering that validate_subreg is something of a can of worms and that we are in stage 4, we fix the issue on the DSE side and make sure the subreg is valid for both the read_mode and the store_mode before performing the actual gen_lowpart. The tests below passed for this patch: * The x86 bootstrap test. * The x86 regression test. * The riscv regression test. * The aarch64 regression test. gcc/ChangeLog: * dse.cc (get_stored_val): Add validate_subreg check before performing the gen_lowpart for rtl. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/ssa-fre-44.c: Add compile option to trigger the ICE. * gcc.target/riscv/rvv/base/bug-6.c: New test. Signed-off-by: Pan Li --- gcc/dse.cc | 4 +++- gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c | 2 +- .../gcc.target/riscv/rvv/base/bug-6.c | 22 +++ 3 files changed, 26 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c diff --git a/gcc/dse.cc b/gcc/dse.cc index edc7a1dfecf..1596da91da0 100644 --- a/gcc/dse.cc +++ b/gcc/dse.cc @@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode read_mode, copy_rtx (store_info->const_rhs)); else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode) && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode)) -&& targetm.modes_tieable_p (read_mode, store_mode)) +&& targetm.modes_tieable_p (read_mode, store_mode) +&& validate_subreg (read_mode, store_mode, copy_rtx (store_info->rhs), + subreg_lowpart_offset (read_mode, store_mode))) read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs)); else read_reg = extract_low_bits (read_mode, store_mode, diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c index f79b4c142ae..624a00a4f32 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O -fdump-tree-fre1" } */ +/* { dg-options "-O -fdump-tree-fre1 -O3 -ftree-vectorize" } */ struct A { float x, y; }; struct B { struct A u; }; diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c new file mode 100644 index 000..5bb00b8f587 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c @@ -0,0 +1,22 @@ +/* Test that we do not have ice when compile */ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */ + +struct A { float x, y; }; +struct B { struct A u; }; + +extern void bar (struct A *); + +float +f3 (struct B *x, int y) +{ + struct A p = {1.0f, 2.0f}; + struct A *q = &x[y].u; + + __builtin_memcpy (&q->x, &p.x, sizeof (float)); + __builtin_memcpy (&q->y, &p.y, sizeof (float)); + + bar (&p); + + return x[y].u.x + x[y].u.y; +} -- 2.34.1
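The new bug-6.c test is compile-only, so it checks for the absence of the ICE but not the computed value. A standalone variant that can also be run might look like the sketch below. The body of `bar` and the `check_f3` helper are hypothetical additions that are not part of the patch, and `memcpy` from `<string.h>` replaces `__builtin_memcpy` for portability.

```c
#include <assert.h>
#include <string.h>

struct A { float x, y; };
struct B { struct A u; };

/* Hypothetical stand-in for the extern bar () of the test; it only needs
   to take the address so that P is not optimized away entirely.  */
void bar (struct A *p) { (void) p; }

float
f3 (struct B *x, int y)
{
  struct A p = {1.0f, 2.0f};
  struct A *q = &x[y].u;

  /* Two adjacent float stores followed by reads of the same location.  */
  memcpy (&q->x, &p.x, sizeof (float));
  memcpy (&q->y, &p.y, sizeof (float));

  bar (&p);

  return x[y].u.x + x[y].u.y;
}

/* Hypothetical driver: the stored values must be read back unchanged.  */
void check_f3 (void)
{
  struct B arr[2] = { { { 0.0f, 0.0f } }, { { 0.0f, 0.0f } } };
  assert (f3 (arr, 1) == 3.0f);
  assert (arr[1].u.x == 1.0f && arr[1].u.y == 2.0f);
}
```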
[PATCH v2] RISC-V: Introduce gcc option mrvv-vector-bits for RVV
From: Pan Li This patch would like to introduce one new gcc option for RVV. To appoint the bits size of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are: * zvl The zvl will pick up the zvl*b from the march option. For example, the mrvv-vector-bits will be 1024 when march=rv64gcv_zvl1024b. The below test are passed for this patch. * The riscv fully regression test. gcc/ChangeLog: * config/riscv/riscv-opts.h (enum rvv_vector_bits_enum): New enum for different RVV vector bits. * config/riscv/riscv.cc (riscv_convert_vector_bits): New func to get the RVV vector bits, with given min_vlen. (riscv_convert_vector_chunks): Combine the mrvv-vector-bits option with min_vlen to RVV vector chunks. (riscv_override_options_internal): Update comments and rename the vector chunks. * config/riscv/riscv.opt: Add option mrvv-vector-bits. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/rvv-vector-bits-1.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-2.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-3.c: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv-opts.h | 7 + gcc/config/riscv/riscv.cc | 31 +++ gcc/config/riscv/riscv.opt| 11 +++ .../riscv/rvv/base/rvv-vector-bits-1.c| 7 + .../riscv/rvv/base/rvv-vector-bits-2.c| 7 + .../riscv/rvv/base/rvv-vector-bits-3.c| 25 +++ 6 files changed, 82 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-3.c diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h index 4edddbadc37..0162e00515b 100644 --- a/gcc/config/riscv/riscv-opts.h +++ b/gcc/config/riscv/riscv-opts.h @@ -129,6 +129,13 @@ enum vsetvl_strategy_enum { VSETVL_OPT_NO_FUSION, }; +/* RVV vector bits for option -mrvv-vector-bits + zvl indicates take the bits of zvl*b provided by march as vector bits. 
+ */ +enum rvv_vector_bits_enum { + RVV_VECTOR_BITS_ZVL, +}; + #define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && TARGET_64BIT)) /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 5e984ee2a55..d18e5226bce 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -8801,13 +8801,32 @@ riscv_init_machine_status (void) return ggc_cleared_alloc (); } -/* Return the VLEN value associated with -march. +static int +riscv_convert_vector_bits (int min_vlen) +{ + int rvv_bits = 0; + + switch (rvv_vector_bits) +{ + case RVV_VECTOR_BITS_ZVL: + rvv_bits = min_vlen; + break; + default: + gcc_unreachable (); +} + + return rvv_bits; +} + +/* Return the VLEN value associated with -march and -mwrvv-vector-bits. TODO: So far we only support length-agnostic value. */ static poly_uint16 -riscv_convert_vector_bits (struct gcc_options *opts) +riscv_convert_vector_chunks (struct gcc_options *opts) { int chunk_num; int min_vlen = TARGET_MIN_VLEN_OPTS (opts); + int rvv_bits = riscv_convert_vector_bits (min_vlen); + if (min_vlen > 32) { /* When targetting minimum VLEN > 32, we should use 64-bit chunk size. @@ -8826,7 +8845,7 @@ riscv_convert_vector_bits (struct gcc_options *opts) - TARGET_MIN_VLEN = 2048bit: [256,256] - TARGET_MIN_VLEN = 4096bit: [512,512] FIXME: We currently DON'T support TARGET_MIN_VLEN > 4096bit. 
*/ - chunk_num = min_vlen / 64; + chunk_num = rvv_bits / 64; } else { @@ -8848,7 +8867,7 @@ riscv_convert_vector_bits (struct gcc_options *opts) if (TARGET_VECTOR_OPTS_P (opts)) { if (opts->x_riscv_autovec_preference == RVV_FIXED_VLMAX) - return (int) min_vlen / (riscv_bytes_per_vector_chunk * 8); + return (int) rvv_bits / (riscv_bytes_per_vector_chunk * 8); else return poly_uint16 (chunk_num, chunk_num); } @@ -8920,8 +8939,8 @@ riscv_override_options_internal (struct gcc_options *opts) if (TARGET_VECTOR && TARGET_BIG_ENDIAN) sorry ("Current RISC-V GCC does not support RVV in big-endian mode"); - /* Convert -march to a chunks count. */ - riscv_vector_chunks = riscv_convert_vector_bits (opts); + /* Convert -march and -mrvv-vector-bits to a chunks count. */ + riscv_vector_chunks = riscv_convert_vector_chunks (opts); } /* Implement TARGET_OPTION_OVERRIDE. */ diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt index 20685c42aed..42ea8efd05d 100644 --- a/gcc/config/riscv/riscv.opt +++ b/gcc/config/riscv/riscv.opt @@ -607,3 +607,14 @@ Enum(stringop_strategy) String(vector) Value(ST
[PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV
From: Pan Li This patch introduces a new gcc option for RVV that specifies the bit size of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are: * scalable * zvl With scalable, the zvl*b in march is taken as the minimal vlen. For example, the minimal vlen will be 512 when march=rv64gcv_zvl512b and mrvv-vector-bits=scalable. With zvl, the zvl*b in march is taken as the exact vlen. For example, the vlen will be exactly 1024 when march=rv64gcv_zvl1024b and mrvv-vector-bits=zvl. Given the sample below: void test_rvv_vector_bits () { vint32m1_t x; asm volatile ("def %0": "=vr"(x)); asm volatile (""::: "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31"); asm volatile ("use %0": : "vr"(x)); } With -march=rv64gcv_zvl128b -mrvv-vector-bits=scalable we have (for min_vlen >= 128) csrr t0,vlenb sub sp,sp,t0 def v1 vs1r.v v1,0(sp) vl1re32.v v1,0(sp) use v1 csrr t0,vlenb add sp,sp,t0 jr ra With -march=rv64gcv_zvl128b -mrvv-vector-bits=zvl we have (for vlen = 128) addi sp,sp,-16 def v1 vs1r.v v1,0(sp) vl1re32.v v1,0(sp) use v1 addi sp,sp,16 jr ra The tests below passed for this patch. * The full riscv regression test. gcc/ChangeLog: * config/riscv/riscv-opts.h (enum rvv_vector_bits_enum): New enum for different RVV vector bits. * config/riscv/riscv.cc (riscv_convert_vector_bits): New func to get the RVV vector bits, with given min_vlen. (riscv_convert_vector_chunks): Combine the mrvv-vector-bits option with min_vlen to RVV vector chunks. (riscv_override_options_internal): Update comments and rename the vector chunks. * config/riscv/riscv.opt: Add option mrvv-vector-bits. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/rvv-vector-bits-1.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-2.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-3.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-4.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-5.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-6.c: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv-opts.h | 8 + gcc/config/riscv/riscv.cc | 35 +++ gcc/config/riscv/riscv.opt| 14 .../riscv/rvv/base/rvv-vector-bits-1.c| 7 .../riscv/rvv/base/rvv-vector-bits-2.c| 7 .../riscv/rvv/base/rvv-vector-bits-3.c| 9 + .../riscv/rvv/base/rvv-vector-bits-4.c| 9 + .../riscv/rvv/base/rvv-vector-bits-5.c| 17 + .../riscv/rvv/base/rvv-vector-bits-6.c| 17 + 9 files changed, 116 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-6.c diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h index 4edddbadc37..eefd2f9e01c 100644 --- a/gcc/config/riscv/riscv-opts.h +++ b/gcc/config/riscv/riscv-opts.h @@ -129,6 +129,14 @@ enum vsetvl_strategy_enum { VSETVL_OPT_NO_FUSION, }; +/* RVV vector bits for option -mrvv-vector-bits + zvl indicates take the bits of zvl*b provided by march as vector bits. 
+ */ +enum rvv_vector_bits_enum { + RVV_VECTOR_BITS_SCALABLE, + RVV_VECTOR_BITS_ZVL, +}; + #define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && TARGET_64BIT)) /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 5e984ee2a55..b6b133210ff 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -8801,13 +8801,33 @@ riscv_init_machine_status (void) return ggc_cleared_alloc (); } -/* Return the VLEN value associated with -march. +static int +riscv_convert_vector_bits (int min_vlen) +{ + int rvv_bits = 0; + + switch (rvv_vector_bits) +{ + case RVV_VECTOR_BITS_ZVL: + case RVV_VECTOR_BITS_SCALABLE: + rvv_bits = min_vlen; + break; + default: + gcc_unreachable (); +} + + return rvv_bits; +} + +/* Return the VLEN value associated with -march and -mwrvv-vector-bits. TODO: So far we only support length-agnostic value. */ static poly_uint16 -riscv_conv
[PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV
From: Pan Li This patch introduces a new gcc option for RVV that specifies the bit size of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are: * scalable * zvl With scalable, the zvl*b in march is taken as the minimal vlen. For example, the minimal vlen will be 512 when march=rv64gcv_zvl512b and mrvv-vector-bits=scalable. With zvl, the zvl*b in march is taken as the exact vlen. For example, the vlen will be exactly 1024 when march=rv64gcv_zvl1024b and mrvv-vector-bits=zvl. Given the sample below: void test_rvv_vector_bits () { vint32m1_t x; asm volatile ("def %0": "=vr"(x)); asm volatile (""::: "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31"); asm volatile ("use %0": : "vr"(x)); } With -march=rv64gcv_zvl128b -mrvv-vector-bits=scalable we have (for min_vlen >= 128) csrr t0,vlenb sub sp,sp,t0 def v1 vs1r.v v1,0(sp) vl1re32.v v1,0(sp) use v1 csrr t0,vlenb add sp,sp,t0 jr ra With -march=rv64gcv_zvl128b -mrvv-vector-bits=zvl we have (for vlen = 128) addi sp,sp,-16 def v1 vs1r.v v1,0(sp) vl1re32.v v1,0(sp) use v1 addi sp,sp,16 jr ra The tests below passed for this patch. * The full riscv regression test. gcc/ChangeLog: * config/riscv/riscv-opts.h (enum rvv_vector_bits_enum): New enum for different RVV vector bits. * config/riscv/riscv.cc (riscv_convert_vector_bits): New func to get the RVV vector bits, with given min_vlen. (riscv_convert_vector_chunks): Combine the mrvv-vector-bits option with min_vlen to RVV vector chunks. (riscv_override_options_internal): Update comments and rename the vector chunks. * config/riscv/riscv.opt: Add option mrvv-vector-bits. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/rvv-vector-bits-1.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-2.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-3.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-4.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-5.c: New test. * gcc.target/riscv/rvv/base/rvv-vector-bits-6.c: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv-opts.h | 8 + gcc/config/riscv/riscv.cc | 35 +++ gcc/config/riscv/riscv.opt| 14 .../riscv/rvv/base/rvv-vector-bits-1.c| 7 .../riscv/rvv/base/rvv-vector-bits-2.c| 7 .../riscv/rvv/base/rvv-vector-bits-3.c| 9 + .../riscv/rvv/base/rvv-vector-bits-4.c| 9 + .../riscv/rvv/base/rvv-vector-bits-5.c| 17 + .../riscv/rvv/base/rvv-vector-bits-6.c| 17 + 9 files changed, 116 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-6.c diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h index 4edddbadc37..2a311c9d2a3 100644 --- a/gcc/config/riscv/riscv-opts.h +++ b/gcc/config/riscv/riscv-opts.h @@ -129,6 +129,14 @@ enum vsetvl_strategy_enum { VSETVL_OPT_NO_FUSION, }; +/* RVV vector bits for option -mrvv-vector-bits, default is scalable. */ +enum rvv_vector_bits_enum { + /* scalable indicates taking the value of zvl*b as the minimal vlen. */ + RVV_VECTOR_BITS_SCALABLE, + /* zvl indicates taking the value of zvl*b as the exactly vlen. 
*/ + RVV_VECTOR_BITS_ZVL, +}; + #define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && TARGET_64BIT)) /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 5e984ee2a55..b6b133210ff 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -8801,13 +8801,33 @@ riscv_init_machine_status (void) return ggc_cleared_alloc (); } -/* Return the VLEN value associated with -march. +static int +riscv_convert_vector_bits (int min_vlen) +{ + int rvv_bits = 0; + + switch (rvv_vector_bits) +{ + case RVV_VECTOR_BITS_ZVL: + case RVV_VECTOR_BITS_SCALABLE: + rvv_bits = min_vlen; + break; + default: + gcc_unreachable (); +} + + return rvv_bits; +} + +/* Return the VLEN value associated with -march and -mwrvv-vector-bit
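As a rough illustration of how the option value feeds the chunk count, here is a minimal sketch outside of GCC. It assumes the option is passed as a parameter instead of being read from the `rvv_vector_bits` global, it covers only the min_vlen > 32 path where the chunk size is 64 bits (the division by 64 appears in the v2 posting of this patch), and the names `convert_vector_bits`, `chunk_count_64`, and `check_vector_bits` are hypothetical.

```c
#include <assert.h>

/* Mirrors the enum added by the patch.  */
enum rvv_vector_bits_enum {
  RVV_VECTOR_BITS_SCALABLE,
  RVV_VECTOR_BITS_ZVL,
};

/* Simplified model of riscv_convert_vector_bits.  In the hunk above both
   enumerators map to min_vlen; they differ in how the value is used later
   (minimal vlen versus exact vlen).  */
static int convert_vector_bits (enum rvv_vector_bits_enum opt, int min_vlen)
{
  switch (opt)
    {
    case RVV_VECTOR_BITS_ZVL:
    case RVV_VECTOR_BITS_SCALABLE:
      return min_vlen;
    }
  return -1; /* The patch calls gcc_unreachable () here.  */
}

/* Chunk count for the min_vlen > 32 path, where one chunk is 64 bits.  */
static int chunk_count_64 (int rvv_bits)
{
  return rvv_bits / 64;
}

void check_vector_bits (void)
{
  /* march=rv64gcv_zvl1024b with mrvv-vector-bits=zvl.  */
  assert (convert_vector_bits (RVV_VECTOR_BITS_ZVL, 1024) == 1024);
  /* march=rv64gcv_zvl512b with mrvv-vector-bits=scalable.  */
  assert (chunk_count_64 (convert_vector_bits (RVV_VECTOR_BITS_SCALABLE, 512)) == 8);
}
```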
[PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion
From: Pan Li insert_var_expansion_initialization relies on HONOR_SIGNED_ZEROS and initializes the expansion variables to +0.0f instead of -0.0f when -fno-signed-zeros is given. However, we should always keep -0.0f here because: * -0.0f is always the correct initial value. * We need to support targets that always honor signed zeros. Thus, we use MODE_HAS_SIGNED_ZEROS instead of HONOR_SIGNED_ZEROS for the initialization. The target/backend can then decide whether to honor no-signed-zeros. The tests below passed for this patch: * The riscv regression tests. * The aarch64 regression tests. * The x86 bootstrap and regression tests. gcc/ChangeLog: * loop-unroll.cc (insert_var_expansion_initialization): Leverage MODE_HAS_SIGNED_ZEROS for expansion variable initialization. gcc/testsuite/ChangeLog: * gcc.dg/pr30957-1.c: Adjust test cases for different scenarios. Signed-off-by: Pan Li --- gcc/loop-unroll.cc | 4 +-- gcc/testsuite/gcc.dg/pr30957-1.c | 48 2 files changed, 44 insertions(+), 8 deletions(-) diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc index 4176a21e308..bfdfe6c2bb7 100644 --- a/gcc/loop-unroll.cc +++ b/gcc/loop-unroll.cc @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, rtx var, zero_init; unsigned i; machine_mode mode = GET_MODE (ve->reg); - bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode); + bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode); if (ve->var_expansions.length () == 0) return; @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, case MINUS: FOR_EACH_VEC_ELT (ve->var_expansions, i, var) { - if (honor_signed_zero_p) + if (has_signed_zero_p) zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode); else zero_init = CONST0_RTX (mode); diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c index 564410913ab..6a9d3d87932 100644 --- a/gcc/testsuite/gcc.dg/pr30957-1.c +++ b/gcc/testsuite/gcc.dg/pr30957-1.c @@ -20,16 +20,52
@@ foo (float d, int n) return accum; } +float __attribute__((noinline)) +get_minus_zero() +{ + return 0.0 / -5.0; +} + int main () { - /* When compiling standard compliant we expect foo to return -0.0. But the - variable expansion during unrolling optimization (for this testcase enabled - by non-compliant -fassociative-math) instantiates copy(s) of the - accumulator which it initializes with +0.0. Hence we expect that foo - returns +0.0. */ - if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) + /* The variable expansion in unroll requires option unsafe-math-optimizations + (aka -fno-signed-zeros, -fno-trapping-math, -fassociative-math + and -freciprocal-math). + + When loop like above will have expansion after unrolling as below: + + accum_1 += d_1; + accum_2 += d_2; + accum_3 += d_3; + ... + + The accum_1, accum_2 and accum_3 need to be initialized. Given the + floating-point we have + +0.0f + -0.0f = +0.0f. + + Thus, we should initialize the accum_* to -0.0 for correctness. But + the things become more complicated when no-signed-zeros, as well as VLA + vectorizer mode which doesn't trigger variable expansion. Then we have: + + Case 1: Trigger variable expansion but target doesn't honor no-signed-zero. + minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f. + Case 2: Trigger variable expansion but target does honor no-signed-zero. + minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f. + Case 3: No variable expansion but target doesn't honor no-signed-zero. + minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f. + Case 4: No variable expansion but target does honor no-signed-zero. + minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f. + + The test case covers above 4 cases for running. + */ + float minus_zero = get_minus_zero (); + float a = __builtin_copysignf (1.0, minus_zero); + float b = __builtin_copysignf (1.0, foo (minus_zero, 10)); + + if (a != b) abort (); + exit (0); } -- 2.34.1
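The reason -0.0f is the safe initial value for the extra accumulators can be checked in plain C, without the unroller. The sketch below hand-writes a two-accumulator version of the foo () loop from the test. `sum_expanded` and `check_signed_zero_init` are hypothetical names, and the code assumes IEEE 754 arithmetic in the default rounding mode with no fast-math flags.

```c
#include <assert.h>
#include <math.h>

/* Hand-expanded version of foo (): ACC0 is the original accumulator and
   starts at D, ACC1 is the extra copy created by variable expansion and
   starts at INIT.  */
static float sum_expanded (float d, int n, float init)
{
  float acc0 = d;
  float acc1 = init;

  for (int i = 0; i < n; i += 2)
    {
      acc0 += d;
      if (i + 1 < n)
	acc1 += d;
    }

  return acc0 + acc1;
}

void check_signed_zero_init (void)
{
  /* -0.0f + -0.0f stays -0.0f, so the sign of the input is preserved.  */
  assert (signbit (sum_expanded (-0.0f, 10, -0.0f)));
  /* +0.0f + -0.0f is +0.0f, so a +0.0f start value loses the sign.  */
  assert (!signbit (sum_expanded (-0.0f, 10, 0.0f)));
}
```

For inputs other than zero the start value makes no difference, since adding either zero leaves a nonzero sum unchanged.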
[PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion
From: Pan Li insert_var_expansion_initialization relies on HONOR_SIGNED_ZEROS and initializes the expansion variables to +0.0f instead of -0.0f when -fno-signed-zeros is given. However, we should always keep -0.0f here because: * -0.0f is always the correct initial value. * We need to support targets that always honor signed zeros. Thus, we use MODE_HAS_SIGNED_ZEROS instead of HONOR_SIGNED_ZEROS for the initialization. The target/backend can then decide whether to honor no-signed-zeros. We also remove the testcase pr30957-1.c, because the sign of its return value is undefined. The tests below passed for this patch: * The riscv regression tests. * The aarch64 regression tests. * The x86 bootstrap and regression tests. gcc/ChangeLog: * loop-unroll.cc (insert_var_expansion_initialization): Leverage MODE_HAS_SIGNED_ZEROS for expansion variable initialization. gcc/testsuite/ChangeLog: * gcc.dg/pr30957-1.c: Remove. Signed-off-by: Pan Li --- gcc/loop-unroll.cc | 4 ++-- gcc/testsuite/gcc.dg/pr30957-1.c | 36 2 files changed, 2 insertions(+), 38 deletions(-) delete mode 100644 gcc/testsuite/gcc.dg/pr30957-1.c diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc index 4176a21e308..bfdfe6c2bb7 100644 --- a/gcc/loop-unroll.cc +++ b/gcc/loop-unroll.cc @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, rtx var, zero_init; unsigned i; machine_mode mode = GET_MODE (ve->reg); - bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode); + bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode); if (ve->var_expansions.length () == 0) return; @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, case MINUS: FOR_EACH_VEC_ELT (ve->var_expansions, i, var) { - if (honor_signed_zero_p) + if (has_signed_zero_p) zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode); else zero_init = CONST0_RTX (mode); diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c
b/gcc/testsuite/gcc.dg/pr30957-1.c deleted file mode 100644 index 564410913ab..000 --- a/gcc/testsuite/gcc.dg/pr30957-1.c +++ /dev/null @@ -1,36 +0,0 @@ -/* { dg-do run { xfail { mmix-*-* } } } */ -/* We don't (and don't want to) perform this optimisation on soft-float targets, - where each addition is a library call. / -/* { dg-require-effective-target hard_float } */ -/* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */ -/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */ - -extern void abort (void); -extern void exit (int); - -float __attribute__((noinline)) -foo (float d, int n) -{ - unsigned i; - float accum = d; - - for (i = 0; i < n; i++) -accum += d; - - return accum; -} - -int -main () -{ - /* When compiling standard compliant we expect foo to return -0.0. But the - variable expansion during unrolling optimization (for this testcase enabled - by non-compliant -fassociative-math) instantiates copy(s) of the - accumulator which it initializes with +0.0. Hence we expect that foo - returns +0.0. */ - if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) -abort (); - exit (0); -} - -/* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */ -- 2.34.1
[PATCH v1] RISC-V: Update the comments of riscv_v_ext_mode_p [NFC]
From: Pan Li

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_v_ext_mode_p): Update the comments of
	predicate func riscv_v_ext_mode_p.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv.cc | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index df9799d9c5e..f829014a589 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1361,7 +1361,10 @@ riscv_v_ext_vls_mode_p (machine_mode mode)
   return false;
 }
 
-/* Return true if it is either RVV vector mode or RVV tuple mode. */
+/* Return true if it is either of below modes.
+   1. RVV vector mode.
+   2. RVV tuple mode.
+   3. RVV vls mode. */
 static bool
 riscv_v_ext_mode_p (machine_mode mode)
-- 
2.34.1
[PATCH v1] RISC-V: Fix asm checks regression due to recent middle-end change
From: Pan Li

A recent middle-end change results in some asm check failures. This patch
fixes the asm checks by adjusting the expected match counts.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/shift-1.c: Fix asm check count.
	* gcc.target/riscv/rvv/autovec/vls/shift-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/shift-3.c: Ditto.

Signed-off-by: Pan Li
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-1.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-2.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-1.c
index e57a0b6bdf3..cb5a1dbc9ff 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-1.c
@@ -53,5 +53,5 @@ DEF_OP_VV (shift, 128, int64_t, >>)
 DEF_OP_VV (shift, 256, int64_t, >>)
 DEF_OP_VV (shift, 512, int64_t, >>)
 
-/* { dg-final { scan-assembler-times {vsra\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 39 } } */
+/* { dg-final { scan-assembler-times {vsra\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 42 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-2.c
index 9d1fa64232c..e626a52c2d8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-2.c
@@ -53,5 +53,5 @@ DEF_OP_VV (shift, 128, uint64_t, >>)
 DEF_OP_VV (shift, 256, uint64_t, >>)
 DEF_OP_VV (shift, 512, uint64_t, >>)
 
-/* { dg-final { scan-assembler-times {vsrl\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 39 } } */
+/* { dg-final { scan-assembler-times {vsrl\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 42 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
index 8de1b9c0c41..244bee02e55 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
@@ -53,5 +53,5 @@ DEF_OP_VV (shift, 128, int64_t, <<)
 DEF_OP_VV (shift, 256, int64_t, <<)
 DEF_OP_VV (shift, 512, int64_t, <<)
 
-/* { dg-final { scan-assembler-times {vsll\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 46 } } */
+/* { dg-final { scan-assembler-times {vsll\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 47 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
-- 
2.34.1
[PATCH v1] RISC-V: Bugfix for vls integer mode calling convention
From: Pan Li According to the issue below. https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/416 When the mode size of a vls integer mode is no greater than 2 * XLEN, we pass both the args and the return values in gpr/fpr instead of by reference. For example, given the code below: typedef short v8hi __attribute__ ((vector_size (16))); v8hi __attribute__((noinline)) add (v8hi a, v8hi b) { v8hi r = a + b; return r; } Before this patch: add: vsetivli zero,8,e16,m1,ta,ma vle16.v v1,0(a1) <== arg by reference vle16.v v2,0(a2) <== arg by reference vadd.vv v1,v1,v2 vse16.v v1,0(a0) <== return by reference ret After this patch: add: addi sp,sp,-32 sd a0,0(sp) <== arg by register a0 - a3 sd a1,8(sp) sd a2,16(sp) sd a3,24(sp) addi a5,sp,16 vsetivli zero,8,e16,m1,ta,ma vle16.v v2,0(sp) vle16.v v1,0(a5) vadd.vv v1,v1,v2 vse16.v v1,0(sp) ld a0,0(sp) <== return by a0 - a1. ld a1,8(sp) addi sp,sp,32 jr ra For vls floating point, things get more complicated. We follow the rules below. 1. Vls element count <= 2 and vls size <= 2 * xlen, go fpr. 2. Vls size <= 2 * xlen, go gpr. 3. Vls size > 2 * xlen, go reference. One exception is V2DF mode: we treat the vls mode as an aggregate and would have TFmode here. Unfortunately, emit_move_multi_word cannot take care of TFmode elegantly, so we go to gpr for V2DF mode. The riscv regression passed for this patch. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_ext_vector_or_tuple_mode_p): New predicate function for vector or tuple vector. (riscv_v_vls_mode_aggregate_reg_count): New function to calculate the gpr/fpr count required by vls mode. (riscv_gpr_unit_size): New function to get gpr in bytes. (riscv_fpr_unit_size): New function to get fpr in bytes. (riscv_v_vls_to_gpr_mode): New function convert vls mode to gpr mode. (riscv_v_vls_to_fpr_mode): New function convert vls mode to fpr mode. (riscv_pass_vls_aggregate_in_gpr_or_fpr): New function to return the rtx of gpr/fpr for vls mode.
(riscv_mode_pass_by_reference_p): New predicate function to indicate the mode will be passed by reference or not. (riscv_get_arg_info): Add vls mode handling. (riscv_pass_by_reference): Return false if arg info has no zero gpr count. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/def.h: Add helper marcos. * gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-10.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-8.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-9.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-2.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-3.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-4.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-5.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-6.c: New test. 
Signed-off-by: Pan Li --- gcc/config/riscv/riscv.cc | 185 +- .../rvv/autovec/vls/calling-convention-1.c| 154 +++ .../rvv/autovec/vls/calling-convention-10.c | 51 + .../rvv/autovec/vls/calling-convention-2.c| 142 ++ .../rvv/autovec/vls/calling-convention-3.c| 130 .../rvv/autovec/vls/calling-convention-4.c| 118 +++ .../rvv/autovec/vls/calling-convention-5.c| 141 + .../rvv/autovec/vls/calling-convention-6.c| 129 .../rvv/autovec/vls/calling-convention-7.c| 120 .../rvv/autovec/vls/calling-convention-8.c| 43 .../rvv/autovec/vls/calling-convention-9.c| 51 + .../autovec/vls/calling-convention-run-1.c| 55 ++ .../autovec/vls/calling-convention-run-2.c| 55 ++ .../autovec/vls/calling-convention-run-3.c| 55 ++ .../autovec/vls/calling-convention-run-4.c| 55 ++ .../autovec/vls/calling-convention-run-5.c| 55 ++ .../autovec/vls/calling-convention-run-6.c| 55 ++ .../gcc.target/riscv/rvv/autovec/vls/def.h| 74 +++ 18 files changed, 1665 insertions(+), 3 de
[PATCH v2] RISC-V: Bugfix for vls mode aggregated in GPR calling convention
From: Pan Li

According to the issue below:

https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/416

When the size of a vls integer mode is no larger than 2 * XLEN, we pass
both the args and the return values in gpr instead of by reference.
For example, given the code below:

typedef short v8hi __attribute__ ((vector_size (16)));

v8hi __attribute__((noinline))
add (v8hi a, v8hi b)
{
  v8hi r = a + b;
  return r;
}

Before this patch:
add:
  vsetivli zero,8,e16,m1,ta,ma
  vle16.v  v1,0(a1)   <== arg by reference
  vle16.v  v2,0(a2)   <== arg by reference
  vadd.vv  v1,v1,v2
  vse16.v  v1,0(a0)   <== return by reference
  ret

After this patch:
add:
  addi     sp,sp,-32
  sd       a0,0(sp)   <== arg by register a0 - a3
  sd       a1,8(sp)
  sd       a2,16(sp)
  sd       a3,24(sp)
  addi     a5,sp,16
  vsetivli zero,8,e16,m1,ta,ma
  vle16.v  v2,0(sp)
  vle16.v  v1,0(a5)
  vadd.vv  v1,v1,v2
  vse16.v  v1,0(sp)
  ld       a0,0(sp)   <== return by a0 - a1.
  ld       a1,8(sp)
  addi     sp,sp,32
  jr       ra

For vls floating point, we follow the same rules as integer: the value
is passed in gpr or by reference.

The above code could be simplified with vmv to avoid reading and writing
the values through the stack. We will prepare another patch for that, as
it is outside the scope of this bugfix.

The riscv regression passed for this patch.

gcc/ChangeLog:

  * config/riscv/riscv.cc (riscv_v_vls_mode_aggregate_gpr_count): New
  function to calculate the gpr count required by vls mode.
  (riscv_v_vls_to_gpr_mode): New function to convert vls mode to gpr mode.
  (riscv_pass_vls_aggregate_in_gpr): New function to return the rtx of
  gpr for vls mode.
  (riscv_get_arg_info): Add vls mode handling.
  (riscv_pass_by_reference): Return false if arg info has no zero gpr
  count.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/vls/def.h: Add new helper macro.
  * gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/calling-convention-10.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c: New test.
  * gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-8.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-9.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-2.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-3.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-4.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-5.c: New test. * gcc.target/riscv/rvv/autovec/vls/calling-convention-run-6.c: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv.cc | 75 + .../rvv/autovec/vls/calling-convention-1.c| 154 ++ .../rvv/autovec/vls/calling-convention-10.c | 51 ++ .../rvv/autovec/vls/calling-convention-2.c| 142 .../rvv/autovec/vls/calling-convention-3.c| 130 +++ .../rvv/autovec/vls/calling-convention-4.c| 118 ++ .../rvv/autovec/vls/calling-convention-5.c| 141 .../rvv/autovec/vls/calling-convention-6.c| 129 +++ .../rvv/autovec/vls/calling-convention-7.c| 118 ++ .../rvv/autovec/vls/calling-convention-8.c| 43 + .../rvv/autovec/vls/calling-convention-9.c| 51 ++ .../autovec/vls/calling-convention-run-1.c| 55 +++ .../autovec/vls/calling-convention-run-2.c| 55 +++ .../autovec/vls/calling-convention-run-3.c| 55 +++ .../autovec/vls/calling-convention-run-4.c| 55 +++ .../autovec/vls/calling-convention-run-5.c| 55 +++ .../autovec/vls/calling-convention-run-6.c| 55 +++ .../gcc.target/riscv/rvv/autovec/vls/def.h| 74 + 18 files changed, 1556 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-10.c create mode 
100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autove
[PATCH v1] RISC-V: Cleanup the comments for the psabi
From: Pan Li

This patch would like to cleanup some comments which are out of date or
incorrect.

gcc/ChangeLog:

  * config/riscv/riscv.cc (riscv_get_arg_info): Cleanup comments.
  (riscv_pass_by_reference): Ditto.
  (riscv_fntype_abi): Ditto.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv.cc | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 529ef5e84b7..7713ad26c8d 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5067,8 +5067,7 @@ riscv_get_arg_info (struct riscv_arg_info *info, const CUMULATIVE_ARGS *cum,
   info->gpr_offset = cum->num_gprs;
   info->fpr_offset = cum->num_fprs;
-  /* When disable vector_abi or scalable vector argument is anonymous, this
-     argument is passed by reference. */
+  /* Passed by reference when the scalable vector argument is anonymous. */
   if (riscv_v_ext_mode_p (mode) && !named)
     return NULL_RTX;
@@ -5265,8 +5264,9 @@ riscv_pass_by_reference (cumulative_args_t cum_v, const function_arg_info &arg)
      so we can avoid the call to riscv_get_arg_info in this case. */
   if (cum != NULL)
     {
-      /* Don't pass by reference if we can use a floating-point register. */
       riscv_get_arg_info (&info, cum, arg.mode, arg.type, arg.named, false);
+
+      /* Don't pass by reference if we can use a floating-point register. */
       if (info.num_fprs)
         return false;
@@ -5279,9 +5279,9 @@ riscv_pass_by_reference (cumulative_args_t cum_v, const function_arg_info &arg)
         return false;
     }
-  /* When vector abi disabled(without --param=riscv-vector-abi option) or
-     scalable vector argument is anonymous or cannot be passed through vector
-     registers, this argument is passed by reference. */
+  /* Passed by reference when:
+     1. The scalable vector argument is anonymous.
+     2. Args cannot be passed through vector registers. */
   if (riscv_v_ext_mode_p (arg.mode))
     return true;
@@ -5392,12 +5392,9 @@ riscv_arguments_is_vector_type_p (const_tree fntype)
 static const predefined_function_abi &
 riscv_fntype_abi (const_tree fntype)
 {
-  /* Implementing an experimental vector calling convention, the proposal
-     can be viewed at the bellow link:
-     https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/389
-
-     You can enable this feature via the `--param=riscv-vector-abi` compiler
-     option. */
+  /* Implement the vector calling convention. For more details please
+     reference the below link.
+     https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/389 */
   if (riscv_return_value_is_vector_type_p (fntype)
       || riscv_arguments_is_vector_type_p (fntype))
     return riscv_v_abi ();
--
2.34.1
[PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988
From: Pan Li Refine the test cases for: * Name convention. * Add run case. PR target/112929 PR target/112988 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr112929.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr112929-1.c: ...here. * gcc.target/riscv/rvv/vsetvl/pr112988.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr112988-1.c: ...here. * gcc.target/riscv/rvv/vsetvl/pr112929-2.c: New test. * gcc.target/riscv/rvv/vsetvl/pr112988-2.c: New test. Signed-off-by: Pan Li --- .../rvv/vsetvl/{pr112929.c => pr112929-1.c} | 0 .../gcc.target/riscv/rvv/vsetvl/pr112929-2.c | 57 +++ .../rvv/vsetvl/{pr112988.c => pr112988-1.c} | 0 .../gcc.target/riscv/rvv/vsetvl/pr112988-2.c | 53 + 4 files changed, 110 insertions(+) rename gcc/testsuite/gcc.target/riscv/rvv/vsetvl/{pr112929.c => pr112929-1.c} (100%) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-2.c rename gcc/testsuite/gcc.target/riscv/rvv/vsetvl/{pr112988.c => pr112988-1.c} (100%) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-2.c diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-1.c similarity index 100% rename from gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929.c rename to gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-1.c diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-2.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-2.c new file mode 100644 index 000..f2022026639 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-2.c @@ -0,0 +1,57 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -O3 -fno-vect-cost-model" } */ + +int printf(char *, ...); +int a, l, i, p, q, t, n, o; +int *volatile c; +static int j; +static struct pack_1_struct d; +long e; +char m = 5; +short s; + +#pragma pack(1) +struct pack_1_struct { + long c; + int d; + int e; + int f; + int g; + int h; + int i; +} h, r = {1}, *f = &h, *volatile g; + +void 
add_em_up(int count, ...) { + __builtin_va_list ap; + __builtin_va_start(ap, count); + __builtin_va_end(ap); +} + +int main() { + int u; + j = 0; + + for (; j < 9; ++j) { +u = ++t ? a : 0; +if (u) { + int *v = &d.d; + *v = g || e; + *c = 0; + *f = h; +} +s = l && c; +o = i; +d.f || (p = 0); +q |= n; + } + + r = *f; + + add_em_up(1, 1); + printf("%d\n", m); + + if (m != 5) +__builtin_abort (); + + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-1.c similarity index 100% rename from gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988.c rename to gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-1.c diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-2.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-2.c new file mode 100644 index 000..e952b85b630 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-2.c @@ -0,0 +1,53 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -O3 -fno-vect-cost-model" } */ + +int a = 0; +int p, q, r, x = 230; +short d; +int e[256]; +static struct f w; +int *c = &r; + +short y(short z) { + return z * d; +} + +#pragma pack(1) +struct f { + int g; + short h; + int j; + char k; + char l; + long m; + long n; + int o; +} s = {1}, v, t, *u = &v, *b = &s; + +void add_em_up(int count, ...) { + __builtin_va_list ap; + __builtin_va_start(ap, count); + __builtin_va_end(ap); +} + +int main() { + int i = 0; + for (; i < 256; i++) +e[i] = i; + + p = 0; + for (; p <= 0; p++) { +*c = 4; +*u = t; +x |= y(6 >= q); + } + + *b = w; + + add_em_up(1, 1); + + if (a != 0 || q != 0 || p != 1 || r != 4 || x != 0xE6 || d != 0) +__builtin_abort (); + + return 0; +} -- 2.34.1
[PATCH v1] RISC-V: Fix POLY INT handle bug
From: Pan Li

This patch fixes the following FAIL:

Running target riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
FAIL: gcc.dg/vect/fast-math-vect-complex-3.c execution test

The root cause is that we generate incorrect code for
(const_poly_int:DI [549755813888, 549755813888]).

Before this patch:

  li       a7,0
  vmv.v.x  v0,a7

After this patch:

  csrr     a2,vlenb
  slli     a2,a2,33
  vmv.v.x  v0,a2

gcc/ChangeLog:

  * config/riscv/riscv.cc (riscv_expand_mult_with_const_int): Change
  int into HOST_WIDE_INT.
  (riscv_legitimize_poly_move): Ditto.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/bug-3.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv.cc | 10 +++--
 .../gcc.target/riscv/rvv/autovec/bug-3.c | 39 +++
 2 files changed, 45 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index f60726711e8..3fef1ab1514 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2371,7 +2371,7 @@ riscv_expand_op (enum rtx_code code, machine_mode mode, rtx op0, rtx op1,
 static void
 riscv_expand_mult_with_const_int (machine_mode mode, rtx dest, rtx multiplicand,
-                                  int multiplier)
+                                  HOST_WIDE_INT multiplier)
 {
   if (multiplier == 0)
     {
@@ -2380,7 +2380,7 @@ riscv_expand_mult_with_const_int (machine_mode mode, rtx dest, rtx multiplicand,
     }
   bool neg_p = multiplier < 0;
-  int multiplier_abs = abs (multiplier);
+  unsigned HOST_WIDE_INT multiplier_abs = abs (multiplier);
   if (multiplier_abs == 1)
     {
@@ -2475,8 +2475,10 @@ void
 riscv_legitimize_poly_move (machine_mode mode, rtx dest, rtx tmp, rtx src)
 {
   poly_int64 value = rtx_to_poly_int64 (src);
-  int offset = value.coeffs[0];
-  int factor = value.coeffs[1];
+  /* It use HOST_WIDE_INT intead of int since 32bit type is not enough
+     for e.g. (const_poly_int:DI [549755813888, 549755813888]). */
+  HOST_WIDE_INT offset = value.coeffs[0];
+  HOST_WIDE_INT factor = value.coeffs[1];
   int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1];
   int div_factor = 0;
   /* Calculate (const_poly_int:MODE [m, n]) using scalar instructions.
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c
new file mode 100644
index 000..643e91b918e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=scalable -fno-vect-cost-model -O2 -ffast-math" } */
+
+#define N 16
+
+_Complex float a[N] =
+{ 10.0F + 20.0iF, 11.0F + 21.0iF, 12.0F + 22.0iF, 13.0F + 23.0iF,
+  14.0F + 24.0iF, 15.0F + 25.0iF, 16.0F + 26.0iF, 17.0F + 27.0iF,
+  18.0F + 28.0iF, 19.0F + 29.0iF, 20.0F + 30.0iF, 21.0F + 31.0iF,
+  22.0F + 32.0iF, 23.0F + 33.0iF, 24.0F + 34.0iF, 25.0F + 35.0iF };
+_Complex float b[N] =
+{ 30.0F + 40.0iF, 31.0F + 41.0iF, 32.0F + 42.0iF, 33.0F + 43.0iF,
+  34.0F + 44.0iF, 35.0F + 45.0iF, 36.0F + 46.0iF, 37.0F + 47.0iF,
+  38.0F + 48.0iF, 39.0F + 49.0iF, 40.0F + 50.0iF, 41.0F + 51.0iF,
+  42.0F + 52.0iF, 43.0F + 53.0iF, 44.0F + 54.0iF, 45.0F + 55.0iF };
+
+_Complex float c[N];
+_Complex float res[N] =
+{ -500.0F + 1000.0iF, -520.0F + 1102.0iF,
+  -540.0F + 1208.0iF, -560.0F + 1318.0iF,
+  -580.0F + 1432.0iF, -600.0F + 1550.0iF,
+  -620.0F + 1672.0iF, -640.0F + 1798.0iF,
+  -660.0F + 1928.0iF, -680.0F + 2062.0iF,
+  -700.0F + 2200.0iF, -720.0F + 2342.0iF,
+  -740.0F + 2488.0iF, -760.0F + 2638.0iF,
+  -780.0F + 2792.0iF, -800.0F + 2950.0iF };
+
+
+void
+foo (void)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    c[i] = a[i] * b[i];
+}
+
+/* { dg-final { scan-assembler-not {li\s+[a-x0-9]+,\s*0} } } */
+/* { dg-final { scan-assembler-times {slli\s+[a-x0-9]+,\s*[a-x0-9]+,\s*33} 1 } } */
--
2.34.1
[PATCH v1] RISC-V: Bugfix for the RVV const vector
From: Pan Li

This patch fixes a bug in the const vector code gen for the interleave
case. Assume we need to generate an interleaved const vector like below.

  V = {4, -4, 3, -3, 2, -2, 1, -1, ...}

Before this patch:

  vsetvli  a3, zero, e64, m8, ta, ma
  vid.v    v8               v8 = {0, 1, 2, 3, 4}
  li       a6, -1
  vmul.vx  v8, v8, a6       v8 = {-0, -1, -2, -3, -4}
  vadd.vi  v24, v8, 4       v24 = { 4, 3, 2, 1, 0}
  vadd.vi  v8, v8, -4       v8 = {-4, -5, -6, -7, -8}
  li       a6, 32
  vsll.vx  v8, v8, a6       v8 = {0, -4, 0, -5, 0, -6, 0, -7,} for e32
  vor.vv   v24, v24, v8     v24 = {4, -4, 3, -5, 2, -6, 1, -7,} for e32

After this patch:

  vsetvli  a6,zero,e64,m8,ta,ma
  vid.v    v8               v8 = {0, 1, 2, 3, 4}
  li       a7,-1
  vmul.vx  v16,v8,a7        v16 = {-0, -1, -2, -3, -4}
  vadd.vi  v16,v16,4        v16 = { 4, 3, 2, 1, 0}
  vadd.vi  v8,v8,-4         v8 = {-4, -3, -2, -1, 0}
  li       a7,32
  vsll.vx  v8,v8,a7         v8 = {0, -4, 0, -3, 0, -2,} for e32
  vor.vv   v16,v16,v8       v16 = {4, -4, 3, -3, 2, -2,} for e32

gcc/ChangeLog:

  * config/riscv/riscv-v.cc (expand_const_vector): Take step2 instead
  of step1 for second series.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/const-vector-0.c: New test.
Signed-off-by: Pan Li --- gcc/config/riscv/riscv-v.cc | 2 +- .../riscv/rvv/autovec/const-vector-0.c| 39 +++ 2 files changed, 40 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index eade8db4cf1..d1eb7a0a9a5 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -1331,7 +1331,7 @@ expand_const_vector (rtx target, rtx src) rtx tmp2 = gen_reg_rtx (new_mode); base2 = gen_int_mode (rtx_to_poly_int64 (base2), new_smode); expand_vec_series (tmp2, base2, -gen_int_mode (step1, new_smode)); +gen_int_mode (step2, new_smode)); rtx shifted_tmp2 = expand_simple_binop ( new_mode, ASHIFT, tmp2, gen_int_mode (builder.inner_bits_size (), Pmode), NULL_RTX, diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c new file mode 100644 index 000..4f83121c663 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c @@ -0,0 +1,39 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d --param=riscv-autovec-lmul=m8 -ftree-vectorize -fno-vect-cost-model -O3 -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#define N 4 +struct C { int r, i; }; + +/* +** init_struct_data: +** ... +** vsetivli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m8,\s*ta,\s*ma +** vid\.v\s+v8 +** li\s+[atx][0-9]+,\s*-1 +** vmul\.vx\s+v16,\s*v8,\s*[atx][0-9]+ +** vadd\.vi\s+v16,\s*v16,\s*4 +** vadd\.vi\s+v8,\s*v8,\s*-4 +** li\s+[axt][0-9]+,32 +** vsll\.vx\s+v8,\s*v8,\s*[atx][0-9]+ +** vor\.vv\s+v16,\s*v16,\s*v8 +** ... +*/ +void +init_struct_data (struct C * __restrict a, struct C * __restrict b, + struct C * __restrict c) +{ + int i; + + for (i = 0; i < N; ++i) +{ + a[i].r = N - i; + a[i].i = i - N; + + b[i].r = i - N; + b[i].i = i + N; + + c[i].r = -1 - i; + c[i].i = 2 * N - 1 - i; +} +} -- 2.34.1
[PATCH v2] RISC-V: Bugfix for the RVV const vector
From: Pan Li

This patch fixes a bug in the const vector code gen for the interleave
case. Assume we need to generate an interleaved const vector like below.

  V = {4, -4, 3, -3, 2, -2, 1, -1, ...}

Before this patch:

  vsetvli  a3, zero, e64, m8, ta, ma
  vid.v    v8               v8 = {0, 1, 2, 3, 4}
  li       a6, -1
  vmul.vx  v8, v8, a6       v8 = {-0, -1, -2, -3, -4}
  vadd.vi  v24, v8, 4       v24 = { 4, 3, 2, 1, 0}
  vadd.vi  v8, v8, -4       v8 = {-4, -5, -6, -7, -8}
  li       a6, 32
  vsll.vx  v8, v8, a6       v8 = {0, -4, 0, -5, 0, -6, 0, -7,} for e32
  vor.vv   v24, v24, v8     v24 = {4, -4, 3, -5, 2, -6, 1, -7,} for e32

After this patch:

  vsetvli  a6,zero,e64,m8,ta,ma
  vid.v    v8               v8 = {0, 1, 2, 3, 4}
  li       a7,-1
  vmul.vx  v16,v8,a7        v16 = {-0, -1, -2, -3, -4}
  vadd.vi  v16,v16,4        v16 = { 4, 3, 2, 1, 0}
  vadd.vi  v8,v8,-4         v8 = {-4, -3, -2, -1, 0}
  li       a7,32
  vsll.vx  v8,v8,a7         v8 = {0, -4, 0, -3, 0, -2,} for e32
  vor.vv   v16,v16,v8       v16 = {4, -4, 3, -3, 2, -2,} for e32

It is not easy to add an asm check that is stable enough for this case,
because we would need to check that the source of the vadd.vi with -4 is
the vid.v output, which is up to 4 instructions earlier. Thus there is
no test here; the case will be covered by gcc.dg/vect/pr92420.c in the
subsequent patches.

gcc/ChangeLog:

  * config/riscv/riscv-v.cc (expand_const_vector): Take step2 instead
  of step1 for second series.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-v.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index eade8db4cf1..d1eb7a0a9a5 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1331,7 +1331,7 @@ expand_const_vector (rtx target, rtx src)
   rtx tmp2 = gen_reg_rtx (new_mode);
   base2 = gen_int_mode (rtx_to_poly_int64 (base2), new_smode);
   expand_vec_series (tmp2, base2,
-                     gen_int_mode (step1, new_smode));
+                     gen_int_mode (step2, new_smode));
   rtx shifted_tmp2 = expand_simple_binop (
     new_mode, ASHIFT, tmp2,
     gen_int_mode (builder.inner_bits_size (), Pmode), NULL_RTX,
--
2.34.1
[PATCH v1] RISC-V: Bugfix for the const vector in single steps
From: Pan Li

When generating a const vector with a single step, the current code gen
works as below, given npatterns = 4.

  v1        = {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, ...}
  v2 (diff) = {3 - 0, 2 - 1, 1 - 2, 0 - 3, 7 - 4, 6 - 5, 5 - 6, 4 - 7, ...}
            = {3, 1, -1, -3, 3, 1, -1, -3, ...}
  v1        = v2 + vid.

But this requires the diff to repeat with a period of npatterns, like
{3, 1, -1, -3} above. It cannot take care of a single step sequence
such as:

  { -4, 4, -4 + 1, 4 + 1, -4 + 2, 4 + 2, -4 + 3, 4 + 3, ... }

This patch adds that restriction to the above code gen and implements
another one for the general case.

gcc/ChangeLog:

  * config/riscv/riscv-v.cc (expand_const_vector): Add restriction for
  the vid-diff code gen and implement general one.

gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rvv/autovec/bug-7.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-v.cc | 73 +++
 .../gcc.target/riscv/rvv/autovec/bug-7.c | 61
 2 files changed, 119 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-7.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 486f5deb296..946588b7b1f 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1257,24 +1257,67 @@ expand_const_vector (rtx target, rtx src)
   else
     {
       /* Generate the variable-length vector following this rule:
-        { a, b, a, b, a + step, b + step, a + step*2, b + step*2, ...}
-           E.g. { 3, 2, 1, 0, 7, 6, 5, 4, ... } */
-      /* Step 2: Generate diff = TARGET - VID:
-        { 3-0, 2-1, 1-2, 0-3, 7-4, 6-5, 5-6, 4-7, ... }*/
+         { a, b, a + step, b + step, a + step*2, b + step*2, ... } */
       rvv_builder v (builder.mode (), builder.npatterns (), 1);
-      for (unsigned int i = 0; i < v.npatterns (); ++i)
+      poly_int64 ele_0 = rtx_to_poly_int64 (builder.elt (0));
+      poly_int64 ele_n
+        = rtx_to_poly_int64 (builder.elt (v.npatterns ()));
+
+      if (known_eq (ele_0 - 0, ele_n - v.npatterns ()))
+        {
+          /* Case 1: For example as below:
+             {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8... }
+             We have 3 - 0 = 3 equals 7 - 4 = 3, the sequence is
+             repeated as below after minus vid.
+             {3, 1, -1, -3, 3, 1, -1, -3...}
+             Then we can simplify the diff code gen to at most
+             npatterns(). */
+
+          /* Step 1: Generate diff = TARGET - VID. */
+          for (unsigned int i = 0; i < v.npatterns (); ++i)
+            {
+              poly_int64 diff = rtx_to_poly_int64 (builder.elt (i)) - i;
+              v.quick_push (gen_int_mode (diff, v.inner_mode ()));
+            }
+
+          /* Step 2: Generate result = VID + diff. */
+          rtx vec = v.build ();
+          rtx add_ops[] = {target, vid, vec};
+          emit_vlmax_insn (code_for_pred (PLUS, builder.mode ()),
+                           BINARY_OP, add_ops);
+        }
+      else
         {
-          /* Calculate the diff between the target sequence and
-             vid sequence. The elt (i) can be either const_int or
-             const_poly_int. */
-          poly_int64 diff = rtx_to_poly_int64 (builder.elt (i)) - i;
-          v.quick_push (gen_int_mode (diff, v.inner_mode ()));
+          /* Case 2: For example as below:
+             { -4, 4, -4 + 1, 4 + 1, -4 + 2, 4 + 2, -4 + 3, 4 + 3, ... }
+           */
+
+          /* Step 1: Generate { a, b, a, b, ... } */
+          for (unsigned int i = 0; i < v.npatterns (); ++i)
+            v.quick_push (builder.elt (i));
+          rtx new_base = v.build ();
+
+          /* Step 2: Generate tmp = VID >> LOG2 (NPATTERNS). */
+          rtx shift_count
+            = gen_int_mode (exact_log2 (builder.npatterns ()),
+                            builder.inner_mode ());
+          rtx tmp = expand_simple_binop (builder.mode (), LSHIFTRT,
+                                         vid, shift_count, NULL_RTX,
+                                         false, OPTAB_DIRECT);
+
+          /* Step 3: Generate tmp2 = tmp * step. */
+          rtx tmp2 = gen_reg_rtx (builder.mode ());
+          rtx step
+            = simplify_binary_operation (MINUS, builder.inner_mode (),
+                                         builder.elt (v.npatterns()),
+                                         builder.elt (0));
+          expand_vec_series (tmp2, const0_rtx, step, tmp);
+
+          /* Step 4: Generate target = tmp2 + new_base. */
+          rtx
[PATCH v2] RISC-V: Bugfix for the const vector in single steps
From: Pan Li

This patch fixes the execution failure below.

FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test

The test contains a single step const vector like
{ -4, 4, -3, 5, -2, 6, -1, 7, ...}. For such a single step const vector,
we generate vid + diff. For example, given npatterns = 4:

  v1        = {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, ...}
  v2 (diff) = {3 - 0, 2 - 1, 1 - 2, 0 - 3, 7 - 4, 6 - 5, 5 - 6, 4 - 7, ...}
            = {3, 1, -1, -3, 3, 1, -1, -3, ...}
  v1        = v2 + vid.

Unfortunately, that does not work for { -4, 4, -3, 5, -2, 6, -1, 7, ...}
because it has one implicit requirement on the diff: the diff sequence
must repeat with a period of npatterns, as v2 (diff) does above. The
diff between { -4, 4, -3, 5, -2, 6, -1, 7, ...} and vid does not repeat
that way, so we generate wrong code here.

We implement a new code gen for sequences like
{ -4, 4, -3, 5, -2, 6, -1, 7, ...}.

The tests below passed for this patch.

* The RV64 regression test with rv64gcv configuration.
* The run test gcc.dg/vect/pr92420.c for below configurations.
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1 riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2 riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4 riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8 riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1 riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2 riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4 riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8 riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2 riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4 riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8 riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Add restriction for the vid-diff code gen and implement general one. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/bug-7.c: New test. Signed-off-by: Pan Li --- gcc/config/riscv/riscv-v.cc | 84 +++ .../gcc.target/riscv/rvv/autovec/bug-7.c | 61 ++ 2 files changed, 130 insertions(+), 15 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-7.c diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index 486f5deb296..5a5899e85ae 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -1257,24 +1257,78 @@ expand_const_vector (rtx target, rtx src) else { /* Generate the variable-length vector following this rule: -{ a, b, a, b, a + step, b + step, a + step*2, b + step*2, ...} - E.g. { 3, 2, 1, 0, 7, 6, 5, 4, ... } */ - /* Step 2: Generate diff = TARGET - VID: -{ 3-0, 2-1, 1-2, 0-3, 7-4, 6-5, 5-6, 4-7, ... }*/ + { a, b, a + st
[PATCH v3] RISC-V: Bugfix for the const vector in single steps
From: Pan Li

This patch fixes the execution failure below when built with
"-march=rv64gcv_zvl512b -mabi=lp64d -mcmodel=medlow
--param=riscv-autovec-lmul=m8 -ftree-vectorize -fno-vect-cost-model -O3".

FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test

The test contains a single step const vector like
{ -4, 4, -3, 5, -2, 6, -1, 7, ...}. For such a single step const vector,
we generate vid + diff. For example, given npatterns = 4:

  v1        = {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, ...}
  v2 (diff) = {3 - 0, 2 - 1, 1 - 2, 0 - 3, 7 - 4, 6 - 5, 5 - 6, 4 - 7, ...}
            = {3, 1, -1, -3, 3, 1, -1, -3, ...}
  v1        = v2 + vid.

Unfortunately, that does not work for { -4, 4, -3, 5, -2, 6, -1, 7, ...}
because it has one implicit requirement on the diff: the diff sequence
must repeat with a period of npatterns, as v2 (diff) does above. The
diff between { -4, 4, -3, 5, -2, 6, -1, 7, ...} and vid does not repeat
that way, so we generate wrong code here.

We implement a new code gen for sequences like
{ -4, 4, -3, 5, -2, 6, -1, 7, ...}.

The tests below passed for this patch.

* The RV64 regression test with rv64gcv configuration.
* The run test gcc.dg/vect/pr92420.c for below configurations.
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax

gcc/ChangeLog:

        * config/riscv/riscv-v.cc (rvv_builder::npatterns_vid_diff_repeated_p):
        New function to predicate the diff to vid is repeated or not.
        (expand_const_vector): Add restriction for the vid-diff code gen and
        implement general one.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/bug-7.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-v.cc | 111 +++---
 .../gcc.target/riscv/rvv/autovec/bug-7.c | 61 ++
 2 files changed, 156 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-7.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 486f5deb296..3b9be255799 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -433,6 +433,7 @@ public:
   bool single_step_npatterns_p () const;
   bool npatterns_all_equal_p () const;
   bool interleaved_stepped_npatterns_p () const;
+  bool npatterns_vid_diff_repeated_p
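
The "diff repeats every npatterns elements" requirement from the log
above can be sketched in plain C. This is only an illustration, not the
code in riscv-v.cc: the helper name vid_diff_repeated_p and the use of
plain long arrays instead of rtx elements are assumptions made for the
example, as is npatterns = 2 for the failing vector mentioned below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not the GCC implementation): return true when
   the sequence elts[i] - i repeats with period NPATTERNS, which is the
   implicit requirement of the "vid + diff" expansion.  */
static bool
vid_diff_repeated_p (const long *elts, size_t n, size_t npatterns)
{
  if (npatterns == 0)
    return false;

  for (size_t i = npatterns; i < n; i++)
    {
      /* Diff to vid at position i.  */
      long diff = elts[i] - (long) i;
      /* Diff to vid at the matching position in the first period.  */
      size_t j = i % npatterns;
      long first = elts[j] - (long) j;

      if (diff != first)
        return false;
    }

  return true;
}
```

Under these assumptions, {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8} with
npatterns = 4 satisfies the check, while { -4, 4, -3, 5, -2, 6, -1, 7 }
with npatterns = 2 does not, so the latter would need the general code
gen.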
[PATCH v1] RISC-V: XFail the signbit-5 run test for RVV
From: Pan Li

This patch XFAILs the signbit-5 run test case for RVV. The test file
states one limitation at its beginning: "This test does not work when
the truth type does not match vector type." The RVV vector truth type
is not an integer type, so the limitation applies.

A riscv-sim target board like the one below picks up `-march=rv64gcv`
when building the run test elf. Thus RVV cannot bypass this test case
the way aarch64_sve does with the additional option `-march=armv8-a`.

  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow

For RVV, we leverage dg-xfail-run-if for this case, as `amdgcn` does.

signbit-5.c passes testing with the configurations below.

* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zv
[PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
From: Pan Li

This patch XFAILs the test case pr30957-1.c for RVV when the elf is
built with some configurations (listed at the end of the log). The loop
is vectorized during vect_transform_loop with a variable factor. Such a
loop does not benefit from unrolling/peeling, so loop->unroll is set to
1, and unroll_loops then does nothing for it.

After this patch, loops vectorized with a variable factor on RVV are
treated as XFAIL by the tree dump. The configurations below are treated
as XFAIL; the failures of other configurations still need further
investigation.

* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax

gcc/testsuite/ChangeLog:

        * gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with
[PATCH v2] RISC-V: XFail the signbit-5 run test for RVV
From: Pan Li

This patch XFAILs the signbit-5 run test case for RVV. The test file
states one limitation at its beginning: "This test does not work when
the truth type does not match vector type." The RVV vector truth type
is not an integer type, so the limitation applies.

A riscv-sim target board like the one below picks up `-march=rv64gcv`
when building the run test elf. Thus RVV cannot bypass this test case
the way aarch64_sve does with the additional option `-march=armv8-a`.

  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow

For RVV, we leverage dg-xfail-run-if for this case, as `amdgcn` does.

signbit-5.c passes testing with the configurations below, but the
failures of other configurations still need further investigation.

* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-auto
[PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
From: Pan Li

This patch XFAILs the test case pr30957-1.c for RVV when the elf is
built with some configurations (listed at the end of the log). The loop
is vectorized during vect_transform_loop with a variable factor. Such a
loop does not benefit from unrolling/peeling, so loop->unroll is set to
1, and unroll_loops then does nothing for it.

aarch64_sve may have a similar issue, but it initializes the const
`0.0 / -5.0` in the test file to `+0.0` before passing it to the
function foo, so the execution test passes.

aarch64:
  movi v0.2s, #0x0
  stp x29, x30, [sp, #-16]!
  mov w0, #0xa
  mov x29, sp
  bl 400280 <== s0 is +0.0

Unfortunately, riscv initializes the const `0.0 / -5.0` to `-0.0` and
then passes it to the function foo, so the execution test fails.

riscv:
  flw fa0,388(gp) # 1299c <__SDATA_BEGIN__+0x4>
  addi sp,sp,-16
  li a0,10
  sd ra,8(sp)
  jal 101fc <== fa0 is -0.0

After this patch, loops vectorized with a variable factor on RVV are
treated as XFAIL by the tree dump when riscv_v and variable_vect_length
hold.

The configurations below are validated as XFAIL for RV64.
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax

gcc/testsuite/ChangeLog:

        * gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with
        variable length.

Signed-off-by: Pan Li
---
 gcc/testsuite/gcc.dg/pr30957-1.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
index 564410913ab..7a7242ec16d 100644
--- a/gcc/testsuite/gcc.dg/pr30957-1.c
+++ b/gcc/testsuite/gcc.dg/pr30957-1.c
@@ -3,7 +3,7 @@
    where each addition is a library call.  */
 /* { dg-require-effective-target hard_float } */
 /* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */
-/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */
+/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll -fdump-tree-vect-details" } */
 extern void abort (void);
 extern void exit (int);
@@ -34,3 +34,4 @@ main ()
 }
 /* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */
+/* { dg-f
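
For readers unfamiliar with the `0.0 / -5.0` detail in the log above,
the small C sketch below shows why the sign of that zero is observable.
It assumes IEEE 754 arithmetic and a build without -fno-signed-zeros or
-ffast-math; the helper names are hypothetical and are not taken from
pr30957-1.c. With -fno-signed-zeros (as the test uses), a compiler is
allowed to produce either +0.0 or -0.0 here, which is consistent with
the aarch64 and riscv outputs differing.

```c
#include <math.h>
#include <stdbool.h>

/* Return true when X is a zero of either sign.  */
static bool
is_zero (double x)
{
  return x == 0.0;
}

/* Return true only when X is negative zero.  */
static bool
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}

/* Compute 0.0 / -5.0 at run time.  The volatile operands are only
   there to keep the division from being folded at compile time.  */
static double
zero_div_neg_five (void)
{
  volatile double num = 0.0;
  volatile double den = -5.0;
  return num / den;
}
```

Both zeros compare equal with `==`, so code that only compares against
0.0 cannot tell them apart; only sign-sensitive operations such as
signbit can.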
[PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option
From: Pan Li

According to the semantics of the no-signed-zeros option, a backend like
RISC-V should treat the minus zero -0.0f as plus zero 0.0f.

Consider the example below with the option -fno-signed-zeros.

  void
  test (float *a)
  {
    *a = -0.0;
  }

We generate the code below, which doesn't treat the minus zero as plus
zero.

  test:
    lui a5,%hi(.LC0)
    flw fa5,%lo(.LC0)(a5)
    fsw fa5,0(a0)
    ret

  .LC0:
    .word -2147483648 // aka -0.0 (0x80000000 in hex)

This patch fixes the bug and treats the minus zero -0.0 as plus zero,
aka +0.0. After this patch we have the asm code below for the above
sample code.

  test:
    sw zero,0(a0)
    ret

This patch also fixes the run failure of the test case pr30957-1.c.

The tests below pass with this patch.

* The riscv regression tests.
* The pr30957-1.c run tests.

gcc/ChangeLog:

        * config/riscv/constraints.md: Leverage func
        riscv_float_const_zero_rtx_p for predicating the rtx is const
        zero float or not.
        * config/riscv/predicates.md: Ditto.
        * config/riscv/riscv.cc (riscv_const_insns): Ditto.
        (riscv_float_const_zero_rtx_p): New func impl for predicating
        the rtx is const zero float or not.
        (riscv_const_zero_rtx_p): New func impl for predicating the rtx
        is const zero (both int and fp) or not.
        * config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p):
        New func decl.
        (riscv_const_zero_rtx_p): Ditto.
        * config/riscv/riscv.md: Making sure the operand[1] of movfp is
        CONST0_RTX when the operand[1] is const zero float.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/no-signed-zeros-0.c: New test.
        * gcc.target/riscv/no-signed-zeros-1.c: New test.
        * gcc.target/riscv/no-signed-zeros-2.c: New test.
        * gcc.target/riscv/no-signed-zeros-3.c: New test.
        * gcc.target/riscv/no-signed-zeros-4.c: New test.
        * gcc.target/riscv/no-signed-zeros-5.c: New test.
        * gcc.target/riscv/no-signed-zeros-run-0.c: New test.
        * gcc.target/riscv/no-signed-zeros-run-1.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/constraints.md | 2 +-
 gcc/config/riscv/predicates.md | 2 +-
 gcc/config/riscv/riscv-protos.h | 2 +
 gcc/config/riscv/riscv.cc | 35 -
 gcc/config/riscv/riscv.md | 49 ---
 .../gcc.target/riscv/no-signed-zeros-0.c | 26 ++
 .../gcc.target/riscv/no-signed-zeros-1.c | 28 +++
 .../gcc.target/riscv/no-signed-zeros-2.c | 26 ++
 .../gcc.target/riscv/no-signed-zeros-3.c | 28 +++
 .../gcc.target/riscv/no-signed-zeros-4.c | 26 ++
 .../gcc.target/riscv/no-signed-zeros-5.c | 28 +++
 .../gcc.target/riscv/no-signed-zeros-run-0.c | 36 ++
 .../gcc.target/riscv/no-signed-zeros-run-1.c | 36 ++
 13 files changed, 314 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index de4359af00d..db1d5e1385f 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -108,7 +108,7 @@ (define_constraint "DnS"
 (define_constraint "G"
   "@internal"
   (and (match_code "const_double")
-       (match_test "op == CONST0_RTX (mode)")))
+       (match_test "riscv_float_const_zero_rtx_p (op)")))
 (define_memory_constraint "A"
   "An address that is held in a general-purpose register."
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index b87a6900841..b428d842101 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -78,7 +78,7 @@ (define_predicate "sleu_operand"
 (define_predicate "const_0_operand"
   (and (match_code "const_int,const_wide_int,const_double,const_vector")
-       (match_test "op == CONST0_RTX (GET_MODE (op))")))
+       (match_test "riscv_const_zero_rtx_p (op)")))
 (define_predicate "const_1_operand"
   (and (match_code "const_int,const_wide_int,const_vector")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 31049ef7523..fcf30e084a3 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *);
 extern bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int
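
To make the bit-level picture in the log above concrete, here is a
minimal C sketch of a zero check that optionally ignores the sign bit.
It is only an analogue of the idea behind the new predicates, not the
GCC code: float_bits, float_const_zero_p and the honor_signed_zeros
parameter are hypothetical names, and the sketch assumes a 32-bit IEEE
754 float.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of F (assumes IEEE 754 binary32).  */
static uint32_t
float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* Hypothetical analogue of a "const zero float" predicate.  When
   HONOR_SIGNED_ZEROS is false (think -fno-signed-zeros), -0.0f is
   accepted as zero too; otherwise only +0.0f matches.  */
static bool
float_const_zero_p (float f, bool honor_signed_zeros)
{
  uint32_t bits = float_bits (f);

  if (honor_signed_zeros)
    return bits == 0;

  /* Mask off the sign bit before comparing against zero.  */
  return (bits & 0x7fffffffu) == 0;
}
```

The pattern of -0.0f is 0x80000000, which is the same 32 bits that the
`.word -2147483648` directive in the log emits.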
[PATCH v2] RISC-V: Bugfix for legitimize move when get vec mode in zve32f
From: Pan Li

If we want to extract a 64bit value but ELEN < 64, we use an RVV vector
mode with EEW = 32 to extract the highpart and lowpart. However, this
approach doesn't handle DFmode in the movdf pattern, which results in
an ICE with zve32f.

This patch reuses the approach with some additional handling: because
taking the lowpart bits is meaningless for an FP mode, we need one int
reg as a bridge. For example:

  rtx tmp = gen_reg_rtx (DImode)
  reg:DI = reg:DF (fmv.x.d) // Move DF reg to DI
  ... perform the extract for high and low parts ...
  reg:DF = reg:DI (fmv.d.x) // Move DI reg back to DF after all done

        PR target/112743

gcc/ChangeLog:

        * config/riscv/riscv.cc (riscv_legitimize_move): Take the
        exist (U *mode) and handle DFmode like DImode when EEW is
        32bits like ZVE32F.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv.cc | 67 +--
 .../gcc.target/riscv/rvv/base/pr112743-2.c| 52 ++
 2 files changed, 99 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..996347ee3fd 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2605,41 +2605,68 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
       unsigned int nunits = vmode_size > mode_size ? vmode_size / mode_size : 1;
       scalar_mode smode = as_a <scalar_mode> (mode);
       unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
-      unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+      unsigned int num = (smode == DImode || smode == DFmode)
+        && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+      bool need_int_reg_p = false;
       if (num == 2)
         {
           /* If we want to extract 64bit value but ELEN < 64,
              we use RVV vector mode with EEW = 32 to extract
              the highpart and lowpart.  */
+          need_int_reg_p = smode == DFmode;
           smode = SImode;
           nunits = nunits * 2;
         }
-      vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-      rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-      for (unsigned int i = 0; i < num; i++)
+      opt_machine_mode opt_mode = riscv_vector::get_vector_mode (smode, nunits);
+
+      if (opt_mode.exists (&vmode))
         {
-          rtx result;
-          if (num == 1)
-            result = dest;
-          else if (i == 0)
-            result = gen_lowpart (smode, dest);
-          else
-            result = gen_reg_rtx (smode);
-          riscv_vector::emit_vec_extract (result, v, index + i);
+          rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+          rtx int_reg = dest;
-          if (i == 1)
+          if (need_int_reg_p)
             {
-              rtx tmp
-                = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
-                                gen_int_mode (32, Pmode), NULL_RTX, 0,
-                                OPTAB_DIRECT);
-              rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-                                       OPTAB_DIRECT);
-              emit_move_insn (dest, tmp2);
+              int_reg = gen_reg_rtx (DImode);
+              emit_insn (
+                gen_movdi (int_reg, gen_lowpart (GET_MODE (int_reg), dest)));
+            }
+
+          for (unsigned int i = 0; i < num; i++)
+            {
+              rtx result;
+              if (num == 1)
+                result = int_reg;
+              else if (i == 0)
+                result = gen_lowpart (smode, int_reg);
+              else
+                result = gen_reg_rtx (smode);
+
+              riscv_vector::emit_vec_extract (result, v, index + i);
+
+              if (i == 1)
+                {
+                  rtx tmp = expand_binop (Pmode, ashl_optab,
+                                          gen_lowpart (Pmode, result),
+                                          gen_int_mode (32, Pmode), NULL_RTX, 0,
+                                          OPTAB_DIRECT);
+                  rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, int_reg,
+                                           NULL_RTX, 0,
+                                           OPTAB_DIRECT);
+                  emit_move_insn (int_reg, tmp2);
+                }
             }
+
+          if (need_int_reg_p)
+            emit_insn (
+              gen_movdf (dest, gen_lowpart (GET_MODE (dest), int_reg)));
+          else
+            emit_move_insn (dest, int_reg);
         }
+      else
+        gcc_unreachable ();
+
       return true;
     }
   /* Expand
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test
[PATCH v3] RISC-V: Bugfix for legitimize move when get vec mode in zve32f
From: Pan Li

If we want to extract a 64bit value but ELEN < 64, we use an RVV vector
mode with EEW = 32 to extract the highpart and lowpart. However, this
approach doesn't handle DFmode in the movdf pattern, which results in
an ICE with zve32f.

This patch reuses the approach with some additional handling: because
taking the lowpart bits is meaningless for an FP mode, we need one int
reg as a bridge. For example:

  rtx tmp = gen_reg_rtx (DImode)
  reg:DI = reg:DF (fmv.x.d) // Move DF reg to DI
  ... perform the extract for high and low parts ...
  reg:DF = reg:DI (fmv.d.x) // Move DI reg back to DF after all done

        PR target/112743

gcc/ChangeLog:

        * config/riscv/riscv.cc (riscv_legitimize_move): Take the
        exist (U *mode) and handle DFmode like DImode when EEW is
        32bits for ZVE32F.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv.cc | 63 +--
 .../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
 2 files changed, 95 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..2fbaaf01078 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2605,41 +2605,64 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
       unsigned int nunits = vmode_size > mode_size ? vmode_size / mode_size : 1;
       scalar_mode smode = as_a <scalar_mode> (mode);
       unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
-      unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+      unsigned int num = (smode == DImode || smode == DFmode)
+        && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+      bool need_int_reg_p = false;
       if (num == 2)
         {
           /* If we want to extract 64bit value but ELEN < 64,
              we use RVV vector mode with EEW = 32 to extract
              the highpart and lowpart.  */
+          need_int_reg_p = smode == DFmode;
           smode = SImode;
           nunits = nunits * 2;
         }
-      vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-      rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-      for (unsigned int i = 0; i < num; i++)
+      if (riscv_vector::get_vector_mode (smode, nunits).exists (&vmode))
         {
-          rtx result;
-          if (num == 1)
-            result = dest;
-          else if (i == 0)
-            result = gen_lowpart (smode, dest);
-          else
-            result = gen_reg_rtx (smode);
-          riscv_vector::emit_vec_extract (result, v, index + i);
+          rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+          rtx int_reg = dest;
-          if (i == 1)
+          if (need_int_reg_p)
             {
-              rtx tmp
-                = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
-                                gen_int_mode (32, Pmode), NULL_RTX, 0,
-                                OPTAB_DIRECT);
-              rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-                                       OPTAB_DIRECT);
-              emit_move_insn (dest, tmp2);
+              int_reg = gen_reg_rtx (DImode);
+              emit_move_insn (int_reg, gen_lowpart (GET_MODE (int_reg), dest));
             }
+
+          for (unsigned int i = 0; i < num; i++)
+            {
+              rtx result;
+              if (num == 1)
+                result = int_reg;
+              else if (i == 0)
+                result = gen_lowpart (smode, int_reg);
+              else
+                result = gen_reg_rtx (smode);
+
+              riscv_vector::emit_vec_extract (result, v, index + i);
+
+              if (i == 1)
+                {
+                  rtx tmp = expand_binop (Pmode, ashl_optab,
+                                          gen_lowpart (Pmode, result),
+                                          gen_int_mode (32, Pmode), NULL_RTX, 0,
+                                          OPTAB_DIRECT);
+                  rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, int_reg,
+                                           NULL_RTX, 0,
+                                           OPTAB_DIRECT);
+                  emit_move_insn (int_reg, tmp2);
+                }
+            }
+
+          if (need_int_reg_p)
+            emit_move_insn (dest, gen_lowpart (GET_MODE (dest), int_reg));
+          else
+            emit_move_insn (dest, int_reg);
         }
+      else
+        gcc_unreachable ();
+
       return true;
     }
   /* Expand
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64
[PATCH v4] RISC-V: Bugfix for legitimize move when get vec mode in zve32f
From: Pan Li

If we want to extract a 64-bit value but ELEN < 64, we use an RVV vector
mode with EEW = 32 to extract the highpart and lowpart. However, this
approach does not honor DFmode in the movdf pattern for ZVE32F, and
therefore results in an ICE with zve32f.

This patch reuses the approach with some additional handling. Because the
lowpart bits are meaningless for an FP mode, we need one integer register
as a bridge here. For example:

  rtx tmp = gen_reg_rtx (DImode)
  reg:DI = reg:DF (fmv.x.d) // Move DF reg to DI
  ... perform the extract for high and low parts ...
  reg:DF = reg:DI (fmv.d.x) // Move DI reg back to DF after all done

	PR target/112743

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_legitimize_move): Use exists
	(U *mode) and handle DFmode like DImode when EEW is 32 bits
	for ZVE32F.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv.cc | 63 +--
 .../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
 2 files changed, 95 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..84512dcdc68 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2605,41 +2605,64 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
       unsigned int nunits = vmode_size > mode_size ? vmode_size / mode_size : 1;
       scalar_mode smode = as_a (mode);
       unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
-      unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+      unsigned int num = known_eq (GET_MODE_SIZE (smode), 8)
+        && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+      bool need_int_reg_p = false;

       if (num == 2)
         {
           /* If we want to extract 64bit value but ELEN < 64, we use RVV
              vector mode with EEW = 32 to extract the highpart and lowpart.
*/ + need_int_reg_p = smode == DFmode; smode = SImode; nunits = nunits * 2; } - vmode = riscv_vector::get_vector_mode (smode, nunits).require (); - rtx v = gen_lowpart (vmode, SUBREG_REG (src)); - for (unsigned int i = 0; i < num; i++) + if (riscv_vector::get_vector_mode (smode, nunits).exists (&vmode)) { - rtx result; - if (num == 1) - result = dest; - else if (i == 0) - result = gen_lowpart (smode, dest); - else - result = gen_reg_rtx (smode); - riscv_vector::emit_vec_extract (result, v, index + i); + rtx v = gen_lowpart (vmode, SUBREG_REG (src)); + rtx int_reg = dest; - if (i == 1) + if (need_int_reg_p) { - rtx tmp - = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result), - gen_int_mode (32, Pmode), NULL_RTX, 0, - OPTAB_DIRECT); - rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0, - OPTAB_DIRECT); - emit_move_insn (dest, tmp2); + int_reg = gen_reg_rtx (DImode); + emit_move_insn (int_reg, gen_lowpart (GET_MODE (int_reg), dest)); } + + for (unsigned int i = 0; i < num; i++) + { + rtx result; + if (num == 1) + result = int_reg; + else if (i == 0) + result = gen_lowpart (smode, int_reg); + else + result = gen_reg_rtx (smode); + + riscv_vector::emit_vec_extract (result, v, index + i); + + if (i == 1) + { + rtx tmp = expand_binop (Pmode, ashl_optab, + gen_lowpart (Pmode, result), + gen_int_mode (32, Pmode), NULL_RTX, 0, + OPTAB_DIRECT); + rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, int_reg, + NULL_RTX, 0, + OPTAB_DIRECT); + emit_move_insn (int_reg, tmp2); + } + } + + if (need_int_reg_p) + emit_move_insn (dest, gen_lowpart (GET_MODE (dest), int_reg)); + else + emit_move_insn (dest, int_reg); } + else + gcc_unreachable (); + return true; } /* Expand diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c new file mode 100644 index 000..fdb35fd70f2 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c @@ -0,0 +1,52 @@ +/* Test that we do not have ice when 
compile */ +/* { dg-do compile } */ +/* { dg-options "-march=rv64g
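The integer-bridge idea from the commit message above can be sketched in plain scalar C. This is only an illustration under the assumption that double is IEEE binary64; the function names are hypothetical and are not part of GCC. The two 32-bit halves are first combined in a 64-bit integer with a shift and an OR, and only afterwards reinterpreted as a double, which mirrors the DImode bridge used for DFmode.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: combine two 32-bit halves into a 64-bit integer,
   the same shift-and-or step the patch performs on the integer bridge.  */
static uint64_t combine_halves (uint32_t lo, uint32_t hi)
{
  return ((uint64_t) hi << 32) | (uint64_t) lo;
}

/* Hypothetical helper: reinterpret the combined bits as a double.
   Assumes double is 64 bits wide (IEEE binary64).  The memcpy plays the
   role of the final integer-to-FP register move.  */
static double combine_halves_to_double (uint32_t lo, uint32_t hi)
{
  uint64_t bits = combine_halves (lo, hi);
  double result;
  memcpy (&result, &bits, sizeof result);
  return result;
}
```

Doing the shift and OR directly on a floating-point value would not be meaningful, which is why the halves are merged in an integer first.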
[PATCH v1] RISC-V: Add test case for bug PR112813
From: Pan Li

The bugzilla 112813 has been fixed recently, add below test case for the
bug.

	PR target/112813

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/vsetvl/pr112813-1.c: New test.

Signed-off-by: Pan Li
---
 .../gcc.target/riscv/rvv/vsetvl/pr112813-1.c | 32 +++
 1 file changed, 32 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112813-1.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112813-1.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112813-1.c
new file mode 100644
index 000..5aab9c2bf09
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112813-1.c
@@ -0,0 +1,32 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv_zvl256b -mabi=ilp32d -O3" } */
+
+int a, c, d, f, j;
+int b[7];
+long e;
+char *g;
+int *h;
+long long *i;
+
+void k() {
+  int l[][1] = {{}, {1}, {1}};
+  int *m = &d, *n = &l[0][0];
+
+  for (; e;)
+    {
+      f = 3;
+
+      for (; f >= 0; f--)
+        {
+          *m &= b[f] >= 0;
+          j = a >= 2 ? 0 : 1 >> a;
+          *i |= j;
+        }
+
+      for (; c;)
+        *g = 0;
+    }
+
+  h = n;
+}
-- 
2.34.1
[PATCH v1] RISC-V: Fix ICE for incorrect mode attr in V_F2DI_CONVERT_BRIDGE
From: Pan Li

The mode attr V_F2DI_CONVERT_BRIDGE converts a floating-point mode to the
widened floating-point mode by design. But we used (RVVM1HF "RVVM2SI") by
mistake. This patch fixes it by replacing (RVVM1HF "RVVM2SI") with
(RVVM1HF "RVVM2SF") as designed.

gcc/ChangeLog:

	* config/riscv/vector-iterators.md: Replace RVVM2SI with RVVM2SF
	for mode attr V_F2DI_CONVERT_BRIDGE.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/vector-iterators.md | 2 +-
 .../riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c

diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 56080ed1f5f..5f5f7b5b986 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -3267,7 +3267,7 @@ (define_mode_attr v_f2di_convert [
 ])

 (define_mode_attr V_F2DI_CONVERT_BRIDGE [
-  (RVVM2HF "RVVM4SF") (RVVM1HF "RVVM2SI") (RVVMF2HF "RVVM1SF")
+  (RVVM2HF "RVVM4SF") (RVVM1HF "RVVM2SF") (RVVMF2HF "RVVM1SF")
   (RVVMF4HF "RVVMF2SF")

   (RVVM4SF "VOID") (RVVM2SF "VOID") (RVVM1SF "VOID")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c
new file mode 100644
index 000..5fb61c7b44c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c
@@ -0,0 +1,7 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "--param=riscv-autovec-lmul=m4 -march=rv64gcv_zvfh_zfh -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "test-math.h"
+
+TEST_UNARY_CALL_CVT (_Float16, long, __builtin_lroundf16)
-- 
2.34.1
[PATCH v1] RISC-V: Disable RVV VCOMPRESS avl propagation
From: Pan Li

This patch disables the avl propagation for vcompress for the following
reason. According to the ISA, the first vl elements of vector register
group vs2 are extracted and packed by vcompress. The mask may select the
highest element of vs2, and that element may be eliminated by avl
propagation.

For example, given original vl = 4 here. We have:

  v0 = 0b1000
  v1 = {0x1, 0x2, 0x3, 0x4}
  v2 = {0x5, 0x6, 0x7, 0x8}

Then:

  vcompress v1, v2, v0 (avl = 4), v1 = {0x8, 0x2, 0x3, 0x4}. <== Correct.
  vcompress v1, v2, v0 (avl = 2), v1 will be unchanged. <== Wrong.

Therefore, we cannot propagate the avl of vcompress because doing so may
change the semantics of the result.

This patch also fixes the failure of gcc.c-torture/execute/990128-1.c for
the following configurations.

riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax

gcc/ChangeLog:

	* config/riscv/riscv-avlprop.cc (avl_can_be_propagated_p): Disable
	the avl propagation for the vcompress.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c: New test.

Signed-off-by: Pan Li --- gcc/config/riscv/riscv-avlprop.cc | 35 -- .../rvv/autovec/binop/vcompress-avlprop-1.c | 36 +++ 2 files changed, 61 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c diff --git a/gcc/config/riscv/riscv-avlprop.cc b/gcc/config/riscv/riscv-avlprop.cc index 02f006742f1..a6159816cf7 100644 --- a/gcc/config/riscv/riscv-avlprop.cc +++ b/gcc/config/riscv/riscv-avlprop.cc @@ -113,19 +113,34 @@ avl_can_be_propagated_p (rtx_insn *rinsn) touching the element with i > AVL. So, we don't do AVL propagation on these following situations: - - The index of "vrgather dest, source, index" may pick up the -element which has index >= AVL, so we can't strip the elements -that has index >= AVL of source register. - - The last element of vslide1down is AVL + 1 according to RVV ISA: -vstart <= i < vl-1vd[i] = vs2[i+1] if v0.mask[i] enabled - - The last multiple elements of vslidedown can be the element -has index >= AVL according to RVV ISA: -0 <= i+OFFSET < VLMAX src[i] = vs2[i+OFFSET] -vstart <= i < vl vd[i] = s
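The vl = 4 versus avl = 2 example in the commit message above can be reproduced with a small scalar model. This is a hedged sketch, not the ISA reference semantics: it assumes a tail-undisturbed policy, at most 32 elements, and a mask passed as a plain bit field. The name model_vcompress is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of "vcompress vd, vs2, v0" with vector
   length vl.  Only the first vl elements of vs2 are inspected; those
   whose mask bit is set are packed into the start of vd.  Elements of
   vd past the packed ones are left untouched (assumed tail-undisturbed).
   Returns the number of packed elements.  Assumes vl <= 32.  */
static size_t model_vcompress (uint32_t *vd, const uint32_t *vs2,
                               uint32_t mask, size_t vl)
{
  size_t packed = 0;
  for (size_t i = 0; i < vl; i++)
    if ((mask >> i) & 1u)
      vd[packed++] = vs2[i];
  return packed;
}
```

With mask 0b1000, the only selected element has index 3. A vector length of 4 reaches it, while a shrunk length of 2 never looks at it, so the destination stays unchanged. That is the semantic change the patch guards against.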
[PATCH v1] RISC-V: Support FP ceil to i/l/ll diff size autovec
From: Pan Li

This patch supports auto-vectorization of the FP APIs below when the
source and destination types differ in size.

+---------+-----------+----------+
| API     | RV64      | RV32     |
+---------+-----------+----------+
| iceil   | DF => SI  | DF => SI |
| iceilf  | -         | -        |
| lceil   | -         | DF => SI |
| lceilf  | SF => DI  | -        |
| llceil  | -         | -        |
| llceilf | SF => DI  | SF => DI |
+---------+-----------+----------+

Given below code:

void
test_lceilf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lceilf (in[i]);
}

Before this patch:

.L3:
  flw fa0,0(s0)
  addi s0,s0,4
  addi s1,s1,8
  call ceilf
  fcvt.l.s a5,fa0,rtz
  sd a5,-8(s1)
  bne s2,s0,.L3
  ld ra,24(sp)
  ld s0,16(sp)
  ld s1,8(sp)
  ld s2,0(sp)
  addi sp,sp,32
  jr ra

After this patch:

  fsrmi 3 // RUP mode
.L3:
  vsetvli a5,a2,e32,mf2,ta,ma
  vle32.v v2,0(a1)
  slli a3,a5,2
  slli a4,a5,3
  vfwcvt.x.f.v v1,v2
  sub a2,a2,a5
  vse64.v v1,0(a0)
  add a1,a1,a3
  add a0,a0,a4
  bne a2,zero,.L3

Unfortunately, the HF mode is not included because it requires additional
middle-end support in internal-fn.def.

gcc/ChangeLog:

	* config/riscv/autovec.md: Remove the size check of lceil.
	* config/riscv/riscv-v.cc (expand_vec_lceil): Leverage
	emit_vec_rounding_to_integer for ceil.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/math-iceil-1.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-iceil-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lceil-rv32-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lceil-rv32-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lceilf-rv64-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lceilf-rv64-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-llceilf-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-llceilf-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-iceil-1.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-lceil-rv32-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-lceilf-rv64-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-llceilf-0.c: New test.

Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 6 +- gcc/config/riscv/riscv-v.cc | 8 +- .../riscv/rvv/autovec/unop/math-iceil-1.c | 18 .../riscv/rvv/autovec/unop/math-iceil-run-1.c | 83 ++ .../rvv/autovec/unop/math-lceil-rv32-0.c | 18 .../rvv/autovec/unop/math-lceil-rv32-run-0.c | 83 ++ .../rvv/autovec/unop/math-lceilf-rv64-0.c | 18 .../rvv/autovec/unop/math-lceilf-rv64-run-0.c | 84 +++ .../riscv/rvv/autovec/unop/math-llceilf-0.c | 19 + .../rvv/autovec/unop/math-llceilf-run-0.c | 84 +++ .../riscv/rvv/autovec/vls/math-iceil-1.c | 27 ++ .../riscv/rvv/autovec/vls/math-lceil-rv32-0.c | 27 ++ .../rvv/autovec/vls/math-lceilf-rv64-0.c | 27 ++ .../riscv/rvv/autovec/vls/math-llceilf-0.c| 27 ++ 14 files changed, 520 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-rv32-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-rv32-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceilf-rv64-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceilf-rv64-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceilf-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceilf-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-iceil-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lceil-rv32-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lceilf-rv64-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llceilf-0.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index 5b5105f5b46..b59bb880a45 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2466,8 +2466,7 @@ 
(define_expand "lround2" (define_expand "lceil2" [(match_operand: 0 "register_operand") (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")] - "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math -&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (mode))" + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" { riscv_vect
[PATCH v1] RISC-V: Support FP floor to i/l/ll diff size autovec
From: Pan Li

This patch supports auto-vectorization of the FP APIs below when the
source and destination types differ in size.

+----------+-----------+----------+
| API      | RV64      | RV32     |
+----------+-----------+----------+
| ifloor   | DF => SI  | DF => SI |
| ifloorf  | -         | -        |
| lfloor   | -         | DF => SI |
| lfloorf  | SF => DI  | -        |
| llfloor  | -         | -        |
| llfloorf | SF => DI  | SF => DI |
+----------+-----------+----------+

Given below code:

void
test_lfloorf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lfloorf (in[i]);
}

Before this patch:

.L3:
  flw fa0,0(s0)
  addi s0,s0,4
  addi s1,s1,8
  call floorf
  fcvt.l.s a5,fa0,rtz
  sd a5,-8(s1)
  bne s2,s0,.L3

After this patch:

  fsrmi 2 // RDN mode
.L3:
  vsetvli a5,a2,e32,mf2,ta,ma
  vle32.v v2,0(a1)
  slli a3,a5,2
  slli a4,a5,3
  vfwcvt.x.f.v v1,v2
  sub a2,a2,a5
  vse64.v v1,0(a0)
  add a1,a1,a3
  add a0,a0,a4
  bne a2,zero,.L3

Unfortunately, the HF mode is not included because it requires additional
middle-end support in internal-fn.def.

gcc/ChangeLog:

	* config/riscv/autovec.md: Remove the size check of lfloor.
	* config/riscv/riscv-v.cc (expand_vec_lfloor): Leverage
	emit_vec_rounding_to_integer for floor.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/math-ifloor-1.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lfloor-rv32-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lfloor-rv32-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lfloorf-rv64-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lfloorf-rv64-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-llfloorf-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-llfloorf-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-ifloor-1.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-lfloor-rv32-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-lfloorf-rv64-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-llfloorf-0.c: New test.

Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 6 +- gcc/config/riscv/riscv-v.cc | 8 +- .../riscv/rvv/autovec/unop/math-ifloor-1.c| 18 .../rvv/autovec/unop/math-ifloor-run-1.c | 83 ++ .../rvv/autovec/unop/math-lfloor-rv32-0.c | 18 .../rvv/autovec/unop/math-lfloor-rv32-run-0.c | 83 ++ .../rvv/autovec/unop/math-lfloorf-rv64-0.c| 18 .../autovec/unop/math-lfloorf-rv64-run-0.c| 84 +++ .../riscv/rvv/autovec/unop/math-llfloorf-0.c | 19 + .../rvv/autovec/unop/math-llfloorf-run-0.c| 84 +++ .../riscv/rvv/autovec/vls/math-ifloor-1.c | 27 ++ .../rvv/autovec/vls/math-lfloor-rv32-0.c | 27 ++ .../rvv/autovec/vls/math-lfloorf-rv64-0.c | 27 ++ .../riscv/rvv/autovec/vls/math-llfloorf-0.c | 27 ++ 14 files changed, 520 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-rv32-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-rv32-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloorf-rv64-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloorf-rv64-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloorf-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloorf-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-ifloor-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lfloor-rv32-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lfloorf-rv64-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llfloorf-0.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index b59bb880a45..973dc4ac235 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2486,8 +2486,7 @@ 
(define_expand "lceil2" (define_expand "lfloor2" [(match_operand: 0 "register_operand") (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")] - "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math -&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (mode))" + "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math" { riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, mode); DONE; @@ -2
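The ceil and floor patches above both rely on one conversion instruction whose result depends on the rounding mode set beforehand (fsrmi 3 for RUP, fsrmi 2 for RDN). The following scalar sketch models that idea for checking expected values by hand. The frm encodings are the ones defined by the RISC-V F extension; everything else (the function names, the restriction to finite in-range inputs, modelling only RTZ, RDN and RUP) is an assumption made for illustration and is not how GCC implements the expanders.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding-mode encodings of the RISC-V frm field.  */
enum frm_mode
{
  FRM_RNE = 0, /* Round to nearest, ties to even.  */
  FRM_RTZ = 1, /* Round towards zero.  */
  FRM_RDN = 2, /* Round down (towards -inf), used for floor.  */
  FRM_RUP = 3, /* Round up (towards +inf), used for ceil.  */
  FRM_RMM = 4  /* Round to nearest, ties to max magnitude.  */
};

/* Hypothetical scalar model of a float => 64-bit integer conversion
   under a static rounding mode.  Only RTZ, RDN and RUP are modelled;
   other modes fall back to truncation.  The input must be finite and
   in range of int64_t, otherwise the C cast is undefined.  */
static int64_t convert_with_frm (float x, enum frm_mode frm)
{
  int64_t truncated = (int64_t) x; /* A C cast rounds towards zero.  */
  float back = (float) truncated;

  if (frm == FRM_RUP && back < x)
    return truncated + 1;
  if (frm == FRM_RDN && back > x)
    return truncated - 1;
  return truncated;
}

/* Scalar references for the SF => DI rows of the tables above.  */
static int64_t ref_lceilf (float x)
{
  return convert_with_frm (x, FRM_RUP);
}

static int64_t ref_lfloorf (float x)
{
  return convert_with_frm (x, FRM_RDN);
}
```

For instance, ref_lceilf (2.25f) yields 3 and ref_lfloorf (-2.25f) yields -3, while integral inputs pass through unchanged in both modes.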
[PATCH v2] DSE: Allow vector type for get_stored_val when read < store
From: Pan Li

Update in v2:
* Move vector type support to get_stored_val.

Original log:

This patch allows the vector mode in get_stored_val in the DSE. It is
valid for the read rtx if and only if the read bitsize is less than the
stored bitsize.

Given below example code with --param=riscv-autovec-preference=fixed-vlmax.

vuint8m1_t test ()
{
  uint8_t arr[32] = {
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
  };

  return __riscv_vle8_v_u8m1(arr, 32);
}

Before this patch:

test:
  lui a5,%hi(.LANCHOR0)
  addi sp,sp,-32
  addi a5,a5,%lo(.LANCHOR0)
  li a3,32
  vl2re64.v v2,0(a5)
  vsetvli zero,a3,e8,m1,ta,ma
  vs2r.v v2,0(sp) <== Unnecessary store to stack
  vle8.v v1,0(sp) <== Ditto
  vs1r.v v1,0(a0)
  addi sp,sp,32
  jr ra

After this patch:

test:
  lui a5,%hi(.LANCHOR0)
  addi a5,a5,%lo(.LANCHOR0)
  li a4,32
  addi sp,sp,-32
  vsetvli zero,a4,e8,m1,ta,ma
  vle8.v v1,0(a5)
  vs1r.v v1,0(a0)
  addi sp,sp,32
  jr ra

The tests below pass with this patch:

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression test.

	PR target/111720

gcc/ChangeLog:

	* dse.cc (get_stored_val): Allow vector mode if the read bitsize
	is less than stored bitsize.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.

Signed-off-by: Pan Li --- gcc/dse.cc| 4 .../gcc.target/riscv/rvv/base/pr111720-0.c| 18 .../gcc.target/riscv/rvv/base/pr111720-1.c| 18 .../gcc.target/riscv/rvv/base/pr111720-10.c | 18 .../gcc.target/riscv/rvv/base/pr111720-2.c| 18 .../gcc.target/riscv/rvv/base/pr111720-3.c| 18 .../gcc.target/riscv/rvv/base/pr111720-4.c| 18 .../gcc.target/riscv/rvv/base/pr111720-5.c| 18 .../gcc.target/riscv/rvv/base/pr111720-6.c| 18 .../gcc.target/riscv/rvv/base/pr111720-7.c| 21 +++ .../gcc.target/riscv/rvv/base/pr111720-8.c| 18 .../gcc.target/riscv/rvv/base/pr111720-9.c| 15 + 12 files changed, 202 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c diff --git a/gcc/dse.cc b/gcc/dse.cc index 1a85dae1f8c..21004becd4a 100644 --- a/gcc/dse.cc +++ b/gcc/dse.cc @@ -1940,6 +1940,10 @@ get_stored_val (store_info *store_info, machine_mode read_mode, || GET_MODE_CLASS (read_mode) != GET_MODE_CLASS (store_mode))) read_reg = extract_low_bits (read_mode, store_mode, copy_rtx (store_info->const_rhs)); + else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode) +&& known_lt (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode)) +&& targetm.modes_tieable_p (read_mode, store_mode)) +read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs)); 
else read_reg = extract_low_bits (read_mode, store_mode, copy_rtx (store_info->rhs)); diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c new file mode 100644 index 000..a61e94a6d98 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */ + +#include "riscv
[PATCH v1] RISC-V: Refine frm emit after bb end in succ edges
From: Pan Li

This patch refines the frm insn emit when we meet an abnormal edge in
the loop. Conceptually, we only need to emit once at the end of the
basic block when an abnormal edge exists, instead of once for every
abnormal successor edge in the loop over the edges. This patch fixes
this defect and only performs insert_insn_end_basic_block when at least
one succ edge is abnormal.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_frm_emit_after_bb_end): Only
	perform once emit when at least one succ edge is abnormal.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv.cc | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 08ff05dcc3f..e25692b86fc 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -9348,20 +9348,33 @@ static void
 riscv_frm_emit_after_bb_end (rtx_insn *cur_insn)
 {
   edge eg;
+  bool abnormal_edge_p = false;
   edge_iterator eg_iterator;
   basic_block bb = BLOCK_FOR_INSN (cur_insn);

   FOR_EACH_EDGE (eg, eg_iterator, bb->succs)
+    {
+      if (eg->flags & EDGE_ABNORMAL)
+        abnormal_edge_p = true;
+      else
+        {
+          start_sequence ();
+          emit_insn (gen_frrmsi (DYNAMIC_FRM_RTL (cfun)));
+          rtx_insn *backup_insn = get_insns ();
+          end_sequence ();
+
+          insert_insn_on_edge (backup_insn, eg);
+        }
+    }
+
+  if (abnormal_edge_p)
     {
       start_sequence ();
       emit_insn (gen_frrmsi (DYNAMIC_FRM_RTL (cfun)));
       rtx_insn *backup_insn = get_insns ();
       end_sequence ();

-      if (eg->flags & EDGE_ABNORMAL)
-        insert_insn_end_basic_block (backup_insn, bb);
-      else
-        insert_insn_on_edge (backup_insn, eg);
+      insert_insn_end_basic_block (backup_insn, bb);
     }

   commit_edge_insertions ();
-- 
2.34.1
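The effect of the change above can be illustrated with a toy model that only counts where the frm restore would be inserted. This is a sketch under simplifying assumptions: successor edges are reduced to an array of "is abnormal" flags, and the struct and function names are hypothetical, not GCC APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tally of insertion points for one basic block.  */
struct emit_counts
{
  size_t on_edge;   /* Insertions placed on a (normal) successor edge.  */
  size_t at_bb_end; /* Insertions placed at the end of the block.  */
};

/* Model of the old behaviour: every abnormal successor edge triggers
   its own insertion at the end of the basic block.  */
static struct emit_counts count_emits_before (const bool *abnormal, size_t n)
{
  struct emit_counts counts = {0, 0};
  for (size_t i = 0; i < n; i++)
    if (abnormal[i])
      counts.at_bb_end++;
    else
      counts.on_edge++;
  return counts;
}

/* Model of the new behaviour: normal edges are handled inside the loop,
   and a single end-of-block insertion follows if any edge was abnormal.  */
static struct emit_counts count_emits_after (const bool *abnormal, size_t n)
{
  struct emit_counts counts = {0, 0};
  bool abnormal_edge_p = false;
  for (size_t i = 0; i < n; i++)
    if (abnormal[i])
      abnormal_edge_p = true;
    else
      counts.on_edge++;
  if (abnormal_edge_p)
    counts.at_bb_end = 1;
  return counts;
}
```

For a block with two abnormal successors and one normal successor, the old scheme places two copies at the end of the block, while the new scheme places exactly one. The per-edge insertions for normal edges are the same in both.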
[PATCH v1] Internal-fn: Add FLOATN support for l/ll round and rint [PR/112432]
From: Pan Li

The functions defined by DEF_EXT_LIB_FLOATN_NX_BUILTINS should also use
DEF_INTERNAL_FLT_FLOATN_FN instead of DEF_INTERNAL_FLT_FN for the FLOATN
support. According to the glibc API and the gcc builtins, the table
below shows whether FLOATN is supported or not.

+---------+-------+-------------------------------------+
|         | glibc | gcc: DEF_EXT_LIB_FLOATN_NX_BUILTINS |
+---------+-------+-------------------------------------+
| iceil   | N     | N                                   |
| ifloor  | N     | N                                   |
| irint   | N     | N                                   |
| iround  | N     | N                                   |
| lceil   | N     | N                                   |
| lfloor  | N     | N                                   |
| lrint   | Y     | Y                                   |
| lround  | Y     | Y                                   |
| llceil  | N     | N                                   |
| llfloor | N     | N                                   |
| llrint  | Y     | Y                                   |
| llround | Y     | Y                                   |
+---------+-------+-------------------------------------+

This patch supports FLOATN for:

1. lrint
2. lround
3. llrint
4. llround

The tests below pass with this patch:

1. x86 bootstrap and regression test.
2. aarch64 regression test.
3. riscv regression tests.

	PR target/112432

gcc/ChangeLog:

	* internal-fn.def (LRINT): Add FLOATN support.
	(LROUND): Ditto.
	(LLRINT): Ditto.
	(LLROUND): Ditto.

Signed-off-by: Pan Li
---
 gcc/internal-fn.def | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7f0e3759615..10f88e37bc9 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -365,12 +365,12 @@ DEF_INTERNAL_FLT_FN (IRINT, ECF_CONST, lrint, unary_convert)
 DEF_INTERNAL_FLT_FN (IROUND, ECF_CONST, lround, unary_convert)
 DEF_INTERNAL_FLT_FN (LCEIL, ECF_CONST, lceil, unary_convert)
 DEF_INTERNAL_FLT_FN (LFLOOR, ECF_CONST, lfloor, unary_convert)
-DEF_INTERNAL_FLT_FN (LRINT, ECF_CONST, lrint, unary_convert)
-DEF_INTERNAL_FLT_FN (LROUND, ECF_CONST, lround, unary_convert)
+DEF_INTERNAL_FLT_FLOATN_FN (LRINT, ECF_CONST, lrint, unary_convert)
+DEF_INTERNAL_FLT_FLOATN_FN (LROUND, ECF_CONST, lround, unary_convert)
 DEF_INTERNAL_FLT_FN (LLCEIL, ECF_CONST, lceil, unary_convert)
 DEF_INTERNAL_FLT_FN (LLFLOOR, ECF_CONST, lfloor, unary_convert)
-DEF_INTERNAL_FLT_FN (LLRINT, ECF_CONST, lrint, unary_convert)
-DEF_INTERNAL_FLT_FN (LLROUND, ECF_CONST, lround, unary_convert)
+DEF_INTERNAL_FLT_FLOATN_FN (LLRINT, ECF_CONST, lrint, unary_convert)
+DEF_INTERNAL_FLT_FLOATN_FN (LLROUND, ECF_CONST, lround, unary_convert)

 /* FP rounding.  */
 DEF_INTERNAL_FLT_FLOATN_FN (CEIL, ECF_CONST, ceil, unary)
-- 
2.34.1
[PATCH v1] RISC-V: Add HFmode for l/ll round and rint autovec
From: Pan Li

The internal-fn already supports FLOATN. This patch re-enables the
vector HFmode in the autovec mode iterators of the standard names below.

1. lrint
2. llround

For now the vector HFmode is disabled in the patterns to limit the
impact, and the underlying FP16 rint/round autovec support will enable
them one by one.

gcc/ChangeLog:

	* config/riscv/autovec.md: Disable vector HFmode for
	rint, round, ceil and floor.
	* config/riscv/vector-iterators.md: Add vector HFmode for
	rint, round, ceil and floor mode iterator.

Signed-off-by: Pan Li
---
 gcc/config/riscv/autovec.md          | 26 +++-
 gcc/config/riscv/vector-iterators.md | 59 +++-
 2 files changed, 73 insertions(+), 12 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 33722ea1139..a199caabf87 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2443,12 +2443,11 @@ (define_expand "roundeven2"
   }
 )
 
-;; Add mode_size equal check as we opened the modes for different sizes.
-;; The check will be removed soon after related codegen implemented
 (define_expand "lrint2"
   [(match_operand: 0 "register_operand")
    (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+   && GET_MODE_INNER (mode) != HFmode"
   {
     riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode);
     DONE;
@@ -2458,7 +2457,8 @@ (define_expand "lrint2"
 (define_expand "lrint2"
   [(match_operand: 0 "register_operand")
    (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+   && GET_MODE_INNER (mode) != HFmode"
   {
     riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode);
     DONE;
@@ -2468,7 +2468,8 @@ (define_expand "lrint2"
 (define_expand "lround2"
   [(match_operand: 0 "register_operand")
    (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+   && GET_MODE_INNER (mode) != HFmode"
   {
     riscv_vector::expand_vec_lround (operands[0], operands[1], mode, mode);
     DONE;
@@ -2478,7 +2479,8 @@ (define_expand "lround2"
 (define_expand "lround2"
   [(match_operand: 0 "register_operand")
    (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+   && GET_MODE_INNER (mode) != HFmode"
   {
     riscv_vector::expand_vec_lround (operands[0], operands[1], mode, mode);
     DONE;
@@ -2488,7 +2490,8 @@ (define_expand "lround2"
 (define_expand "lceil2"
   [(match_operand: 0 "register_operand")
    (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+   && GET_MODE_INNER (mode) != HFmode"
   {
     riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, mode);
     DONE;
@@ -2498,7 +2501,8 @@ (define_expand "lceil2"
 (define_expand "lceil2"
   [(match_operand: 0 "register_operand")
    (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+   && GET_MODE_INNER (mode) != HFmode"
   {
     riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, mode);
     DONE;
@@ -2508,7 +2512,8 @@ (define_expand "lceil2"
 (define_expand "lfloor2"
   [(match_operand: 0 "register_operand")
    (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+   && GET_MODE_INNER (mode) != HFmode"
   {
     riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, mode);
     DONE;
@@ -2518,7 +2523,8 @@ (define_expand "lfloor2"
 (define_expand "lfloor2"
   [(match_operand: 0 "register_operand")
    (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+   && GET_MODE_INNER (mode) != HFmode"
   {
     riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, mode);
     DONE;
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index e80eaedc4b3..f2d9f60b631 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -3221,15 +3221,20 @@ (define_mode_attr vnnconvert [
 ;; V_F2SI_CONVERT: (HF, SF, DF) => SI
 ;; V_F2DI_CONVERT: (HF, SF, DF) => DI
 ;;
-;; HF requires addit