[PATCH v2] RISC-V: Support {U}INT64 to FP16 auto-vectorization

2023-09-28 Thread pan2 . li
From: Pan Li 

Update in v2:

* Add math trap check.
* Adjust some test cases.

Original logs:

This patch supports auto-vectorization of the INT64 to FP16
conversion. The conversion takes the two steps below.

* INT64 to FP32.
* FP32 to FP16.

Given sample code as below:
void
test_func (int64_t * __restrict a, _Float16 *b, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    b[i] = (_Float16) (a[i]);
}

Before this patch:
test.c:6:26: missed: couldn't vectorize loop
test.c:6:26: missed: not vectorized: unsupported data-type
ld   a0,0(s0)
call __floatdihf
fsh  fa0,0(s1)
addi s0,s0,8
addi s1,s1,2
bne  s2,s0,.L3
ld   ra,24(sp)
ld   s0,16(sp)
ld   s1,8(sp)
ld   s2,0(sp)
addi sp,sp,32

After this patch:
vsetvli a5,a2,e8,mf8,ta,ma
vle64.v v1,0(a0)
vsetvli a4,zero,e32,mf2,ta,ma
vfncvt.f.x.w v1,v1
vsetvli zero,zero,e16,mf4,ta,ma
vfncvt.f.f.w v1,v1
vsetvli zero,a2,e16,mf4,ta,ma
vse16.v v1,0(a1)

Please note VLS mode is also involved in this patch and covered by the
test cases.
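
As a scalar reference for the two-step route described above, a minimal
sketch follows. It is only a model of the semantics, not code from this
patch: the names half_t, int64_to_fp16_two_step and convert_two_step are
hypothetical, and the fallback to float when the compiler does not
advertise _Float16 is an assumption made purely so the sketch compiles
everywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: use _Float16 when the compiler advertises it, otherwise
   fall back to float so that the sketch still builds.  */
#ifdef __FLT16_MANT_DIG__
typedef _Float16 half_t;
#else
typedef float half_t;
#endif

/* Hypothetical scalar model of the two-step conversion.  */
half_t
int64_to_fp16_two_step (int64_t x)
{
  float single = (float) x;   /* Step 1: INT64 => FP32.  */
  return (half_t) single;     /* Step 2: FP32 => FP16.  */
}

/* Same loop shape as test_func above, routed through the model.  */
void
convert_two_step (const int64_t *a, half_t *b, unsigned n)
{
  assert (n == 0 || (a != 0 && b != 0));
  for (unsigned i = 0; i < n; i++)
    b[i] = int64_to_fp16_two_step (a[i]);
}
```

Note that for inputs in FP16 range the first step is exact, since every
integer of magnitude up to 65504 fits in the 24-bit significand of FP32.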

PR target/111506

gcc/ChangeLog:

* config/riscv/autovec.md (2):
New pattern.
* config/riscv/vector-iterators.md: New iterator.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/cvt-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/cvt-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cvt-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   | 24 ++
 gcc/config/riscv/vector-iterators.md  | 38 +++
 .../gcc.target/riscv/rvv/autovec/unop/cvt-0.c | 21 +
 .../gcc.target/riscv/rvv/autovec/unop/cvt-1.c | 22 +
 .../gcc.target/riscv/rvv/autovec/vls/cvt-0.c  | 47 +++
 5 files changed, 152 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/cvt-0.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index cd0cbdd2889..d6cf376ebca 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -974,6 +974,30 @@ (define_insn_and_split "2"
 }
 [(set_attr "type" "vfncvtitof")])
 
+;; This operation can be performed in the loop vectorizer but unfortunately
+;; not applicable for now. We can remove this pattern after loop vectorizer
+;; is able to take care of INT64 to FP16 conversion.
+(define_insn_and_split "2"
+  [(set (match_operand:  0 "register_operand")
+   (any_float:
+ (match_operand:VWWCONVERTI 1 "register_operand")))]
+  "TARGET_VECTOR && TARGET_ZVFH && can_create_pseudo_p () && !flag_trapping_math"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+rtx single = gen_reg_rtx (mode); /* Get vector SF mode.  */
+
+/* Step-1, INT64 => FP32.  */
+emit_insn (gen_2 (single, operands[1]));
+/* Step-2, FP32 => FP16.  */
+emit_insn (gen_trunc2 (operands[0], single));
+
+DONE;
+  }
+  [(set_attr "type" "vfncvtitof")]
+)
+
 ;; =
 ;; == Unary arithmetic
 ;; =
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index b6cd872eb42..c9a7344b1bc 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -1247,6 +1247,24 @@ (define_mode_iterator VWCONVERTI [
   (V512DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 4096")
 ])
 
+(define_mode_iterator VWWCONVERTI [
+  (RVVM8DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (RVVM4DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (RVVM2DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (RVVM1DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+
+  (V1DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (V2DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (V4DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (V8DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 64")
+  (V16DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 128")
+  (V32DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 256")
+  (V64DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 512")
+  (V128DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 1024")
+  (V256DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 2048")
+  (V512DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && TARGET_MIN_VLEN >= 4096")
+])
+
 (define_mode_iterator VQEXTI [
   RVVM8SI RVVM4SI 

[PATCH v1] RISC-V: Update comments for FP rounding related autovec

2023-10-05 Thread pan2 . li
From: Pan Li 

Some comments are out of date; this patch updates them.

gcc/ChangeLog:

* config/riscv/autovec.md: Update comments.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 056f2c352f6..53e9d34eea1 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2229,12 +2229,16 @@ (define_expand "avg3_ceil"
 })
 
 ;; -
-;;  [FP] Math.h.
+;;  [FP] Rounding.
 ;; -
 ;; Includes:
 ;; - ceil/ceilf
 ;; - floor/floorf
 ;; - nearbyint/nearbyintf
+;; - rint/rintf
+;; - round/roundf
+;; - trunc/truncf
+;; - roundeven/roundevenf
 ;; -
 (define_expand "ceil2"
   [(match_operand:V_VLSF 0 "register_operand")
-- 
2.34.1



[PATCH v1] RISC-V: Bugfix for legitimize address PR/111634

2023-10-06 Thread pan2 . li
From: Pan Li 

Given we have RTL as below.

(plus:DI (mult:DI (reg:DI 138 [ g.4_6 ])
                  (const_int 8 [0x8]))
         (lo_sum:DI (reg:DI 167)
                    (symbol_ref:DI ("f") [flags 0x86] )))

When handling the (plus (plus (mult (a) (mem_shadd_constant)) (fp)) (C))
case, fp will be the lo_sum operand shown above. The code assumes that fp
is a REG, which does not hold here, so it ICEs when GCC is built with
--enable-checking=rtl.

This patch fixes it by adding a REG_P check to ensure the operand is a
register before REGNO is read. The test case gcc/testsuite/gcc.dg/pr109417.c
covers this fix when GCC is built with --enable-checking=rtl.
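
The shape of the fix, checking the node kind before reading a field that
is only valid for that kind, can be sketched with a toy model. Everything
below is hypothetical and for illustration only: struct toy_rtx,
TOY_VIRTUAL_STACK_VARS_REGNUM and toy_is_stack_vars_reg are not GCC's
rtx, macros or functions, and the register number is an arbitrary
placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an RTL node; NOT GCC's real rtx layout.  */
enum toy_code { TOY_REG, TOY_LO_SUM };

struct toy_rtx
{
  enum toy_code code;
  unsigned regno;   /* Meaningful only when code == TOY_REG.  */
};

/* Arbitrary placeholder value, chosen for the sketch only.  */
#define TOY_VIRTUAL_STACK_VARS_REGNUM 66u

/* Check the node kind first; only then read the register number.
   This mirrors the "REG_P (fp) && REGNO (fp) == ..." ordering.  */
bool
toy_is_stack_vars_reg (const struct toy_rtx *x)
{
  assert (x != NULL);
  return x->code == TOY_REG
	 && x->regno == TOY_VIRTUAL_STACK_VARS_REGNUM;
}
```

With the kind check in front, a lo_sum node is rejected without its
register field ever being interpreted.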

PR target/111634

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_address): Bugfix.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index d5446b63dbf..2b839241f1a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2042,7 +2042,7 @@ riscv_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
{
  rtx index = XEXP (base, 0);
  rtx fp = XEXP (base, 1);
- if (REGNO (fp) == VIRTUAL_STACK_VARS_REGNUM)
+ if (REG_P (fp) && REGNO (fp) == VIRTUAL_STACK_VARS_REGNUM)
{
 
  /* If we were given a MULT, we must fix the constant
-- 
2.34.1



[PATCH v1] RISC-V: Add more run test for FP rounding autovec

2023-10-06 Thread pan2 . li
From: Pan Li 

For _Float16 types, add run tests for:
* ceil
* floor
* nearbyint
* rint
* round
* roundeven
* trunc

For float and double, add a run test for:
* roundeven

The zfa extension is required for these run test cases. For rv64, the
simulation target_board may look like the one below.

target_board="riscv-sim/-march=rv64gcv_zfa_zfh/-mabi=lp64d/-mcmodel=medlow"

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Add zfa for building.
* gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-rint-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-round-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-run-0.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/unop/math-ceil-run-0.c  | 39 +++
 .../riscv/rvv/autovec/unop/math-floor-run-0.c | 39 +++
 .../rvv/autovec/unop/math-nearbyint-run-0.c   | 48 +++
 .../riscv/rvv/autovec/unop/math-rint-run-0.c  | 48 +++
 .../riscv/rvv/autovec/unop/math-round-run-0.c | 39 +++
 .../rvv/autovec/unop/math-roundeven-run-0.c   | 39 +++
 .../rvv/autovec/unop/math-roundeven-run-1.c   | 39 +++
 .../rvv/autovec/unop/math-roundeven-run-2.c   | 39 +++
 .../riscv/rvv/autovec/unop/math-trunc-run-0.c | 39 +++
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp|  4 +-
 10 files changed, 371 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-rint-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-round-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-roundeven-run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-trunc-run-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c
new file mode 100644
index 000..70cba3602bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+_Float16 in[ARRAY_SIZE];
+_Float16 out[ARRAY_SIZE];
+_Float16 ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL (_Float16, __builtin_ceilf16)
+TEST_ASSERT (_Float16)
+
+TEST_INIT (_Float16, 1.2, 2.0, 1)
+TEST_INIT (_Float16, -1.2, -1.0, 2)
+TEST_INIT (_Float16, 3.0, 3.0, 3)
+TEST_INIT (_Float16, 1023.5, 1024.0, 4)
+TEST_INIT (_Float16, 1024.0, 1024.0, 5)
+TEST_INIT (_Float16, 0.0, 0.0, 6)
+TEST_INIT (_Float16, -0.0, -0.0, 7)
+TEST_INIT (_Float16, -1023.5, -1023.0, 8)
+TEST_INIT (_Float16, -1024.0, -1024.0, 9)
+
+int
+main ()
+{
+  RUN_TEST (_Float16, 1, __builtin_ceilf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 2, __builtin_ceilf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 3, __builtin_ceilf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 4, __builtin_ceilf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 5, __builtin_ceilf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 6, __builtin_ceilf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 7, __builtin_ceilf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 8, __builtin_ceilf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 9, __builtin_ceilf16, in, out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-run-0.c
new file mode 100644
index 000..c542278c1f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-run-0.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+_Float16 in[ARRAY_SIZE];
+_Float16 out[ARRAY_SIZE];
+_Float16 ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL (_Float16, __built

[PATCH v1] RISC-V: Refine bswap16 auto vectorization code gen

2023-10-09 Thread pan2 . li
From: Pan Li 

This patch refines the code gen for bswap16.

We will have VEC_PERM_EXPR after RTL expand when invoking
__builtin_bswap. It generates about 9 instructions in the loop
as below, whether it is bswap16, bswap32 or bswap64.

  .L2:
1 vle16.v v4,0(a0)
2 vmv.v.x v2,a7
3 vand.vv v2,v6,v2
4 slli a2,a5,1
5 vrgatherei16.vv v1,v4,v2
6 sub a4,a4,a5
7 vse16.v v1,0(a3)
8 add a0,a0,a2
9 add a3,a3,a2
  bne a4,zero,.L2

But for bswap16 we can have an even simpler code gen, which
has only 7 instructions in the loop as below.

  .L5:
1 vle8.v  v2,0(a5)
2 addi a5,a5,32
3 vsrl.vi v4,v2,8
4 vsll.vi v2,v2,8
5 vor.vv  v4,v4,v2
6 vse8.v  v4,0(a4)
7 addi a4,a4,32
  bne a5,a6,.L5

Unfortunately, this approach makes the instruction count in the loop
grow to 13 and 24 for bswap32 and bswap64. Thus, we refine the code
gen for bswap16 only, and leave both bswap32 and bswap64 as is.
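
For reference, the per-element effect of the shift and or sequence can be
modeled in scalar C. This is a minimal sketch under the assumption that
each vector element is treated as an unsigned 16-bit value; the names
bswap16_shift_or and bswap16_array are hypothetical and not part of this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value with two shifts and an OR,
   mirroring vsrl.vi / vsll.vi / vor.vv on one element.  */
uint16_t
bswap16_shift_or (uint16_t x)
{
  uint16_t hi_to_lo = (uint16_t) (x >> 8);   /* like vsrl.vi ...,8 */
  uint16_t lo_to_hi = (uint16_t) (x << 8);   /* like vsll.vi ...,8 */
  return (uint16_t) (hi_to_lo | lo_to_hi);   /* like vor.vv */
}

/* Scalar loop with the same shape as a vectorized bswap16 loop.  */
void
bswap16_array (uint16_t *out, const uint16_t *in, size_t n)
{
  assert (n == 0 || (out != NULL && in != NULL));
  for (size_t i = 0; i < n; i++)
    out[i] = bswap16_shift_or (in[i]);
}
```

The same idea needs more shifts, masks and ors per element for 32-bit and
64-bit values, which is consistent with the larger instruction counts
quoted above for bswap32 and bswap64.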

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vec_sll_scalar): New helper function
to emit vsll.vi/vsll.vx.
(emit_vec_srl_scalar): Likewise for vsrl.vi/vsrl.vx.
(emit_vec_or): Likewise for vor.vv.
(shuffle_bswap_pattern): New function for the shuffle bswap pattern.
(expand_vec_perm_const_1): Add shuffle bswap pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/perm-4.c: Adjust checker.
* gcc.target/riscv/rvv/autovec/unop/bswap16-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/bswap16-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc   | 117 ++
 .../riscv/rvv/autovec/unop/bswap16-0.c|  17 +++
 .../riscv/rvv/autovec/unop/bswap16-run-0.c|  44 +++
 .../riscv/rvv/autovec/vls/bswap16-0.c |  34 +
 .../gcc.target/riscv/rvv/autovec/vls/perm-4.c |   4 +-
 5 files changed, 214 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/bswap16-0.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 23633a2a74d..3e3b5f2e797 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -878,6 +878,33 @@ emit_vlmax_decompress_insn (rtx target, rtx op0, rtx op1, rtx mask)
   emit_vlmax_masked_gather_mu_insn (target, op1, sel, mask);
 }
 
+static void
+emit_vec_sll_scalar (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode)
+{
+  rtx sll_ops[] = {op_0, op_1, op_2};
+  insn_code icode = code_for_pred_scalar (ASHIFT, vec_mode);
+
+  emit_vlmax_insn (icode, BINARY_OP, sll_ops);
+}
+
+static void
+emit_vec_srl_scalar (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode)
+{
+  rtx srl_ops[] = {op_0, op_1, op_2};
+  insn_code icode = code_for_pred_scalar (LSHIFTRT, vec_mode);
+
+  emit_vlmax_insn (icode, BINARY_OP, srl_ops);
+}
+
+static void
+emit_vec_or (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode)
+{
+  rtx or_ops[] = {op_0, op_1, op_2};
+  insn_code icode = code_for_pred (IOR, vec_mode);
+
+  emit_vlmax_insn (icode, BINARY_OP, or_ops);
+}
+
 /* Emit merge instruction.  */
 
 static machine_mode
@@ -3030,6 +3057,94 @@ shuffle_decompress_patterns (struct expand_vec_perm_d *d)
   return true;
 }
 
+static bool
+shuffle_bswap_pattern (struct expand_vec_perm_d *d)
+{
+  HOST_WIDE_INT diff;
+  unsigned i, size, step;
+
+  if (!d->one_vector_p || !d->perm[0].is_constant (&diff) || !diff)
+return false;
+
+  step = diff + 1;
+  size = step * GET_MODE_UNIT_BITSIZE (d->vmode);
+
+  switch (size)
+{
+case 16:
+  break;
+case 32:
+case 64:
+  /* We will have VEC_PERM_EXPR after rtl expand when invoking
+__builtin_bswap. It will generate about 9 instructions in
+loop as below, no matter it is bswap16, bswap32 or bswap64.
+  .L2:
+1 vle16.v v4,0(a0)
+2 vmv.v.x v2,a7
+3 vand.vv v2,v6,v2
+4 slli a2,a5,1
+5 vrgatherei16.vv v1,v4,v2
+6 sub a4,a4,a5
+7 vse16.v v1,0(a3)
+8 add a0,a0,a2
+9 add a3,a3,a2
+  bne a4,zero,.L2
+
+But for bswap16 we may have a even simple code gen, which
+has only 7 instructions in loop as below.
+  .L5
+1 vle8.v  v2,0(a5)
+2 addi a5,a5,32
+3 vsrl.vi v4,v2,8
+4 vsll.vi v2,v2,8
+5 vor.vv  v4,v4,v2
+6 vse8.v  v4,0(a4)
+7 addi a4,a4,32
+  bne a5,a6,.L5
+
+Unfortunately, the instructions in loop will grow to 13 and 24
+for bswap32 and bswap64. Thus, we will leverage vrgather (9 insn)
+for both the bswap64 and bswap32, but take shift and or (7 insn)
+for bswap16.
+   */
+default:
+  return false;
+}
+
+  for (i = 0; i < step; i++)
+if (!d->p

[PATCH v2] RISC-V: Refine bswap16 auto vectorization code gen

2023-10-09 Thread pan2 . li
From: Pan Li 

Update in v2:

* Remove emit helper functions.
* Use expand_binop instead.

Original log:

This patch refines the code gen for bswap16.

We will have VEC_PERM_EXPR after RTL expand when invoking
__builtin_bswap. It generates about 9 instructions in the loop
as below, whether it is bswap16, bswap32 or bswap64.

  .L2:
1 vle16.v v4,0(a0)
2 vmv.v.x v2,a7
3 vand.vv v2,v6,v2
4 slli a2,a5,1
5 vrgatherei16.vv v1,v4,v2
6 sub a4,a4,a5
7 vse16.v v1,0(a3)
8 add a0,a0,a2
9 add a3,a3,a2
  bne a4,zero,.L2

But for bswap16 we can have an even simpler code gen, which
has only 7 instructions in the loop as below.

  .L5:
1 vle8.v  v2,0(a5)
2 addi a5,a5,32
3 vsrl.vi v4,v2,8
4 vsll.vi v2,v2,8
5 vor.vv  v4,v4,v2
6 vse8.v  v4,0(a4)
7 addi a4,a4,32
  bne a5,a6,.L5

Unfortunately, this approach makes the instruction count in the loop
grow to 13 and 24 for bswap32 and bswap64. Thus, we refine the code
gen for bswap16 only, and leave both bswap32 and bswap64 as is.
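
The permutation that shuffle_bswap_pattern has to recognize reverses the
bytes inside every group of step bytes, for example {1, 0, 3, 2, ...}
for bswap16. A minimal sketch of that check on a plain index array is
shown here. It is only a model under simplifying assumptions (constant
length, plain unsigned indices); is_bswap_perm is a hypothetical name and
not the function added by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true when PERM, a byte permutation of length N, reverses the
   bytes inside every group of STEP bytes, e.g. {1, 0, 3, 2} for
   STEP == 2.  */
bool
is_bswap_perm (const unsigned *perm, size_t n, unsigned step)
{
  assert (perm != NULL);

  if (step < 2 || n % step != 0)
    return false;

  for (size_t i = 0; i < n; i++)
    {
      size_t group_start = i / step * step;
      /* Byte i of the result comes from the mirrored position inside
	 its own group.  */
      size_t expected = group_start + (step - 1 - i % step);
      if (perm[i] != expected)
	return false;
    }

  return true;
}
```

For step 2 this accepts exactly the indices 1, 0, 3, 2, and so on, which
is the case the patch handles with shift and or.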

gcc/ChangeLog:

* config/riscv/riscv-v.cc (shuffle_bswap_pattern): New func impl
for shuffle bswap.
(expand_vec_perm_const_1): Add handling for shuffle bswap pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/perm-4.c: Adjust checker.
* gcc.target/riscv/rvv/autovec/unop/bswap16-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/bswap16-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc   | 91 +++
 .../riscv/rvv/autovec/unop/bswap16-0.c| 17 
 .../riscv/rvv/autovec/unop/bswap16-run-0.c| 44 +
 .../riscv/rvv/autovec/vls/bswap16-0.c | 34 +++
 .../gcc.target/riscv/rvv/autovec/vls/perm-4.c |  4 +-
 5 files changed, 188 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/bswap16-0.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 23633a2a74d..c72e411f125 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3030,6 +3030,95 @@ shuffle_decompress_patterns (struct expand_vec_perm_d *d)
   return true;
 }
 
+static bool
+shuffle_bswap_pattern (struct expand_vec_perm_d *d)
+{
+  HOST_WIDE_INT diff;
+  unsigned i, size, step;
+
+  if (!d->one_vector_p || !d->perm[0].is_constant (&diff) || !diff)
+return false;
+
+  step = diff + 1;
+  size = step * GET_MODE_UNIT_BITSIZE (d->vmode);
+
+  switch (size)
+{
+case 16:
+  break;
+case 32:
+case 64:
+  /* We will have VEC_PERM_EXPR after rtl expand when invoking
+__builtin_bswap. It will generate about 9 instructions in
+loop as below, no matter it is bswap16, bswap32 or bswap64.
+  .L2:
+1 vle16.v v4,0(a0)
+2 vmv.v.x v2,a7
+3 vand.vv v2,v6,v2
+4 slli a2,a5,1
+5 vrgatherei16.vv v1,v4,v2
+6 sub a4,a4,a5
+7 vse16.v v1,0(a3)
+8 add a0,a0,a2
+9 add a3,a3,a2
+  bne a4,zero,.L2
+
+But for bswap16 we may have a even simple code gen, which
+has only 7 instructions in loop as below.
+  .L5
+1 vle8.v  v2,0(a5)
+2 addi a5,a5,32
+3 vsrl.vi v4,v2,8
+4 vsll.vi v2,v2,8
+5 vor.vv  v4,v4,v2
+6 vse8.v  v4,0(a4)
+7 addi a4,a4,32
+  bne a5,a6,.L5
+
+Unfortunately, the instructions in loop will grow to 13 and 24
+for bswap32 and bswap64. Thus, we will leverage vrgather (9 insn)
+for both the bswap64 and bswap32, but take shift and or (7 insn)
+for bswap16.
+   */
+default:
+  return false;
+}
+
+  for (i = 0; i < step; i++)
+if (!d->perm.series_p (i, step, diff - i, step))
+  return false;
+
+  if (d->testing_p)
+return true;
+
+  machine_mode vhi_mode;
+  poly_uint64 vhi_nunits = exact_div (GET_MODE_NUNITS (d->vmode), 2);
+
+  if (!get_vector_mode (HImode, vhi_nunits).exists (&vhi_mode))
+return false;
+
+  /* Step-1: Move op0 to src with VHI mode.  */
+  rtx src = gen_reg_rtx (vhi_mode);
+  emit_move_insn (src, gen_lowpart (vhi_mode, d->op0));
+
+  /* Step-2: Shift right 8 bits to dest.  */
+  rtx dest = expand_binop (vhi_mode, lshr_optab, src, gen_int_mode (8, Pmode),
+  NULL_RTX, 0, OPTAB_DIRECT);
+
+  /* Step-3: Shift left 8 bits to src.  */
+  src = expand_binop (vhi_mode, ashl_optab, src, gen_int_mode (8, Pmode),
+ NULL_RTX, 0, OPTAB_DIRECT);
+
+  /* Step-4: Logic Or dest and src to dest.  */
+  dest = expand_binop (vhi_mode, ior_optab, dest, src,
+  NULL_RTX, 0, OPTAB_DIRECT);
+
+  /* Step-5: Move src to target with VQI mode.  */
+  emit_move

[PATCH v1] RISC-V: Support FP lrint/lrintf auto vectorization

2023-10-11 Thread pan2 . li
From: Pan Li 

This patch would like to support the FP lrint/lrintf auto vectorization.

* long lrint (double) for rv64
* long lrintf (float) for rv32

Because the vectorizer only allows source and destination data types of
the same size, the standard name lrintmn2 only acts on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lrint (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrint (in[i]);
}

Before this patch:
.L3:
  ...
  fld  fa5,0(a1)
  fcvt.l.d a5,fa5,dyn
  sd   a5,-8(a0)
  ...
  bne  a1,a4,.L3

After this patch:
.L3:
  ...
  vsetvli a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli zero,a2,e64,m1,ta,ma
  vse64.v v1,0(a0)
  ...
  bne a2,zero,.L3

The remaining cases, such as SF => DI, HF => DI, DF => SI and HF => SI,
will be covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.
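
As a reminder of the result lrint is expected to produce in the default
rounding mode (round to nearest, ties to even), here is a small scalar
sketch. It does not call lrint itself. Instead it uses the well-known
add and subtract 2^52 trick, which is an illustration chosen for this
note and not something the patch does. The name rint_ties_even is
hypothetical, and the sketch assumes IEEE double, the default rounding
mode and |x| < 2^51.

```c
#include <assert.h>
#include <stdint.h>

/* Round X to the nearest integer, ties to even.  Adding 2^52 pushes
   the fraction bits out of the double significand, so the hardware
   rounding of that addition performs the rounding for us.  */
int64_t
rint_ties_even (double x)
{
  const double magic = 4503599627370496.0;   /* 2^52 */
  volatile double t;   /* volatile: keep the compiler from folding.  */

  assert (x > -2251799813685248.0 && x < 2251799813685248.0);

  if (x < 0)
    {
      t = x - magic;
      return (int64_t) (t + magic);
    }

  t = x + magic;
  return (int64_t) (t - magic);
}
```

Under these assumptions 0.5 and 2.5 round to 0 and 2, while 1.5 rounds
to 2.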

gcc/ChangeLog:

* config/riscv/autovec.md (lrint2): New pattern
for lrint/lrintf.
* config/riscv/riscv-protos.h (expand_vec_lrint): New func decl
for expanding lrint.
* config/riscv/riscv-v.cc (emit_vec_cvt_x_f): New helper func impl
for vfcvt.x.f.v.
(expand_vec_lrint): New function impl for expanding lrint.
* config/riscv/vector-iterators.md: New mode attr and iterator.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/test-math.h: New define for
CVT like test case.
* gcc.target/riscv/rvv/autovec/vls/def.h: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lrint-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lrint-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   | 11 +++
 gcc/config/riscv/riscv-protos.h   |  1 +
 gcc/config/riscv/riscv-v.cc   | 20 ++
 gcc/config/riscv/vector-iterators.md  | 69 +++
 .../riscv/rvv/autovec/unop/math-lrint-0.c | 14 
 .../riscv/rvv/autovec/unop/math-lrint-1.c | 14 
 .../riscv/rvv/autovec/unop/math-lrint-run-0.c | 63 +
 .../riscv/rvv/autovec/unop/math-lrint-run-1.c | 63 +
 .../riscv/rvv/autovec/unop/test-math.h| 24 +++
 .../gcc.target/riscv/rvv/autovec/vls/def.h|  9 +++
 .../riscv/rvv/autovec/vls/math-lrint-0.c  | 30 
 .../riscv/rvv/autovec/vls/math-lrint-1.c  | 30 
 12 files changed, 348 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lrint-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lrint-1.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 53e9d34eea1..dc76a01d82c 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2239,6 +2239,7 @@ (define_expand "avg3_ceil"
 ;; - round/roundf
 ;; - trunc/truncf
 ;; - roundeven/roundevenf
+;; - lrint/lrintf
 ;; -
 (define_expand "ceil2"
   [(match_operand:V_VLSF 0 "register_operand")
@@ -2309,3 +2310,13 @@ (define_expand "roundeven2"
 DONE;
   }
 )
+
+(define_expand "lrint2"
+  [(match_operand: 0 "register_operand")
+   (match_operand:V_VLS_FCONVERTL 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode);
+DONE;
+  }
+)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 43426a5326b..f6bd15b47b0 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -474,6 +474,7 @@ void expand_vec_rint (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_round (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_trunc (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_roundeven (rtx, rtx, machine_mode, machine_mode);
+void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode);
 #endif
 bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode,
  bool, void (*)(rtx *, rtx));
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index c72e411f125..64f99d85d91 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3911,6 +3911,16 @@ emit_vec_cvt_x_f (rtx op_dest, rtx op_src, rtx mask,
   emit

[PATCH v1] RISC-V: Support FP irintf auto vectorization

2023-10-11 Thread pan2 . li
From: Pan Li 

This patch would like to support the FP irintf auto vectorization.

* int irintf (float)

Because the vectorizer only allows source and destination data types of
the same size, the standard name lrintmn2 only acts on SF => SI.

Given we have code like:

void
test_irintf (int *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_irintf (in[i]);
}

Before this patch:
.L3:
  ...
  flw  fa5,0(a1)
  fcvt.w.s a5,fa5,dyn
  sw   a5,-4(a0)
  ...
  bne  a1,a4,.L3

After this patch:
.L3:
  ...
  vle32.v v1,0(a1)
  vfcvt.x.f.v v1,v1
  vse32.v v1,0(a0)
  ...
  bne a2,zero,.L3

The remaining cases, such as DF => SI and HF => SI, will be covered by
the hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

* config/riscv/autovec.md (lrint2): Rename from.
(lrint2): Rename to.
* config/riscv/vector-iterators.md: Rename and remove TARGET_64BIT.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-irint-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-irint-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   |  9 ++-
 gcc/config/riscv/vector-iterators.md  | 74 +--
 .../riscv/rvv/autovec/unop/math-irint-0.c | 14 
 .../riscv/rvv/autovec/unop/math-irint-run-0.c | 63 
 .../riscv/rvv/autovec/vls/math-irint-0.c  | 30 
 5 files changed, 149 insertions(+), 41 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-irint-0.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index dc76a01d82c..c3a51e22ceb 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2240,6 +2240,7 @@ (define_expand "avg3_ceil"
 ;; - trunc/truncf
 ;; - roundeven/roundevenf
 ;; - lrint/lrintf
+;; - irintf
 ;; -
 (define_expand "ceil2"
   [(match_operand:V_VLSF 0 "register_operand")
@@ -2311,12 +2312,12 @@ (define_expand "roundeven2"
   }
 )
 
-(define_expand "lrint2"
-  [(match_operand: 0 "register_operand")
-   (match_operand:V_VLS_FCONVERTL 1 "register_operand")]
+(define_expand "lrint2"
+  [(match_operand:0 "register_operand")
+   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode);
+riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, mode);
 DONE;
   }
 )
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index bb0c46ea30a..96ddd34c958 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -3281,8 +3281,8 @@ (define_mode_attr vnnconvert [
   (V512DI "v512hf")
 ])
 
-;; L indicates convert to long
-(define_mode_attr VLCONVERT [
+;; Convert to int, long and long long
+(define_mode_attr V_I_L_LL_CONVERT [
   (RVVM8SF "RVVM8SI") (RVVM4SF "RVVM4SI") (RVVM2SF "RVVM2SI")
   (RVVM1SF "RVVM1SI") (RVVMF2SF "RVVMF2SI")
 
@@ -3298,7 +3298,7 @@ (define_mode_attr VLCONVERT [
   (V512DF "V512DI")
 ])
 
-(define_mode_attr vlconvert [
+(define_mode_attr v_i_l_ll_convert [
   (RVVM8SF "rvvm8si") (RVVM4SF "rvvm4si") (RVVM2SF "rvvm2si")
   (RVVM1SF "rvvm1si") (RVVMF2SF "rvvmf2si")
 
@@ -3314,40 +3314,40 @@ (define_mode_attr vlconvert [
   (V512DF "v512di")
 ])
 
-(define_mode_iterator V_VLS_FCONVERTL [
-  (RVVM8SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (RVVM4SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (RVVM2SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (RVVM1SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (RVVMF2SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN > 32")
-
-  (RVVM8DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-  (RVVM4DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-  (RVVM2DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-  (RVVM1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-
-  (V1SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (V2SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (V4SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (V8SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (V16SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 64")
-  (V32SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 128")
-  (V64SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 256")
-  (V128SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARG

[PATCH v1] RISC-V: Support FP llrint auto vectorization

2023-10-11 Thread pan2 . li
From: Pan Li 

This patch would like to support the FP llrint auto vectorization.

* long long llrint (double)

From the standard name's perspective this is the DF => DI conversion,
which has been covered by previous patch(es). Thus, this patch only adds
some test cases.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/test-math.h: Add type int64_t.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llrint-0.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/unop/math-llrint-0.c| 14 +
 .../rvv/autovec/unop/math-llrint-run-0.c  | 63 +++
 .../riscv/rvv/autovec/unop/test-math.h|  2 +
 .../riscv/rvv/autovec/vls/math-llrint-0.c | 30 +
 4 files changed, 109 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llrint-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
new file mode 100644
index 000..2d90d232ba1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llrint:
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+*/
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llrint)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
new file mode 100644
index 000..6b69f5568e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
@@ -0,0 +1,63 @@
+/* { dg-do run { target { riscv_v && rv64 } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+int64_t out[ARRAY_SIZE];
+int64_t ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llrint)
+TEST_ASSERT (int64_t)
+
+TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llrint (1.2), 1)
+TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llrint (-1.2), 2)
+TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llrint (0.5), 3)
+TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llrint (-0.5), 4)
+TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llrint (0.1), 5)
+TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llrint (-0.1), 6)
+TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llrint (3.0), 7)
+TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llrint (-3.0), 8)
+TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llrint (4503599627370495.5), 9)
+TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llrint (4503599627370497.0), 10)
+TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llrint (-4503599627370495.5), 11)
+TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llrint (-4503599627370496.0), 12)
+TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llrint (-0.0), 13)
+TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llrint (-0.0), 14)
+TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llrint (9223372036854774784.0), 15)
+TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, __builtin_llrint (9223372036854775808.0), 16)
+TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llrint (-9223372036854775808.0), 17)
+TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, __builtin_llrint (-9223372036854777856.0), 18)
+TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llrint (__builtin_inf ()), 19)
+TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_llrint (-__builtin_inf ()), 20)
+TEST_INIT_CVT (double, __builtin_nan (""), int64_t, 0x7fffffffffffffff, 21)
+
+int
+main ()
+{
+  RUN_TEST_CVT (double, int64_t, 1, __builtin_llrint, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 2, __builtin_llrint, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 3, __builtin_llrint, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 4, __builtin_llrint, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 5, __builtin_llrint, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 6, __builtin_llrint, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 7, __builtin_llrint, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 8, __builtin_

[PATCH v1] RISC-V: Support FP lround/lroundf auto vectorization

2023-10-12 Thread pan2 . li
From: Pan Li 

This patch would like to support the FP lround/lroundf auto vectorization.

* long lround (double) for rv64
* long lroundf (float) for rv32

Because the vectorizer only allows source and destination data types of
the same size, the standard name lroundmn2 only acts on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lround (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
out[i] = __builtin_lround (in[i]);
}

Before this patch:
.L3:
  ...
  fld  fa5,0(a1)
  fcvt.l.d a5,fa5,rmm
  sd   a5,-8(a0)
  ...
  bne  a1,a4,.L3

After this patch:
  frrm a6
  ...
  fsrmi 4 // RMM
.L3:
  ...
  vsetvli a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli zero,a2,e64,m1,ta,ma
  vse64.v v1,0(a0)
  ...
  bne a2,zero,.L3
  ...
  fsrm a6
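
To make the role of the rounding-mode instructions concrete, here is a
scalar reference model in C. It is a sketch under stated assumptions, not
what GCC emits: ref_lround and model_lround_loop are hypothetical names,
inputs are assumed to be well inside the range of long, and the comments
only map the steps onto the frrm/fsrmi/fsrm sequence shown above.

```c
/* Round to nearest, ties away from zero (RMM, frm value 4).
   Assumption: x is well inside the range of long.  */
long
ref_lround (double x)
{
  long t = (long) x;              /* truncate toward zero */
  double frac = x - (double) t;   /* exact for in-range inputs */

  if (frac >= 0.5)
    t++;                          /* tie or more: move away from zero */
  else if (frac <= -0.5)
    t--;
  return t;
}

void
model_lround_loop (long *out, const double *in, unsigned count)
{
  /* frrm: the dynamic rounding mode is saved before the loop.  */
  /* fsrmi 4: the mode is switched to RMM once, outside the loop.  */
  for (unsigned i = 0; i < count; i++)
    out[i] = ref_lround (in[i]);  /* vfcvt.x.f.v converts under RMM */
  /* fsrm: the saved rounding mode is restored after the loop.  */
}
```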

The remaining cases, such as SF => DI, HF => DI, DF => SI and HF => SI, will
be covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.
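
The split between the two paths can be pictured with a small helper. This
is only a sketch: the enum and classify_conversion are hypothetical names,
and the real decision is made inside the vectorizer and the target hooks,
not by a function like this.

```c
#include <stddef.h>

enum expand_path
{
  VIA_STANDARD_NAME,             /* same-size FP and integer elements */
  VIA_VECTORIZED_FUNCTION_HOOK   /* element sizes differ */
};

/* Sizes are element sizes in bytes: HF = 2, SF = 4, DF = 8, SI = 4, DI = 8.  */
enum expand_path
classify_conversion (size_t fp_bytes, size_t int_bytes)
{
  return fp_bytes == int_bytes ? VIA_STANDARD_NAME
			       : VIA_VECTORIZED_FUNCTION_HOOK;
}
```

With this model, DF => DI (8, 8) and SF => SI (4, 4) take the standard-name
path, while SF => DI (4, 8) or DF => SI (8, 4) fall to the hook.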

gcc/ChangeLog:

* config/riscv/autovec.md (lround2): New
pattern for lround/lroundf.
* config/riscv/riscv-protos.h (enum insn_type): New enum value.
(expand_vec_lround): New func decl for expanding lround.
* config/riscv/riscv-v.cc (expand_vec_lround): New func impl
for expanding lround.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-lround-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lround-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lround-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   | 10 +++
 gcc/config/riscv/riscv-protos.h   |  2 +
 gcc/config/riscv/riscv-v.cc   | 10 +++
 .../riscv/rvv/autovec/unop/math-lround-0.c| 19 +
 .../riscv/rvv/autovec/unop/math-lround-1.c| 19 +
 .../rvv/autovec/unop/math-lround-run-0.c  | 72 +++
 .../rvv/autovec/unop/math-lround-run-1.c  | 72 +++
 .../riscv/rvv/autovec/vls/math-lround-0.c | 30 
 .../riscv/rvv/autovec/vls/math-lround-1.c | 30 
 9 files changed, 264 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lround-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lround-1.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index ebc51ea69fd..33b11723c21 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2321,3 +2321,13 @@ (define_expand "lrint2"
 DONE;
   }
 )
+
+(define_expand "lround2"
+  [(match_operand:0 "register_operand")
+   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 8c9f7e0ab11..b7eeeb8f55d 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -302,6 +302,7 @@ enum insn_type : unsigned int
   UNARY_OP_TAMA = __MASK_OP_TAMA | UNARY_OP_P,
   UNARY_OP_TAMU = __MASK_OP_TAMU | UNARY_OP_P,
   UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P,
+  UNARY_OP_FRM_RMM = UNARY_OP | FRM_RMM_P,
   UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P,
   UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P,
   UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P,
@@ -475,6 +476,7 @@ void expand_vec_round (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_trunc (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_roundeven (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode);
+void expand_vec_lround (rtx, rtx, machine_mode, machine_mode);
 #endif
 bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode,
  bool, void (*)(rtx *, rtx));
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index a75eb59eb43..b61c745678b 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4122,4 +4122,14 @@ expand_vec_lrint (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
   emit_vec_cvt_x_f (op_0, op_1, UNARY_OP_FRM_DYN, vec_fp_mode);
 }
 
+void
+expand_vec_lround (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
+  machine_mode vec_long_mode)
+{
+  gcc_assert (known_eq (GET_MODE_SIZE (vec_fp

[PATCH v1] RISC-V: Support FP lceil/lceilf auto vectorization

2023-10-12 Thread pan2 . li
From: Pan Li 

This patch would like to support the FP lceil/lceilf auto vectorization.

* long lceil (double) for rv64
* long lceilf (float) for rv32

Because the vectorizer only allows source and destination data types of
the same size, the standard name lceilmn2 only acts on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lceil (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
out[i] = __builtin_lceil (in[i]);
}

Before this patch:
.L3:
  ...
  fld      fa5,0(a1)
  fcvt.l.d a5,fa5,rup
  sd       a5,-8(a0)
  ...
  bne      a1,a4,.L3

After this patch:
  frrm a6
  ...
  fsrmi 3 // RUP
.L3:
  ...
  vsetvli a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli zero,a2,e64,m1,ta,ma
  vse64.v v1,0(a0)
  ...
  bne a2,zero,.L3
  ...
  fsrm a6
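
For reference, the RUP rounding mode (frm value 3) selected above behaves
like the following scalar model. This is a sketch, not the expander code:
ref_lceil is a hypothetical name and the input is assumed to be well
inside the range of long.

```c
/* Round toward +infinity (RUP).
   Assumption: x is well inside the range of long.  */
long
ref_lceil (double x)
{
  long t = (long) x;      /* truncation already rounds negatives upward */
  if ((double) t < x)
    t++;                  /* positive non-integers need one more step */
  return t;
}
```

For example, 1.2 maps to 2 while -1.2 maps to -1.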

The remaining cases, such as SF => DI, HF => DI, DF => SI and HF => SI, will
be covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

* config/riscv/autovec.md (lceil2): New
pattern for lceil/lceilf.
* config/riscv/riscv-protos.h (enum insn_type): New enum value.
(expand_vec_lceil): New func decl for expanding lceil.
* config/riscv/riscv-v.cc (expand_vec_lceil): New func impl
for expanding lceil.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-lceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lceil-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   | 11 +++
 gcc/config/riscv/riscv-protos.h   |  2 +
 gcc/config/riscv/riscv-v.cc   | 10 +++
 .../riscv/rvv/autovec/unop/math-lceil-0.c | 19 +
 .../riscv/rvv/autovec/unop/math-lceil-1.c | 19 +
 .../riscv/rvv/autovec/unop/math-lceil-run-0.c | 69 +++
 .../riscv/rvv/autovec/unop/math-lceil-run-1.c | 69 +++
 .../riscv/rvv/autovec/vls/math-lceil-0.c  | 30 
 .../riscv/rvv/autovec/vls/math-lceil-1.c  | 30 
 9 files changed, 259 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lceil-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lceil-1.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 33b11723c21..267691a0095 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2241,6 +2241,7 @@ (define_expand "avg3_ceil"
 ;; - roundeven/roundevenf
 ;; - lrint/lrintf
 ;; - irintf
+;; - lceil/lceilf
 ;; -
 (define_expand "ceil2"
   [(match_operand:V_VLSF 0 "register_operand")
@@ -2331,3 +2332,13 @@ (define_expand "lround2"
 DONE;
   }
 )
+
+(define_expand "lceil2"
+  [(match_operand:0 "register_operand")
+   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index b7eeeb8f55d..ab65ab19524 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -303,6 +303,7 @@ enum insn_type : unsigned int
   UNARY_OP_TAMU = __MASK_OP_TAMU | UNARY_OP_P,
   UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P,
   UNARY_OP_FRM_RMM = UNARY_OP | FRM_RMM_P,
+  UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P,
   UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P,
   UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P,
   UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P,
@@ -477,6 +478,7 @@ void expand_vec_trunc (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_roundeven (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_lround (rtx, rtx, machine_mode, machine_mode);
+void expand_vec_lceil (rtx, rtx, machine_mode, machine_mode);
 #endif
 bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode,
  bool, void (*)(rtx *, rtx));
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index b61c745678b..b03213dd8ed 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4132,4 +4132,14 @@ expand_vec_lround (rtx op_0, rtx op

[PATCH v1] RISC-V: Support FP lfloor/lfloorf auto vectorization

2023-10-12 Thread pan2 . li
From: Pan Li 

This patch would like to support the FP lfloor/lfloorf auto vectorization.

* long lfloor (double) for rv64
* long lfloorf (float) for rv32

Because the vectorizer only allows source and destination data types of
the same size, the standard name lfloormn2 only acts on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lfloor (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
out[i] = __builtin_lfloor (in[i]);
}

Before this patch:
.L3:
  ...
  fld      fa5,0(a1)
  fcvt.l.d a5,fa5,rdn
  sd       a5,-8(a0)
  ...
  bne      a1,a4,.L3

After this patch:
  frrm a6
  ...
  fsrmi 2 // RDN
.L3:
  ...
  vsetvli a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli zero,a2,e64,m1,ta,ma
  vse64.v v1,0(a0)
  ...
  bne a2,zero,.L3
  ...
  fsrm a6
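
The RDN rounding mode (frm value 2) used here mirrors the lceil case in
the opposite direction. A scalar sketch, with the hypothetical name
ref_lfloor and the assumption that the input is well inside the range of
long:

```c
/* Round toward -infinity (RDN).
   Assumption: x is well inside the range of long.  */
long
ref_lfloor (double x)
{
  long t = (long) x;      /* truncation already rounds positives downward */
  if ((double) t > x)
    t--;                  /* negative non-integers need one more step */
  return t;
}
```

For example, 1.2 maps to 1 while -1.2 maps to -2.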

The remaining cases, such as SF => DI, HF => DI, DF => SI and HF => SI, will
be covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

* config/riscv/autovec.md (lfloor2): New
pattern for lfloor/lfloorf.
* config/riscv/riscv-protos.h (enum insn_type): New enum value.
(expand_vec_lfloor): New func decl for expanding lfloor.
* config/riscv/riscv-v.cc (expand_vec_lfloor): New func impl
for expanding lfloor.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-lfloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lfloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lfloor-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   | 11 +++
 gcc/config/riscv/riscv-protos.h   |  2 +
 gcc/config/riscv/riscv-v.cc   | 10 +++
 .../riscv/rvv/autovec/unop/math-lfloor-0.c| 19 +
 .../riscv/rvv/autovec/unop/math-lfloor-1.c| 19 +
 .../rvv/autovec/unop/math-lfloor-run-0.c  | 69 +++
 .../rvv/autovec/unop/math-lfloor-run-1.c  | 69 +++
 .../riscv/rvv/autovec/vls/math-lfloor-0.c | 30 
 .../riscv/rvv/autovec/vls/math-lfloor-1.c | 30 
 9 files changed, 259 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lfloor-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lfloor-1.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 267691a0095..c5b1e52cbf9 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2242,6 +2242,7 @@ (define_expand "avg3_ceil"
 ;; - lrint/lrintf
 ;; - irintf
 ;; - lceil/lceilf
+;; - lfloor/lfloorf
 ;; -
 (define_expand "ceil2"
   [(match_operand:V_VLSF 0 "register_operand")
@@ -2342,3 +2343,13 @@ (define_expand "lceil2"
 DONE;
   }
 )
+
+(define_expand "lfloor2"
+  [(match_operand:0 "register_operand")
+   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index ab65ab19524..49bdcdf2f93 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -304,6 +304,7 @@ enum insn_type : unsigned int
   UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P,
   UNARY_OP_FRM_RMM = UNARY_OP | FRM_RMM_P,
   UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P,
+  UNARY_OP_FRM_RDN = UNARY_OP | FRM_RDN_P,
   UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P,
   UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P,
   UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P,
@@ -479,6 +480,7 @@ void expand_vec_roundeven (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_lround (rtx, rtx, machine_mode, machine_mode);
 void expand_vec_lceil (rtx, rtx, machine_mode, machine_mode);
+void expand_vec_lfloor (rtx, rtx, machine_mode, machine_mode);
 #endif
 bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode,
  bool, void (*)(rtx *, rtx));
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index b03213dd8ed..21d86c3f917 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4142,4 +4142,14 @@ expand_vec_lceil (

[PATCH v1] RISC-V: Leverage stdint-gcc.h for RVV test cases

2023-10-12 Thread pan2 . li
From: Pan Li 

Use stdint-gcc.h for the int64_t type instead of a local typedef.
Otherwise the typedef may conflict with stdint-gcc.h when it is
included elsewhere.
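
A minimal sketch of why the local typedef is fragile, assuming a C11
compiler. It uses the portable <stdint.h> rather than stdint-gcc.h, and
int64_underlying_type is a hypothetical helper that only reports which
type the header picked.

```c
#include <stdint.h>

/* typedef long long int64_t;
   Re-adding a private typedef like the one above is an error wherever
   the header defines int64_t as plain long (typical for LP64 targets),
   because the two declarations then name different types.  */

const char *
int64_underlying_type (void)
{
  return _Generic ((int64_t) 0,
		   long: "long",
		   long long: "long long",
		   default: "other");
}
```

Taking the type from the header keeps the tests consistent with any other
file that includes it.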

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: Include
stdint-gcc.h for int types.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/test-math.h: Remove int64_t
typedef.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c | 1 +
 .../gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c   | 1 +
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h | 2 --
 3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
index 2d90d232ba1..4bf125f8cc8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
@@ -2,6 +2,7 @@
 /* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
 /* { dg-final { check-function-bodies "**" "" } } */
 
+#include <stdint-gcc.h>
 #include "test-math.h"
 
 /*
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
index 6b69f5568e9..409175a8dff 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
@@ -1,6 +1,7 @@
 /* { dg-do run { target { riscv_v && rv64 } } } */
 /* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
 
+#include <stdint-gcc.h>
 #include "test-math.h"
 
 #define ARRAY_SIZE 128
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h
index 3867bc50a14..a1c9d55bd48 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h
@@ -68,8 +68,6 @@
 #define FRM_RMM 4
 #define FRM_DYN 7
 
-typedef long long int64_t;
-
 static inline void
 set_rm (unsigned rm)
 {
-- 
2.34.1



[PATCH v1] RISC-V: Add test for FP iroundf auto vectorization

2023-10-12 Thread pan2 . li
From: Pan Li 

The FP API below is already supported, because it shares the same
standard name and machine mode.

int iroundf (float);

This patch adds test cases to ensure correctness.
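
The run test also covers out-of-range and NaN inputs. RISC-V
float-to-integer conversions saturate in those cases, which the following
scalar model tries to capture. It is a sketch: ref_iroundf_sat is a
hypothetical name, and the model should be compiled without -ffast-math so
that the NaN check is kept.

```c
#include <limits.h>

/* Round to nearest, ties away from zero (RMM), with saturation.
   Assumes int is 32 bits wide, as on RISC-V.  */
int
ref_iroundf_sat (float x)
{
  if (x != x)
    return INT_MAX;               /* NaN converts to the maximum value */
  if (x >= 2147483648.0f)
    return INT_MAX;               /* too large, includes +inf */
  if (x < -2147483648.0f)
    return INT_MIN;               /* too small, includes -inf */

  int t = (int) x;                /* in range, so truncation is defined */
  float frac = x - (float) t;     /* nonzero only when |x| < 2^23 */
  if (frac >= 0.5f)
    t++;
  else if (frac <= -0.5f)
    t--;
  return t;
}
```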

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-iround-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-iround-0.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/unop/math-iround-0.c| 19 ++
 .../rvv/autovec/unop/math-iround-run-0.c  | 63 +++
 .../riscv/rvv/autovec/vls/math-iround-0.c | 30 +
 3 files changed, 112 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-iround-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-0.c
new file mode 100644
index 000..f32515d1403
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-0.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_int___builtin_iroundf:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+4
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (float, int, __builtin_iroundf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c
new file mode 100644
index 000..2e05e443afe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c
@@ -0,0 +1,63 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+float in[ARRAY_SIZE];
+int out[ARRAY_SIZE];
+int ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (float, int, __builtin_iroundf)
+TEST_ASSERT (int)
+
+TEST_INIT_CVT (float, 1.2, int, __builtin_iroundf (1.2), 1)
+TEST_INIT_CVT (float, -1.2, int, __builtin_iroundf (-1.2), 2)
+TEST_INIT_CVT (float, 0.5, int, __builtin_iroundf (0.5), 3)
+TEST_INIT_CVT (float, -0.5, int, __builtin_iroundf (-0.5), 4)
+TEST_INIT_CVT (float, 0.1, int, __builtin_iroundf (0.1), 5)
+TEST_INIT_CVT (float, -0.1, int, __builtin_iroundf (-0.1), 6)
+TEST_INIT_CVT (float, 3.0, int, __builtin_iroundf (3.0), 7)
+TEST_INIT_CVT (float, -3.0, int, __builtin_iroundf (-3.0), 8)
+TEST_INIT_CVT (float, 8388607.5, int, __builtin_iroundf (8388607.5), 9)
+TEST_INIT_CVT (float, 8388609.0, int, __builtin_iroundf (8388609.0), 10)
+TEST_INIT_CVT (float, -8388607.5, int, __builtin_iroundf (-8388607.5), 11)
+TEST_INIT_CVT (float, -8388609.0, int, __builtin_iroundf (-8388609.0), 12)
+TEST_INIT_CVT (float, 0.0, int, __builtin_iroundf (-0.0), 13)
+TEST_INIT_CVT (float, -0.0, int, __builtin_iroundf (-0.0), 14)
+TEST_INIT_CVT (float, 2147483520.0, int, __builtin_iroundf (2147483520.0), 15)
+TEST_INIT_CVT (float, 2147483648.0, int, 0x7fffffff, 16)
+TEST_INIT_CVT (float, -2147483648.0, int, __builtin_iroundf (-2147483648.0), 17)
+TEST_INIT_CVT (float, -2147483904.0, int, 0x80000000, 18)
+TEST_INIT_CVT (float, __builtin_inf (), int, __builtin_iroundf (__builtin_inff ()), 19)
+TEST_INIT_CVT (float, -__builtin_inf (), int, __builtin_iroundf (-__builtin_inff ()), 20)
+TEST_INIT_CVT (float, __builtin_nanf (""), int, 0x7fffffff, 21)
+
+int
+main ()
+{
+  RUN_TEST_CVT (float, int, 1, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 2, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 3, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 4, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 5, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 6, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 7, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 8, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 9, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 10, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 11, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 12, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 13, __builtin_iroundf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 1

[PATCH v1] RISC-V: Add test for FP llround auto vectorization

2023-10-12 Thread pan2 . li
From: Pan Li 

The FP API below is already supported, because it shares the same
standard name and machine mode.

long long llround (double);

This patch adds test cases to ensure correctness.
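
The large constants in the run test sit on the edges of the double
format. The sketch below shows where they come from; TWO_POW_52,
TWO_POW_63 and has_fraction are hypothetical names used only for this
illustration.

```c
#include <stdint.h>

/* 2^52: from here on every double is an integer.  */
#define TWO_POW_52 4503599627370496.0
/* 2^63: the first positive double that no longer fits in int64_t.  */
#define TWO_POW_63 9223372036854775808.0

/* Nonzero when x still carries a fractional part.
   Assumption: |x| < 2^63, so the cast is defined.  */
int
has_fraction (double x)
{
  return x != (double) (int64_t) x;
}
```

So 4503599627370495.5 (2^52 - 0.5) is the largest double with a
fractional part, 9223372036854774784.0 (2^63 - 1024) is the largest
double that fits in int64_t, and -9223372036854777856.0 (-2^63 - 2048) is
the first double below the int64_t range.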

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llround-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llround-0.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/unop/math-llround-0.c   | 20 ++
 .../rvv/autovec/unop/math-llround-run-0.c | 64 +++
 .../riscv/rvv/autovec/vls/math-llround-0.c| 30 +
 3 files changed, 114 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llround-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c
new file mode 100644
index 000..4f8b4553a91
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <stdint-gcc.h>
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llround:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+4
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llround)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c
new file mode 100644
index 000..c5b60847cc7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c
@@ -0,0 +1,64 @@
+/* { dg-do run { target { riscv_v && rv64 } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include <stdint-gcc.h>
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+int64_t out[ARRAY_SIZE];
+int64_t ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llround)
+TEST_ASSERT (int64_t)
+
+TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llround (1.2), 1)
+TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llround (-1.2), 2)
+TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llround (0.5), 3)
+TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llround (-0.5), 4)
+TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llround (0.1), 5)
+TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llround (-0.1), 6)
+TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llround (3.0), 7)
+TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llround (-3.0), 8)
+TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llround (4503599627370495.5), 9)
+TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llround (4503599627370497.0), 10)
+TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llround (-4503599627370495.5), 11)
+TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llround (-4503599627370496.0), 12)
+TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llround (-0.0), 13)
+TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llround (-0.0), 14)
+TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llround (9223372036854774784.0), 15)
+TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, 0x7fffffffffffffff, 16)
+TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llround (-9223372036854775808.0), 17)
+TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, 0x8000000000000000, 18)
+TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llround (__builtin_inf ()), 19)
+TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_llround (-__builtin_inf ()), 20)
+TEST_INIT_CVT (double, __builtin_nan (""), int64_t, 0x7fffffffffffffff, 21)
+
+int
+main ()
+{
+  RUN_TEST_CVT (double, int64_t, 1, __builtin_llround, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 2, __builtin_llround, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 3, __builtin_llround, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 4, __builtin_llround, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 5, __builtin_llround, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 6, __builtin_llround, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 7, __builtin_llround, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 8, __builtin_llround, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 9, __bui

[PATCH v1] RISC-V: Add test for FP llceil auto vectorization

2023-10-13 Thread pan2 . li
From: Pan Li 

The FP API below is already supported, because it shares the same
standard name and machine mode.

long long llceil (double);

This patch adds test cases to ensure correctness.
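
The definitions of the TEST_INIT_CVT and RUN_TEST_CVT macros from
test-math.h are not shown in this patch. As a rough, hypothetical picture
of what one such pair checks, the sketch below fills an input array with a
single value, runs the conversion loop and counts mismatches against the
expected result. ref_llceil and run_case are names invented for this
illustration, and ref_llceil stands in for __builtin_llceil.

```c
#include <stdint.h>

#define ARRAY_SIZE 128

/* Stand-in for the conversion under test: round toward +infinity.
   Assumption: x is well inside the int64_t range.  */
static int64_t
ref_llceil (double x)
{
  int64_t t = (int64_t) x;
  if ((double) t < x)
    t++;
  return t;
}

/* Returns the number of elements that differ from EXPECTED.  */
unsigned
run_case (double input, int64_t expected)
{
  static double in[ARRAY_SIZE];
  static int64_t out[ARRAY_SIZE];
  unsigned mismatches = 0;

  for (unsigned i = 0; i < ARRAY_SIZE; i++)
    in[i] = input;
  for (unsigned i = 0; i < ARRAY_SIZE; i++)
    out[i] = ref_llceil (in[i]);      /* the loop the vectorizer targets */
  for (unsigned i = 0; i < ARRAY_SIZE; i++)
    mismatches += (out[i] != expected);
  return mismatches;
}
```

A passing case returns 0, and a wrong expectation is reported once per
element.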

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llceil-0.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/unop/math-llceil-0.c| 20 ++
 .../rvv/autovec/unop/math-llceil-run-0.c  | 64 +++
 .../riscv/rvv/autovec/vls/math-llceil-0.c | 30 +
 3 files changed, 114 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llceil-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c
new file mode 100644
index 000..3480c3ea91d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <stdint-gcc.h>
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llceil:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llceil)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c
new file mode 100644
index 000..5ccbe64ffb5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c
@@ -0,0 +1,64 @@
+/* { dg-do run { target { riscv_v && rv64 } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include 
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+int64_t out[ARRAY_SIZE];
+int64_t ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llceil)
+TEST_ASSERT (int64_t)
+
+TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llceil (1.2), 1)
+TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llceil (-1.2), 2)
+TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llceil (0.5), 3)
+TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llceil (-0.5), 4)
+TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llceil (0.1), 5)
+TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llceil (-0.1), 6)
+TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llceil (3.0), 7)
+TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llceil (-3.0), 8)
+TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llceil (4503599627370495.5), 9)
+TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llceil (4503599627370497.0), 10)
+TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llceil (-4503599627370495.5), 11)
+TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llceil (-4503599627370496.0), 12)
+TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llceil (-0.0), 13)
+TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llceil (-0.0), 14)
+TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llceil (9223372036854774784.0), 15)
+TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, 0x7fff, 16)
+TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llceil (-9223372036854775808.0), 17)
+TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, 0x8000, 18)
+TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llceil (__builtin_inf ()), 19)
+TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_llceil (-__builtin_inf ()), 20)
+TEST_INIT_CVT (double, __builtin_nan (""), int64_t, 0x7fff, 21)
+
+int
+main ()
+{
+  RUN_TEST_CVT (double, int64_t, 1, __builtin_llceil, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 2, __builtin_llceil, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 3, __builtin_llceil, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 4, __builtin_llceil, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 5, __builtin_llceil, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 6, __builtin_llceil, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 7, __builtin_llceil, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 8, __builtin_llceil, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 9, __builtin_llceil, in, out, ref, ARRAY_SIZE);
+

[PATCH v1] RISC-V: Add test for FP iceil auto vectorization

2023-10-13 Thread pan2 . li
From: Pan Li 

The FP API below is already supported because it shares the same
standard name and machine mode.

int iceil (float);

This patch adds test cases to ensure correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-iceil-0.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/unop/math-iceil-0.c | 19 ++
 .../riscv/rvv/autovec/unop/math-iceil-run-0.c | 63 +++
 .../riscv/rvv/autovec/vls/math-iceil-0.c  | 30 +
 3 files changed, 112 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-iceil-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c
new file mode 100644
index 000..2d4a1d163d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_int___builtin_iceilf:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (float, int, __builtin_iceilf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c
new file mode 100644
index 000..714173a7f8b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c
@@ -0,0 +1,63 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+float in[ARRAY_SIZE];
+int out[ARRAY_SIZE];
+int ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (float, int, __builtin_iceilf)
+TEST_ASSERT (int)
+
+TEST_INIT_CVT (float, 1.2, int, __builtin_iceilf (1.2), 1)
+TEST_INIT_CVT (float, -1.2, int, __builtin_iceilf (-1.2), 2)
+TEST_INIT_CVT (float, 0.5, int, __builtin_iceilf (0.5), 3)
+TEST_INIT_CVT (float, -0.5, int, __builtin_iceilf (-0.5), 4)
+TEST_INIT_CVT (float, 0.1, int, __builtin_iceilf (0.1), 5)
+TEST_INIT_CVT (float, -0.1, int, __builtin_iceilf (-0.1), 6)
+TEST_INIT_CVT (float, 3.0, int, __builtin_iceilf (3.0), 7)
+TEST_INIT_CVT (float, -3.0, int, __builtin_iceilf (-3.0), 8)
+TEST_INIT_CVT (float, 8388607.5, int, __builtin_iceilf (8388607.5), 9)
+TEST_INIT_CVT (float, 8388609.0, int, __builtin_iceilf (8388609.0), 10)
+TEST_INIT_CVT (float, -8388607.5, int, __builtin_iceilf (-8388607.5), 11)
+TEST_INIT_CVT (float, -8388609.0, int, __builtin_iceilf (-8388609.0), 12)
+TEST_INIT_CVT (float, 0.0, int, __builtin_iceilf (-0.0), 13)
+TEST_INIT_CVT (float, -0.0, int, __builtin_iceilf (-0.0), 14)
+TEST_INIT_CVT (float, 2147483520.0, int, __builtin_iceilf (2147483520.0), 15)
+TEST_INIT_CVT (float, 2147483648.0, int, 0x7fff, 16)
+TEST_INIT_CVT (float, -2147483648.0, int, __builtin_iceilf (-2147483648.0), 17)
+TEST_INIT_CVT (float, -2147483904.0, int, 0x8000, 18)
+TEST_INIT_CVT (float, __builtin_inf (), int, __builtin_iceilf (__builtin_inff ()), 19)
+TEST_INIT_CVT (float, -__builtin_inf (), int, __builtin_iceilf (-__builtin_inff ()), 20)
+TEST_INIT_CVT (float, __builtin_nanf (""), int, 0x7fff, 21)
+
+int
+main ()
+{
+  RUN_TEST_CVT (float, int, 1, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 2, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 3, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 4, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 5, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 6, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 7, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 8, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 9, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 10, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 11, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 12, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 13, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 14, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+ 

[PATCH v1] RISC-V: Add test for FP ifloor auto vectorization

2023-10-13 Thread pan2 . li
From: Pan Li 

The FP API below is already supported because it shares the same
standard name and machine mode.

int ifloor (float);

This patch adds test cases to ensure correctness.
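
The expected assembly in the compile test for this patch matches
"fsrmi 2", where the ceil tests match "fsrmi 3". In the RISC-V frm
encoding, 2 is RDN (round down) and 3 is RUP (round up). The sketch
below models that difference in plain C. The function cvt_x_f_model and
the enum names are hypothetical, and the model assumes finite inputs that
fit in int.

```c
/* RISC-V frm encodings for the two static rounding modes used by the
   floor and ceil tests.  */
enum frm_mode
{
  FRM_RDN = 2,	/* Round down, toward -infinity.  */
  FRM_RUP = 3	/* Round up, toward +infinity.  */
};

/* Hypothetical model of a float to int conversion under a static
   rounding mode.  Assumes x is finite and fits in int.  */
static int
cvt_x_f_model (float x, enum frm_mode mode)
{
  int t = (int) x;	/* C conversion truncates toward zero.  */

  if (mode == FRM_RDN && (float) t > x)
    return t - 1;
  if (mode == FRM_RUP && (float) t < x)
    return t + 1;
  return t;
}
```

With FRM_RDN the model behaves like ifloor, and with FRM_RUP like iceil.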

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-ifloor-0.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/unop/math-ifloor-0.c| 19 ++
 .../rvv/autovec/unop/math-ifloor-run-0.c  | 63 +++
 .../riscv/rvv/autovec/vls/math-ifloor-0.c | 30 +
 3 files changed, 112 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-ifloor-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c
new file mode 100644
index 000..b9ec415d690
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_int___builtin_ifloorf:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+2
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (float, int, __builtin_ifloorf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c
new file mode 100644
index 000..8ef4da0ea88
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c
@@ -0,0 +1,63 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+float in[ARRAY_SIZE];
+int out[ARRAY_SIZE];
+int ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (float, int, __builtin_ifloorf)
+TEST_ASSERT (int)
+
+TEST_INIT_CVT (float, 1.2, int, __builtin_ifloorf (1.2), 1)
+TEST_INIT_CVT (float, -1.2, int, __builtin_ifloorf (-1.2), 2)
+TEST_INIT_CVT (float, 0.5, int, __builtin_ifloorf (0.5), 3)
+TEST_INIT_CVT (float, -0.5, int, __builtin_ifloorf (-0.5), 4)
+TEST_INIT_CVT (float, 0.1, int, __builtin_ifloorf (0.1), 5)
+TEST_INIT_CVT (float, -0.1, int, __builtin_ifloorf (-0.1), 6)
+TEST_INIT_CVT (float, 3.0, int, __builtin_ifloorf (3.0), 7)
+TEST_INIT_CVT (float, -3.0, int, __builtin_ifloorf (-3.0), 8)
+TEST_INIT_CVT (float, 8388607.5, int, __builtin_ifloorf (8388607.5), 9)
+TEST_INIT_CVT (float, 8388609.0, int, __builtin_ifloorf (8388609.0), 10)
+TEST_INIT_CVT (float, -8388607.5, int, __builtin_ifloorf (-8388607.5), 11)
+TEST_INIT_CVT (float, -8388609.0, int, __builtin_ifloorf (-8388609.0), 12)
+TEST_INIT_CVT (float, 0.0, int, __builtin_ifloorf (-0.0), 13)
+TEST_INIT_CVT (float, -0.0, int, __builtin_ifloorf (-0.0), 14)
+TEST_INIT_CVT (float, 2147483520.0, int, __builtin_ifloorf (2147483520.0), 15)
+TEST_INIT_CVT (float, 2147483648.0, int, 0x7fff, 16)
+TEST_INIT_CVT (float, -2147483648.0, int, __builtin_ifloorf (-2147483648.0), 17)
+TEST_INIT_CVT (float, -2147483904.0, int, 0x8000, 18)
+TEST_INIT_CVT (float, __builtin_inf (), int, __builtin_ifloorf (__builtin_inff ()), 19)
+TEST_INIT_CVT (float, -__builtin_inf (), int, __builtin_ifloorf (-__builtin_inff ()), 20)
+TEST_INIT_CVT (float, __builtin_nanf (""), int, 0x7fff, 21)
+
+int
+main ()
+{
+  RUN_TEST_CVT (float, int, 1, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 2, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 3, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 4, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 5, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 6, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 7, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 8, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 9, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 10, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 11, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 12, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 13, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 14

[PATCH v1] RISC-V: Add test for FP llfloor auto vectorization

2023-10-13 Thread pan2 . li
From: Pan Li 

The FP API below is already supported because it shares the same
standard name and machine mode.

long long llfloor (double);

This patch adds test cases to ensure correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llfloor-0.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/unop/math-llfloor-0.c   | 20 ++
 .../rvv/autovec/unop/math-llfloor-run-0.c | 64 +++
 .../riscv/rvv/autovec/vls/math-llfloor-0.c| 30 +
 3 files changed, 114 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llfloor-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c
new file mode 100644
index 000..4b10f966015
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include 
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llfloor:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+2
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llfloor)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c
new file mode 100644
index 000..22829132e96
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c
@@ -0,0 +1,64 @@
+/* { dg-do run { target { riscv_v && rv64 } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include 
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+int64_t out[ARRAY_SIZE];
+int64_t ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llfloor)
+TEST_ASSERT (int64_t)
+
+TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llfloor (1.2), 1)
+TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llfloor (-1.2), 2)
+TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llfloor (0.5), 3)
+TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llfloor (-0.5), 4)
+TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llfloor (0.1), 5)
+TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llfloor (-0.1), 6)
+TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llfloor (3.0), 7)
+TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llfloor (-3.0), 8)
+TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llfloor (4503599627370495.5), 9)
+TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llfloor (4503599627370497.0), 10)
+TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llfloor (-4503599627370495.5), 11)
+TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llfloor (-4503599627370496.0), 12)
+TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llfloor (-0.0), 13)
+TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llfloor (-0.0), 14)
+TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llfloor (9223372036854774784.0), 15)
+TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, 0x7fff, 16)
+TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llfloor (-9223372036854775808.0), 17)
+TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, 0x8000, 18)
+TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llfloor (__builtin_inf ()), 19)
+TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_llfloor (-__builtin_inf ()), 20)
+TEST_INIT_CVT (double, __builtin_nan (""), int64_t, 0x7fff, 21)
+
+int
+main ()
+{
+  RUN_TEST_CVT (double, int64_t, 1, __builtin_llfloor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 2, __builtin_llfloor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 3, __builtin_llfloor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 4, __builtin_llfloor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 5, __builtin_llfloor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 6, __builtin_llfloor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 7, __builtin_llfloor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 8, __builtin_llfloor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (double, int64_t, 9, __bui

[PATCH v1] RISC-V: Refine run test cases of math autovec

2023-10-13 Thread pan2 . li
From: Pan Li 

For the run test cases of math autovec, we need a reference value to
check whether the return value is as expected.

The previous patches hardcoded the reference value. For example, for
ceil after autovec:

ASSERT (CEIL (Vector {1.2,...}) == Vector {2.0, ...});

We can instead use the scalar math function to compute the reference
and avoid potential mistakes:

ASSERT (CEIL (Vector {1.2,...}) == Vector {ceil (1.2), ...});

This patch also removes some fflags checks, as they are already covered
by check-body.
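
The idea can be sketched in plain C as below. This is a minimal sketch
under stated assumptions, not the contents of test-math.h: the names
scalar_ceilf, init_from_scalar, apply_ceilf and run_case are hypothetical,
and scalar_ceilf is a libm-free stand-in that assumes finite inputs that
fit in long.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE 8

/* Hypothetical scalar reference, standing in for the scalar math
   function.  Assumes x is finite and fits in long.  */
static float
scalar_ceilf (float x)
{
  long t = (long) x;	/* Truncates toward zero.  */
  return ((float) t < x) ? (float) (t + 1) : (float) t;
}

/* Fill every input slot with VALUE and every reference slot with the
   scalar result computed from that same value, so no expected result
   is typed in by hand.  */
static void
init_from_scalar (float *in, float *ref, float value)
{
  for (size_t i = 0; i < ARRAY_SIZE; i++)
    {
      in[i] = value;
      ref[i] = scalar_ceilf (value);
    }
}

/* Stand-in for the loop that would be auto-vectorized.  */
static void
apply_ceilf (float *out, const float *in)
{
  for (size_t i = 0; i < ARRAY_SIZE; i++)
    out[i] = scalar_ceilf (in[i]);
}

/* Run one case and compare each output against its reference.  */
static void
run_case (float value)
{
  float in[ARRAY_SIZE], out[ARRAY_SIZE], ref[ARRAY_SIZE];

  init_from_scalar (in, ref, value);
  apply_ceilf (out, in);
  for (size_t i = 0; i < ARRAY_SIZE; i++)
    assert (out[i] == ref[i]);
}
```

In this sketch the loop and the reference share one helper, so the
comparison is trivially true. In the real tests the loop is compiled to
vector code while the reference comes from the scalar call, which is what
makes the comparison meaningful.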

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c:
Use scalar func as reference instead of hardcode.
* gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-floor-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-floor-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-rint-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-rint-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-round-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-round-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-run-2.c: Ditto.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/unop/math-ceil-run-1.c  | 18 +-
 .../riscv/rvv/autovec/unop/math-ceil-run-2.c  | 18 +-
 .../riscv/rvv/autovec/unop/math-floor-run-1.c | 18 +-
 .../riscv/rvv/autovec/unop/math-floor-run-2.c | 18 +-
 .../rvv/autovec/unop/math-nearbyint-run-1.c   | 33 ++-
 .../rvv/autovec/unop/math-nearbyint-run-2.c   | 33 ++-
 .../riscv/rvv/autovec/unop/math-rint-run-1.c  | 33 ++-
 .../riscv/rvv/autovec/unop/math-rint-run-2.c  | 33 ++-
 .../riscv/rvv/autovec/unop/math-round-run-1.c | 18 +-
 .../riscv/rvv/autovec/unop/math-round-run-2.c | 18 +-
 .../riscv/rvv/autovec/unop/math-trunc-run-1.c | 18 +-
 .../riscv/rvv/autovec/unop/math-trunc-run-2.c | 18 +-
 12 files changed, 140 insertions(+), 136 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c
index 88611e8268e..419a3def4df 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c
@@ -12,15 +12,15 @@ float ref[ARRAY_SIZE];
 TEST_UNARY_CALL (float, __builtin_ceilf)
 TEST_ASSERT (float)
 
-TEST_INIT (float, 1.2, 2.0, 1)
-TEST_INIT (float, -1.2, -1.0, 2)
-TEST_INIT (float, 3.0, 3.0, 3)
-TEST_INIT (float, 8388607.5, 8388608.0, 4)
-TEST_INIT (float, 8388609.0, 8388609.0, 5)
-TEST_INIT (float, 0.0, 0.0, 6)
-TEST_INIT (float, -0.0, -0.0, 7)
-TEST_INIT (float, -8388607.5, -8388607.0, 8)
-TEST_INIT (float, -8388608.0, -8388608.0, 9)
+TEST_INIT (float, 1.2, __builtin_ceilf (1.2), 1)
+TEST_INIT (float, -1.2, __builtin_ceilf (-1.2), 2)
+TEST_INIT (float, 3.0, __builtin_ceilf (3.0), 3)
+TEST_INIT (float, 8388607.5, __builtin_ceilf (8388607.5), 4)
+TEST_INIT (float, 8388609.0, __builtin_ceilf (8388609.0), 5)
+TEST_INIT (float, 0.0, __builtin_ceilf (0.0), 6)
+TEST_INIT (float, -0.0, __builtin_ceilf (-0.0), 7)
+TEST_INIT (float, -8388607.5, __builtin_ceilf (-8388607.5), 8)
+TEST_INIT (float, -8388608.0, __builtin_ceilf (-8388608.0), 9)
 
 int
 main ()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c
index bb4c86c3d12..2b29c8e4414 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c
@@ -12,15 +12,15 @@ double ref[ARRAY_SIZE];
 TEST_UNARY_CALL (double, __builtin_ceil)
 TEST_ASSERT (double)
 
-TEST_INIT (double, 1.2, 2.0, 1)
-TEST_INIT (double, -1.2, -1.0, 2)
-TEST_INIT (double, 3.0, 3.0, 3)
-TEST_INIT (double, 4503599627370495.5, 4503599627370496.0, 4)
-TEST_INIT (double, 4503599627370497.0, 4503599627370497.0, 5)
-TEST_INIT (double, 0.0, 0.0, 6)
-TEST_INIT (double, -0.0, -0.0, 7)
-TEST_INIT (double, -4503599627370495.5, -4503599627370495.0, 8)
-TEST_INIT (double, -4503599627370496.0, -4503599627370496.0, 9)
+TEST_INIT (double, 1.2, __builtin_ceil (1.2), 1)
+TEST_INIT (double, -1.2, __builtin_ceil (-1.2), 2)
+TEST_INIT (double, 3.0, __builtin_ceil (3.0), 3)
+TEST_INIT (double, 4503599627370495.5, __builtin_ceil (4503599627370495.5), 4)
+TEST_INIT (double, 4503599627370497.0, __builtin_ceil (4503599627370497.0), 5)
+TEST_INIT (double, 0.0, __builtin_ceil (0.0), 6)
+TEST_INIT (double, -0.0, __builtin_ceil (-0.0), 7)
+TEST_INIT (double, -4503599627370495.5, __builtin_ceil (-450

[PATCH v1] RISC-V: Remove the type size restriction of vectorizer

2023-10-17 Thread pan2 . li
From: Pan Li 

vectorizable_call has one restriction on the size of the data type:
DF to DI is allowed but SF to DI is not. You may see the message below
when trying to vectorize a function call like lrintf.

void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf (in[i]);
}

lrintf.c:5:26: missed: couldn't vectorize loop
lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern like lrintmn2 cannot work when the
data type sizes differ, as in SF => DI. This patch removes the data type
size check and unblocks standard names like lrintmn2.
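
To see why SF => DI trips a check on total vector size, consider the
arithmetic below. It is only a sketch: the helper names nunits and
vector_bits_for are hypothetical, and the 128-bit width is an example
value rather than anything taken from the patch.

```c
#include <stdint.h>

/* Number of elements of ELEM_BYTES bytes that fit in a vector of
   VECTOR_BITS bits.  */
static unsigned
nunits (unsigned vector_bits, unsigned elem_bytes)
{
  return vector_bits / (elem_bytes * 8);
}

/* Number of bits needed to hold UNITS elements of ELEM_BYTES bytes.  */
static unsigned
vector_bits_for (unsigned units, unsigned elem_bytes)
{
  return units * elem_bytes * 8;
}
```

A 128-bit vector holds four 4-byte floats, but four 8-byte integers need
256 bits. With the element count kept the same, the input and output
vector types therefore differ in size, which is the case the removed
TYPE_SIZE comparison rejected.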

Passed the x86 bootstrap and regression test already.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_call): Remove data size
check.

Signed-off-by: Pan Li 
---
 gcc/tree-vect-stmts.cc | 13 -
 1 file changed, 13 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index b3a56498595..326e000a71d 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -3529,19 +3529,6 @@ vectorizable_call (vec_info *vinfo,
 
   return false;
 }
-  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
- just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
- are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
- by a pack of the two vectors into an SI vector.  We would need
- separate code to handle direct VnDI->VnSI IFN_CTZs.  */
-  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
-{
-  if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"mismatched vector sizes %T and %T\n",
-vectype_in, vectype_out);
-  return false;
-}
 
   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
   != VECTOR_BOOLEAN_TYPE_P (vectype_in))
-- 
2.34.1



[PATCH v1] RISC-V: Bugfix for merging undefined tmp register in math

2023-10-22 Thread pan2 . li
From: Pan Li 

Math function autovec includes a step like:

rtx tmp = gen_reg_rtx (vec_int_mode);
emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);

MU (mask undisturbed) leaves the masked-off elements of tmp (the dest
register) unchanged, but tmp is undefined here. This patch changes
the MU to MA.
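
The difference between the two policies can be modelled in scalar C as
below. This is a sketch only: the function names are hypothetical, arrays
stand in for vector registers, and a plain cast stands in for the
frm-controlled conversion.

```c
#include <stddef.h>
#include <stdint.h>

/* Mask-undisturbed model: masked-off elements keep whatever dest held
   before, so the result depends on the old contents of dest.  */
static void
cvt_masked_mu (int32_t *dest, const float *src, const uint8_t *mask,
	       size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (mask[i])
      dest[i] = (int32_t) src[i];
}

/* Mask-agnostic model: masked-off elements may hold anything afterwards.
   This sketch writes all ones, one of the outcomes the policy permits,
   so the old contents of dest never matter.  */
static void
cvt_masked_ma (int32_t *dest, const float *src, const uint8_t *mask,
	       size_t n)
{
  for (size_t i = 0; i < n; i++)
    dest[i] = mask[i] ? (int32_t) src[i] : (int32_t) -1;
}
```

When dest is a freshly created temporary, the MU model merges in values
that were never defined, while the MA model has no such dependency.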

gcc/ChangeLog:

* config/riscv/riscv-protos.h (enum insn_type): Add new type
values.
* config/riscv/riscv-v.cc (emit_vec_cvt_x_f): Add undef merge
operand handling.
(expand_vec_ceil): Take MA instead of MU for tmp register.
(expand_vec_floor): Ditto.
(expand_vec_nearbyint): Ditto.
(expand_vec_rint): Ditto.
(expand_vec_round): Ditto.
(expand_vec_roundeven): Ditto.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-protos.h |  5 +
 gcc/config/riscv/riscv-v.cc | 24 
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index f7a9a02f1f9..5dc97c2adc0 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -306,6 +306,11 @@ enum insn_type : unsigned int
   UNARY_OP_FRM_RMM = UNARY_OP | FRM_RMM_P,
   UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P,
   UNARY_OP_FRM_RDN = UNARY_OP | FRM_RDN_P,
+  UNARY_OP_TAMA_FRM_DYN = UNARY_OP_TAMA | FRM_DYN_P,
+  UNARY_OP_TAMA_FRM_RUP = UNARY_OP_TAMA | FRM_RUP_P,
+  UNARY_OP_TAMA_FRM_RDN = UNARY_OP_TAMA | FRM_RDN_P,
+  UNARY_OP_TAMA_FRM_RMM = UNARY_OP_TAMA | FRM_RMM_P,
+  UNARY_OP_TAMA_FRM_RNE = UNARY_OP_TAMA | FRM_RNE_P,
   UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P,
   UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P,
   UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P,
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 383af55fe3a..91ad6a61fa8 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4108,10 +4108,18 @@ static void
 emit_vec_cvt_x_f (rtx op_dest, rtx op_src, rtx mask,
  insn_type type, machine_mode vec_mode)
 {
-  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
   insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, vec_mode);
 
-  emit_vlmax_insn (icode, type, cvt_x_ops);
+  if (type & USE_VUNDEF_MERGE_P)
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_src};
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
+  else
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
 }
 
 static void
@@ -4157,7 +4165,7 @@ expand_vec_ceil (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
 
   /* Step-3: Convert to integer on mask, with rounding up (aka ceil).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_RUP, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_RUP, vec_fp_mode);
 
   /* Step-4: Convert to floating-point on mask for the final result.
  To avoid unnecessary frm register access, we use RUP here and it will
@@ -4182,7 +4190,7 @@ expand_vec_floor (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
 
   /* Step-3: Convert to integer on mask, with rounding down (aka floor).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_RDN, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_RDN, vec_fp_mode);
 
   /* Step-4: Convert to floating-point on mask for the floor result.  */
   emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_RDN, vec_fp_mode);
@@ -4208,7 +4216,7 @@ expand_vec_nearbyint (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
 
   /* Step-4: Convert to integer on mask, with rounding down (aka nearbyint).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_DYN, vec_fp_mode);
 
   /* Step-5: Convert to floating-point on mask for the nearbyint result.  */
   emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
@@ -4233,7 +4241,7 @@ expand_vec_rint (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
 
   /* Step-3: Convert to integer on mask, with dyn rounding (aka rint).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_DYN, vec_fp_mode);
 
   /* Step-4: Convert to floating-point on mask for the rint result.  */
   emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
@@ -4255,7 +4263,7 @@ expand_vec_round (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
 
   /* Step-3: Convert to integer on mask, rounding to nearest (aka round).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_RMM, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_RMM, vec_fp_mode);
 
   /* Step-4: Convert to fl

[PATCH v1] RISC-V: Remove unnecessary asm check for rounding autovec

2023-10-22 Thread pan2 . li
From: Pan Li 

The vsetvl asm check is unnecessary for the rounding function autovec.
These rounding test cases should focus on the rounding insn sequence.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/bswap16-0.c: Remove the
vsetvl check.
* gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-floor-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-floor-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-floor-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-floor-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-irint-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-iround-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-llround-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lround-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lround-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-rint-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-rint-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-rint-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-rint-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-round-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-round-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-round-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-round-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-3.c: Ditto.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-0.c  | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c| 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c| 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c| 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c| 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-0.c   | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-1.c   | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-2.c   | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-floor-3.c   | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c   | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c  | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-0.c   | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-0.c  | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-0.c   | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-1.c   | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-0.c  | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-1.c  | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c  | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c  | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c   | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c   | 1 -
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrou

[PATCH v1] RISC-V: Remove unnecessary asm check for binop constraint

2023-10-22 Thread pan2 . li
From: Pan Li 

The vsetvl asm check is unnecessary for the binop constraint tests. We
should focus on the constraint and leave the vsetvl test to the
vsetvl pass.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/binop_vv_constraint-1.c: Remove the
vsetvl asm check from func body.
* gcc.target/riscv/rvv/base/binop_vx_constraint-1.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-10.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-11.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-12.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-129.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-13.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-130.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-131.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-133.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-134.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-135.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-14.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-15.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-153.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-154.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-155.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-158.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-16.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-17.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-171.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-172.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-173.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-174.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-18.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-19.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-2.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-20.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-21.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-22.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-23.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-24.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-25.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-26.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-27.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-28.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-29.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-3.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-30.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-31.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-32.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-33.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-34.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-35.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-36.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-37.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-38.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-39.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-4.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-40.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-41.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-42.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-43.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-44.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-5.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-6.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-7.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-8.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-9.c: Ditto.
* gcc.target/riscv/rvv/base/shift_vx_constraint-1.c: Ditto.
* gcc.target/riscv/rvv/base/ternop_vv_constraint-1.c: Ditto.
* gcc.target/riscv/rvv/base/ternop_vv_constraint-2.c: Ditto.
* gcc.target/riscv/rvv/base/ternop_vv_constraint-3.c: Ditto.
* gcc.target/riscv/rvv/base/ternop_vv_constraint-4.c: Ditto.
* gcc.target/riscv/rvv/base/ternop_vv_constraint-5.c: Ditto.
* gcc.target/riscv/rvv/base/ternop_vv_constraint-6.c: Ditto.
* gcc.target/riscv/rvv/base/ternop_vx_constraint-1.c: Ditto.
* gcc.target/riscv/rvv/base/ternop_vx_constraint-8.c: Ditto.
* gcc.target/riscv/rvv/base/ternop_vx_constraint-9.c: Ditto.
   

[PATCH v1] RISC-V: Bugfix for merging undef tmp register for trunc

2023-10-23 Thread pan2 . li
From: Pan Li 

For trunc function autovec, one step, shown below, takes MU (mask
undisturbed) for the merge operand.

rtx tmp = gen_reg_rtx (vec_int_mode);
emit_vec_cvt_x_f_rtz (tmp, op_1, mask, vec_fp_mode);

With MU, the masked-off elements of tmp (aka the dest register) are left
unchanged, but tmp is undefined here. This patch adjusts the MU to MA
(mask agnostic).
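
The difference between the two policies can be sketched with a scalar model.
This is an illustration only and not code from the patch: masked_cvt_rtz and
MaskPolicy are hypothetical names, and the agnostic branch picks the
"overwrite with all ones" behavior, which is one of the outcomes the RVV
specification permits for mask-agnostic elements.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Policy for destination elements whose mask bit is 0 (hypothetical enum).
enum class MaskPolicy { Undisturbed, Agnostic };

// Scalar model of a masked float -> int32 convert, rounding toward zero.
// `dest` plays the role of the destination register contents on entry.
std::vector<std::int32_t> masked_cvt_rtz(std::vector<std::int32_t> dest,
                                         const std::vector<float> &src,
                                         const std::vector<bool> &mask,
                                         MaskPolicy policy)
{
  for (std::size_t i = 0; i < src.size(); ++i)
    {
      if (mask[i])
        dest[i] = static_cast<std::int32_t>(src[i]); // cast truncates toward zero
      else if (policy == MaskPolicy::Agnostic)
        dest[i] = -1; // assumption: model "agnostic" as all ones
      // Undisturbed: dest[i] keeps whatever it held on entry, so an
      // undefined `dest` makes the result depend on undefined contents.
    }
  return dest;
}
```

With MU the inactive elements of the result are whatever tmp held before,
which is why a freshly created tmp is a problem; with MA the result no longer
depends on the old contents of tmp.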

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vec_cvt_x_f_rtz): Add insn type
arg.
(expand_vec_trunc): Take MA instead of MU for cvt_x_f_rtz.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 91ad6a61fa8..fb6a4e561db 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4144,12 +4144,20 @@ emit_vec_cvt_f_x (rtx op_dest, rtx op_src, rtx mask,
 
 static void
 emit_vec_cvt_x_f_rtz (rtx op_dest, rtx op_src, rtx mask,
- machine_mode vec_mode)
+ insn_type type, machine_mode vec_mode)
 {
-  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
   insn_code icode = code_for_pred (FIX, vec_mode);
 
-  emit_vlmax_insn (icode, UNARY_OP_TAMU, cvt_x_ops);
+  if (type & USE_VUNDEF_MERGE_P)
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_src};
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
+  else
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
 }
 
 void
@@ -4285,7 +4293,7 @@ expand_vec_trunc (rtx op_0, rtx op_1, machine_mode 
vec_fp_mode,
 
   /* Step-3: Convert to integer on mask, rounding to zero (aka truncate).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f_rtz (tmp, op_1, mask, vec_fp_mode);
+  emit_vec_cvt_x_f_rtz (tmp, op_1, mask, UNARY_OP_TAMA, vec_fp_mode);
 
   /* Step-4: Convert to floating-point on mask for the rint result.  */
   emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
-- 
2.34.1



[PATCH v1] RISC-V: Remove unnecessary asm check for vec cvt

2023-10-23 Thread pan2 . li
From: Pan Li 

The vsetvl asm check is unnecessary for the vector convert tests. We
should focus on the convert instructions and leave the vsetvl test to
the vsetvl pass.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/cvt-0.c: Remove the vsetvl
asm check from func body.
* gcc.target/riscv/rvv/autovec/unop/cvt-1.c: Ditto.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c | 3 +--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
index 762b1408994..7d66ed3e943 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
@@ -7,9 +7,8 @@
 /*
 ** test_int65_to_fp16:
 **   ...
-**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*mf2,\s*ta,\s*ma
 **   vfncvt\.f\.x\.w\s+v[0-9]+,\s*v[0-9]+
-**   vsetvli\s+zero,\s*zero,\s*e16,\s*mf4,\s*ta,\s*ma
+**   ...
 **   vfncvt\.f\.f\.w\s+v[0-9]+,\s*v[0-9]+
 **   ...
 */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
index 3180ba3612c..af08c51ef8b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
@@ -7,9 +7,8 @@
 /*
 ** test_uint65_to_fp16:
 **   ...
-**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*mf2,\s*ta,\s*ma
 **   vfncvt\.f\.xu\.w\s+v[0-9]+,\s*v[0-9]+
-**   vsetvli\s+zero,\s*zero,\s*e16,\s*mf4,\s*ta,\s*ma
+**   ...
 **   vfncvt\.f\.f\.w\s+v[0-9]+,\s*v[0-9]+
 **   ...
 */
-- 
2.34.1



[PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-25 Thread pan2 . li
From: Pan Li 

Update in v2:

* Fix one ICE of type assertion.
* Adjust some test cases for aarch64 sve and riscv vector.

Original log:

The vectorizable_call has one restriction on the size of the data type.
Aka DF to DI is allowed but SF to DI isn't. You may see the message below
when trying to vectorize a function call like lrintf.

void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf (in[i]);
}

lrintf.c:5:26: missed: couldn't vectorize loop
lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern like lrintmn2 cannot work when the
data type sizes differ, like SF => DI. This patch removes this data
type size check and unblocks standard names like lrintmn2.
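
To make the size mismatch concrete, here is a small sketch (an illustration,
not part of the patch). The lanes helper is hypothetical, and std::lrint
stands in for __builtin_lrintf. For a fixed vector size, a vector of 32-bit
floats holds twice as many elements as a vector of 64-bit integers, so an
SF => DI call cannot keep the input and output vectors the same size while
also keeping the element counts equal.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Number of elements of type T that fit in a vector of `vector_bytes` bytes.
// Hypothetical helper, only used to illustrate the lane-count mismatch.
template <typename T>
constexpr std::size_t lanes(std::size_t vector_bytes)
{
  return vector_bytes / sizeof(T);
}

// Scalar form of the loop from the log: float (SF) in, long out.
// On an LP64 target such as rv64, long is 64 bits wide (DI).
void test_lrintf(long *out, const float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = std::lrint(in[i]); // rounds using the current rounding mode
}
```

For a 16-byte vector, lanes<float> gives 4 while lanes<std::int64_t> gives 2,
which is the mismatch the removed check used to reject.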

The tests below pass with this patch.

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression tests.

gcc/ChangeLog:

* internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
* tree-vect-stmts.cc (vectorizable_call): Remove size check.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
* gcc.target/aarch64/sve/clz_1.c: Ditto.
* gcc.target/aarch64/sve/popcount_1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.

Signed-off-by: Pan Li 
---
 gcc/internal-fn.cc  |  3 ++-
 gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c  |  3 +--
 gcc/testsuite/gcc.target/aarch64/sve/clz_1.c|  3 +--
 gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
 .../gcc.target/riscv/rvv/autovec/unop/popcount.c|  2 +-
 gcc/tree-vect-stmts.cc  | 13 -
 6 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 61d5a9e4772..17c0f4c3805 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, 
unsigned int noutputs,
emit_move_insn (lhs_rtx, ops[0].value);
   else
{
- gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
+ gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+  || VECTOR_INTEGER_TYPE_P (TREE_TYPE (lhs)));
  convert_move (lhs_rtx, ops[0].value, 0);
}
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
index bdc9856faaf..940d08bbc7b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
@@ -18,5 +18,4 @@ clrsb_64 (unsigned int *restrict dst, uint64_t *restrict src, 
int size)
 }
 
 /* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.s, p[0-7]/m, 
z[0-9]+\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, 
z[0-9]+\.d\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, 
z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, 
z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
index 0c7a4e6d768..58b8ff406d2 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
@@ -18,5 +18,4 @@ clz_64 (unsigned int *restrict dst, uint64_t *restrict src, 
int size)
 }
 
 /* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.s, p[0-7]/m, 
z[0-9]+\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.d, p[0-7]/m, 
z[0-9]+\.d\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, 
z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.d, p[0-7]/m, 
z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
index dfb6f4ac7a5..0eba898307c 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
@@ -18,5 +18,4 @@ popcount_64 (unsigned int *restrict dst, uint64_t *restrict 
src, int size)
 }
 
 /* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.s, p[0-7]/m, 
z[0-9]+\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.d, p[0-7]/m, 
z[0-9]+\.d\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, 
z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.d, p[0-7]/m, 
z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c
index 585a522aa81..e6e3c70f927 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c
@@ -1461,4 +1461,4 @@ main ()
   RUN_ALL ()
 }
 
-/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 229 "vect" } } */
+/* { dg-final { scan-tr

[PATCH v1] RISC-V: Fix one range-loop-construct warning of avlprop

2023-10-28 Thread pan2 . li
From: Pan Li 

This patch would like to fix one warning of avlprop as below.

../../gcc/config/riscv/riscv-avlprop.cc: In member function 'virtual
unsigned int pass_avlprop::execute(function*)':
../../gcc/config/riscv/riscv-avlprop.cc:346:23: error: loop variable
'candidate' creates a copy from type 'const std::pair' [-Werror=range-loop-construct]
  346 |   for (const auto candidate : m_candidates)
  |   ^
../../gcc/config/riscv/riscv-avlprop.cc:346:23: note: use reference type
to prevent copying
  346 |   for (const auto candidate : m_candidates)
  |   ^
  |   &
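
The cost behind this warning can be shown with a small standalone sketch.
It is not code from riscv-avlprop.cc: Tracked, Candidate and both functions
are hypothetical names. The first function spells out the per-iteration copy
that a by-value loop variable makes implicitly; the second binds a const
reference, as the fix does.

```cpp
#include <utility>
#include <vector>

// Element type that counts how many times it is copy-constructed.
struct Tracked
{
  static int copies;
  int value;
  explicit Tracked(int v) : value(v) {}
  Tracked(const Tracked &other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

using Candidate = std::pair<int, Tracked>;

// Equivalent to `for (const auto candidate : candidates)`: every
// iteration copy-constructs a fresh pair before using it.
int sum_with_copies(const std::vector<Candidate> &candidates)
{
  int sum = 0;
  for (const auto &element : candidates)
    {
      const Candidate candidate(element); // explicit per-iteration copy
      sum += candidate.second.value;
    }
  return sum;
}

// Binding a const reference reads the element in place, with no copy.
int sum_by_reference(const std::vector<Candidate> &candidates)
{
  int sum = 0;
  for (const auto &candidate : candidates)
    sum += candidate.second.value;
  return sum;
}
```

Both functions return the same sum; only the number of copy constructions
differs.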

gcc/ChangeLog:

* config/riscv/riscv-avlprop.cc (pass_avlprop::execute): Use
reference type to prevent copying.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-avlprop.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-avlprop.cc 
b/gcc/config/riscv/riscv-avlprop.cc
index 2c79ec81806..c59eb7f6fa3 100644
--- a/gcc/config/riscv/riscv-avlprop.cc
+++ b/gcc/config/riscv/riscv-avlprop.cc
@@ -343,7 +343,7 @@ pass_avlprop::execute (function *fn)
 {
   fprintf (dump_file, "\nNumber of potential AVL propagations: %d\n",
   m_candidates.length ());
-  for (const auto candidate : m_candidates)
+  for (const auto &candidate : m_candidates)
{
  fprintf (dump_file, "\nAVL propagation type: %s\n",
   avlprop_type_to_str (candidate.first));
-- 
2.34.1



[PATCH v3] VECT: Refine the type size restriction of call vectorizer

2023-10-30 Thread pan2 . li
From: Pan Li 

Update in v3:

* Add a function to predicate whether the type size is legal for the
  call vectorizer.

Update in v2:

* Fix one ICE of type assertion.
* Adjust some test cases for aarch64 sve and riscv vector.

Original log:

The vectorizable_call has one restriction on the size of the data type.
Aka DF to DI is allowed but SF to DI isn't. You may see the message below
when trying to vectorize a function call like lrintf.

void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf (in[i]);
}

lrintf.c:5:26: missed: couldn't vectorize loop
lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern like lrintmn2 cannot work when the
data type sizes differ, like SF => DI. This patch refines this data
type size check and conditionally unblocks standard names like lrintmn2.

The type size of vectype_out needs to be exactly the same as the type
size of vectype_in when vectype_out does not participate in the optab
selection. There is no such restriction when vectype_out is part of the
optab query.
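
The rule above can be sketched as a standalone predicate. This is a
simplified model under stated assumptions, not the GCC implementation:
FnInfo is a hypothetical stand-in for the few fields of
direct_internal_fn_info that matter here, a null pointer stands for "no
direct internal function", and type sizes are plain bit counts.

```cpp
#include <cstddef>

// Hypothetical stand-in for the relevant direct_internal_fn_info fields.
struct FnInfo
{
  int type0;         // < 0: this optab operand takes its mode from vectype_out
  int type1;         // < 0: same, for the second optab operand
  bool vectorizable;
};

// Return true when the in/out type sizes are acceptable for the call
// vectorizer. `info` may be null when no direct internal function exists.
bool type_size_legal_p(const FnInfo *info, std::size_t size_out,
                       std::size_t size_in)
{
  const bool same_size_p = size_in == size_out;

  if (info == nullptr || !info->vectorizable)
    return same_size_p;

  // vectype_out takes part in the optab query, so the target pattern
  // decides whether the size combination is supported.
  if (info->type0 < 0 || info->type1 < 0)
    return true;

  return same_size_p;
}
```

A function whose optab is keyed on the output type, such as the lrint case
described above, passes with mismatched sizes, while a function keyed only on
the input type still requires equal sizes.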

The tests below pass with this patch.

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression tests.
* Ensure the lrintf standard name in risc-v.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_type_size_legal_p): New
func impl to predicate the type size is legal or not.
(vectorizable_call): Leverage vectorizable_type_size_legal_p.

Signed-off-by: Pan Li 
---
 gcc/tree-vect-stmts.cc | 51 +++---
 1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a9200767f67..24b3448d961 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1430,6 +1430,35 @@ vectorizable_internal_function (combined_fn cfn, tree 
fndecl,
   return IFN_LAST;
 }
 
+/* Return TRUE when the type size is legal for the call vectorizer,
+   or FALSE.
+   The type size of both the vectype_in and vectype_out should be
+   exactly the same when vectype_out isn't participating the optab.
+   While there is no restriction for type size when vectype_out
+   is part of the optab query.
+ */
+static bool
+vectorizable_type_size_legal_p (internal_fn ifn, tree vectype_out,
+   tree vectype_in)
+{
+  bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
+
+  if (ifn == IFN_LAST || !direct_internal_fn_p (ifn))
+return same_size_p;
+
+  const direct_internal_fn_info &difn_info = direct_internal_fn (ifn);
+
+  if (!difn_info.vectorizable)
+return same_size_p;
+
+  /* According to vectorizable_internal_function, the type0/1 < 0 indicates
+ the vectype_out participating the optable selection.  Aka the type size
+ check can be skipped here.  */
+  if (difn_info.type0 < 0 || difn_info.type1 < 0)
+return true;
+
+  return same_size_p;
+}
 
 static tree permute_vec_elements (vec_info *, tree, tree, tree, stmt_vec_info,
  gimple_stmt_iterator *);
@@ -3361,19 +3390,6 @@ vectorizable_call (vec_info *vinfo,
 
   return false;
 }
-  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
- just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
- are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
- by a pack of the two vectors into an SI vector.  We would need
- separate code to handle direct VnDI->VnSI IFN_CTZs.  */
-  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
-{
-  if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"mismatched vector sizes %T and %T\n",
-vectype_in, vectype_out);
-  return false;
-}
 
   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
   != VECTOR_BOOLEAN_TYPE_P (vectype_in))
@@ -3431,6 +3447,15 @@ vectorizable_call (vec_info *vinfo,
 ifn = vectorizable_internal_function (cfn, callee, vectype_out,
  vectype_in);
 
+  if (!vectorizable_type_size_legal_p (ifn, vectype_out, vectype_in))
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"mismatched vector sizes %T and %T\n",
+vectype_in, vectype_out);
+  return false;
+}
+
   /* If that fails, try asking for a target-specific built-in function.  */
   if (ifn == IFN_LAST)
 {
-- 
2.34.1



[PATCH v4] VECT: Refine the type size restriction of call vectorizer

2023-10-31 Thread pan2 . li
From: Pan Li 

Update in v4:

* Append the check to vectorizable_internal_function.

Update in v3:

* Add a function to predicate whether the type size is legal for the
  call vectorizer.

Update in v2:

* Fix one ICE of type assertion.
* Adjust some test cases for aarch64 sve and riscv vector.

Original log:

The vectorizable_call has one restriction on the size of the data type.
Aka DF to DI is allowed but SF to DI isn't. You may see the message below
when trying to vectorize a function call like lrintf.

void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf (in[i]);
}

lrintf.c:5:26: missed: couldn't vectorize loop
lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern like lrintmn2 cannot work when the
data type sizes differ, like SF => DI. This patch refines this data
type size check and conditionally unblocks standard names like lrintmn2.

The type size of vectype_out needs to be exactly the same as the type
size of vectype_in when vectype_out does not participate in the optab
selection. There is no such restriction when vectype_out is part of the
optab query.

The tests below pass with this patch.

* The risc-v regression tests.
* Ensure the lrintf standard name in risc-v.

The tests below are ongoing.

* The x86 bootstrap and regression test.
* The aarch64 regression test.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_internal_function): Add type
size check for when vectype_out doesn't participate in the optab query.
(vectorizable_call): Remove the type size check.

Signed-off-by: Pan Li 
---
 gcc/tree-vect-stmts.cc | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a9200767f67..799b4ab10c7 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1420,8 +1420,17 @@ vectorizable_internal_function (combined_fn cfn, tree 
fndecl,
   const direct_internal_fn_info &info = direct_internal_fn (ifn);
   if (info.vectorizable)
{
+ bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
  tree type0 = (info.type0 < 0 ? vectype_out : vectype_in);
  tree type1 = (info.type1 < 0 ? vectype_out : vectype_in);
+
+ /* The type size of both the vectype_in and vectype_out should be
+exactly the same when vectype_out isn't participating the optab.
+While there is no restriction for type size when vectype_out
+is part of the optab query.  */
+ if (type0 != vectype_out && type1 != vectype_out && !same_size_p)
+   return IFN_LAST;
+
  if (direct_internal_fn_supported_p (ifn, tree_pair (type0, type1),
  OPTIMIZE_FOR_SPEED))
return ifn;
@@ -3361,19 +3370,6 @@ vectorizable_call (vec_info *vinfo,
 
   return false;
 }
-  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
- just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
- are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
- by a pack of the two vectors into an SI vector.  We would need
- separate code to handle direct VnDI->VnSI IFN_CTZs.  */
-  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
-{
-  if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"mismatched vector sizes %T and %T\n",
-vectype_in, vectype_out);
-  return false;
-}
 
   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
   != VECTOR_BOOLEAN_TYPE_P (vectype_in))
-- 
2.34.1



[PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720]

2023-11-01 Thread pan2 . li
From: Pan Li 

The extract_low_bits function only tries the scalar mode if the bitsizes
of mode and src_mode are not equal. When a vector mode is given
from get_stored_val in DSE, it always fails and returns NULL_RTX.

This patch allows the vector mode in extract_low_bits
if and only if the size of mode is less than or equal to the size of
src_mode.
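
The size rule can be pictured with a conceptual model. This is a sketch
under assumptions, not what expmed.cc does: a vector value is modeled as a
byte array, low_bytes is a hypothetical helper, the low part is taken to be
the lowest-addressed bytes (as on a little-endian target), and the
modes_tieable_p requirement is ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of extracting the low part of a wider vector value.
// Returns false (the analogue of NULL_RTX) when the requested size is
// larger than the source; otherwise stores the low bytes in `out`.
bool low_bytes(const std::vector<std::uint8_t> &src, std::size_t dst_size,
               std::vector<std::uint8_t> &out)
{
  if (dst_size > src.size())
    return false; // destination wider than source: not allowed

  out.assign(src.begin(),
             src.begin() + static_cast<std::ptrdiff_t>(dst_size));
  return true;
}
```

Extracting 2 bytes from a 4-byte source succeeds and yields the first two
bytes; asking for 8 bytes from the same source is rejected.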

Given the example code below with --param=riscv-autovec-preference=fixed-vlmax:

vuint8m1_t test () {
  uint8_t arr[32] = {
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
  };

  return __riscv_vle8_v_u8m1(arr, 32);
}

Before this patch:

test:
  lui     a5,%hi(.LANCHOR0)
  addi    sp,sp,-32
  addi    a5,a5,%lo(.LANCHOR0)
  li      a3,32
  vl2re64.v   v2,0(a5)
  vsetvli zero,a3,e8,m1,ta,ma
  vs2r.v  v2,0(sp) <== Unnecessary store to stack
  vle8.v  v1,0(sp) <== Ditto
  vs1r.v  v1,0(a0)
  addi    sp,sp,32
  jr      ra

After this patch:

test:
  lui     a5,%hi(.LANCHOR0)
  addi    a5,a5,%lo(.LANCHOR0)
  li      a4,32
  addi    sp,sp,-32
  vsetvli zero,a4,e8,m1,ta,ma
  vle8.v  v1,0(a5)
  vs1r.v  v1,0(a0)
  addi    sp,sp,32
  jr      ra

The tests below pass with this patch:

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression test.

PR target/111720

gcc/ChangeLog:

* expmed.cc (extract_low_bits): Allow vector mode if the
mode size is less than or equal to src_mode.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
* gcc.target/riscv/rvv/base/pr111720-9.c: New test.

Signed-off-by: Pan Li 
---
 gcc/expmed.cc | 44 ---
 .../gcc.target/riscv/rvv/base/pr111720-0.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-1.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 
 .../gcc.target/riscv/rvv/base/pr111720-2.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-3.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-4.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-5.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-6.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-7.c| 21 +
 .../gcc.target/riscv/rvv/base/pr111720-8.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-9.c| 15 +++
 12 files changed, 227 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index b294eabb08d..5db83fe638c 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -2403,8 +2403,6 @@ extract_split_bit_field (rtx op0, opt_scalar_int_mode 
op0_mode,
 rtx
 extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src)
 {
-  scalar_int_mode int_mode, src_int_mode;
-
   if (mode == src_mode)
 return src;
 
@@ -2437,22 +2435,38 @@ extract_low_bits (machine_mode mode, machine_mode 
src_mode, rtx src)
 return x;
 }
 
-  if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
-  || !int_mode_for_mode (mode).exists (&int_mode))
-return NULL_RTX;
+  if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode))
+{
+  if (maybe_gt (GET_MODE_BITSIZE (mode), GET_MODE_BITSIZE (src_mode))
+   || !targetm.modes_tieable_p (mode, src_mode))
+   return NULL_RTX;
 
-  if (!targetm.modes_tieable_p (src_int_mode, src_mode))
-return NULL_RTX;
-  if (!targetm.modes_tieable_p (int_mode, mode))
-return NULL_RTX;
+  /* For vector mode,  only the bitsize (mode) <= bitsize (src_mode) and
+tieable is allowed here.  */
+  src = gen_lowpart (mode, src);
+}
+  else
+{

[PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

2023-11-02 Thread pan2 . li
From: Pan Li 

The previous rounding APIs that start with i/l/ll only work on the same
mode types, for example as below, and we arranged the iterator similar
to fcvt.

* SF => SI
* DF => DI

After we refined this limitation in the middle-end, these APIs can also
be vectorized with different type sizes, aka:

* HF => SI, HF => DI
* SF => DI, SF => SI
* DF => SI, DF => DI

The existing iterator cannot handle this simply, so this patch
re-arranges the iterator into two items.

* V_VLS_F_CONVERT_SI: handle (HF, SF, DF) => SI
* V_VLS_F_CONVERT_DI: handle (HF, SF, DF) => DI

The related mode_attr is updated as well to match the new iterators.

gcc/ChangeLog:

* config/riscv/autovec.md (lrint2): Remove.
(lround2): Ditto.
(lceil2): Ditto.
(lfloor2): Ditto.
(lrint2): New pattern for cvt from
FP to SI.
(lround2): Ditto.
(lceil2): Ditto.
(lfloor2): Ditto.
(lrint2): New pattern for cvt from
FP to DI.
(lround2): Ditto.
(lceil2): Ditto.
(lfloor2): Ditto.
* config/riscv/vector-iterators.md: Renew iterators for both
the SI and DI.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md  |  72 +++---
 gcc/config/riscv/vector-iterators.md | 199 ---
 2 files changed, 237 insertions(+), 34 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index f5e3e347ace..81acb1a815b 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2395,42 +2395,82 @@ (define_expand "roundeven2"
   }
 )
 
-(define_expand "lrint2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+(define_expand "lrint2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
 
-(define_expand "lround2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+(define_expand "lrint2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
 
-(define_expand "lceil2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+(define_expand "lround2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
 
-(define_expand "lfloor2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+(define_expand "lround2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
+
+(define_expand "lceil2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
+
+(define_expand "lceil2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
+
+(define_expand "lfloor2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
+
+(define_expand "lfloor2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index d9b5dec5edb..f2d9f60b631 100644
--- a/gcc/config/riscv/vector-iter

[PATCH v2] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

2023-11-02 Thread pan2 . li
From: Pan Li 

Update in v2:

* Add a mode size equality check to reject differing mode sizes at expand
  time, because the underlying codegen is not implemented yet.

Original log:

The previous rounding APIs starting with i/l/ll only work when the
source and destination modes have the same size, as shown below, and
the iterator was arranged similarly to fcvt.

* SF => SI
* DF => DI

After this limitation was lifted in the middle-end, these APIs can also
be vectorized with different type sizes, namely:

* HF => SI, HF => DI
* SF => DI, SF => SI
* DF => SI, DF => DI

The existing iterator cannot express this simply, so this patch
splits it into two iterators.

* V_VLS_F_CONVERT_SI: handle (HF, SF, DF) => SI
* V_VLS_F_CONVERT_DI: handle (HF, SF, DF) => DI

The related mode_attrs are updated to match the new iterators.
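
As a rough illustration of what such a mode_attr has to encode, the sketch
below computes the destination LMUL for a conversion that keeps the element
count unchanged: the LMUL scales with the ratio of destination to source
element width. The helper name dest_lmul_x8 and the LMUL-in-eighths encoding
are assumptions made for this example; they are not part of the patch, and
ELEN/MIN_VLEN constraints are not modelled.

```c
#include <assert.h>

/* LMUL encoded in eighths: MF8 = 1, MF4 = 2, MF2 = 4, M1 = 8, M2 = 16,
   M4 = 32, M8 = 64.  This encoding is an assumption of the sketch.  */

/* Hypothetical helper: return the LMUL (in eighths) of the integer vector
   that holds the same number of elements as the FP source vector, or 0
   when no register group of that size exists.  */
int
dest_lmul_x8 (int src_lmul_x8, int src_bits, int dst_bits)
{
  assert (src_lmul_x8 > 0 && src_bits > 0 && dst_bits > 0);

  int scaled = src_lmul_x8 * dst_bits;
  if (scaled % src_bits != 0)
    return 0;			/* Would need less than MF8.  */

  int lmul_x8 = scaled / src_bits;
  return (lmul_x8 >= 1 && lmul_x8 <= 64) ? lmul_x8 : 0;
}
```

For instance, dest_lmul_x8 (32, 16, 32) yields 64, which corresponds to
pairing RVVM4HF with RVVM8SI, while dest_lmul_x8 (64, 16, 32) yields 0, so an
M8 HF source has no SI counterpart under this rule.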

gcc/ChangeLog:

* config/riscv/autovec.md (lrint2): Remove.
(lround2): Ditto.
(lceil2): Ditto.
(lfloor2): Ditto.
(lrint2): New pattern for cvt from
FP to SI.
(lround2): Ditto.
(lceil2): Ditto.
(lfloor2): Ditto.
(lrint2): New pattern for cvt from
FP to DI.
(lround2): Ditto.
(lceil2): Ditto.
(lfloor2): Ditto.
* config/riscv/vector-iterators.md: Renew iterators for both
the SI and DI.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md  |  90 +---
 gcc/config/riscv/vector-iterators.md | 199 ---
 2 files changed, 251 insertions(+), 38 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index f5e3e347ace..cc4c9596bbf 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2395,42 +2395,92 @@ (define_expand "roundeven2"
   }
 )
 
-(define_expand "lrint2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+;; Add mode_size equal check as we opened the modes for different sizes.
+;; The check will be removed soon after related codegen implemented
+(define_expand "lrint2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE 
(mode))"
   {
-riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
 
-(define_expand "lround2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+(define_expand "lrint2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE 
(mode))"
   {
-riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
 
-(define_expand "lceil2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+(define_expand "lround2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE 
(mode))"
   {
-riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
 
-(define_expand "lfloor2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+(define_expand "lround2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE 
(mode))"
+  {
+riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
+
+(define_expand "lceil2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE 
(mode))"
+  {
+riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
+
+(define_expand "lceil2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE 
(mode))

[PATCH v1] RISC-V: Remove HF modes of FP to INT rounding autovec

2023-11-03 Thread pan2 . li
From: Pan Li 

The [i|l|ll][rint|round|ceil|floor] internal functions are
defined with DEF_INTERNAL_FLT_FN instead of DEF_INTERNAL_FLT_FLOATN_FN.
As a result, the *f16 (N=16 of FLOATN) variants of these functions are
not available when vectorizable_call looks up the ifn for the given
cfn. For example:

BUILT_IN_LRINTF16 => IFN_LAST (should be IFN_LRINT here)
BUILT_IN_RINTF16 => IFN_RINT

It is better to remove the FP16 related modes until the additional
middle-end support is ready. This patch removes the FP16 modes and
adds comments explaining why.

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Remove HF modes.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/vector-iterators.md | 59 +---
 1 file changed, 2 insertions(+), 57 deletions(-)

diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index f2d9f60b631..e80eaedc4b3 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -3221,20 +3221,15 @@ (define_mode_attr vnnconvert [
 ;; V_F2SI_CONVERT: (HF, SF, DF) => SI
 ;; V_F2DI_CONVERT: (HF, SF, DF) => DI
 ;;
+;; HF requires additional support from internal function, aka
+;; gcc/internal-fn.def, remove HF shortly until the middle-end is ready.
 (define_mode_attr V_F2SI_CONVERT [
-  (RVVM4HF "RVVM8SI") (RVVM2HF "RVVM4SI") (RVVM1HF "RVVM2SI")
-  (RVVMF2HF "RVVM1SI") (RVVMF4HF "RVVMF2SI")
-
   (RVVM8SF "RVVM8SI") (RVVM4SF "RVVM4SI") (RVVM2SF "RVVM2SI")
   (RVVM1SF "RVVM1SI") (RVVMF2SF "RVVMF2SI")
 
   (RVVM8DF "RVVM4SI") (RVVM4DF "RVVM2SI") (RVVM2DF "RVVM1SI")
   (RVVM1DF "RVVMF2SI")
 
-  (V1HF "V1SI") (V2HF "V2SI") (V4HF "V4SI") (V8HF "V8SI") (V16HF "V16SI")
-  (V32HF "V32SI") (V64HF "V64SI") (V128HF "V128SI") (V256HF "V256SI")
-  (V512HF "V512SI") (V1024HF "V1024SI")
-
   (V1SF "V1SI") (V2SF "V2SI") (V4SF "V4SI") (V8SF "V8SI") (V16SF "V16SI")
   (V32SF "V32SI") (V64SF "V64SI") (V128SF "V128SI") (V256SF "V256SI")
   (V512SF "V512SI") (V1024SF "V1024SI")
@@ -3245,19 +3240,12 @@ (define_mode_attr V_F2SI_CONVERT [
 ])
 
 (define_mode_attr v_f2si_convert [
-  (RVVM4HF "rvvm8si") (RVVM2HF "rvvm4si") (RVVM1HF "rvvm2si")
-  (RVVMF2HF "rvvm1si") (RVVMF4HF "rvvmf2si")
-
   (RVVM8SF "rvvm8si") (RVVM4SF "rvvm4si") (RVVM2SF "rvvm2si")
   (RVVM1SF "rvvm1si") (RVVMF2SF "rvvmf2si")
 
   (RVVM8DF "rvvm4si") (RVVM4DF "rvvm2si") (RVVM2DF "rvvm1si")
   (RVVM1DF "rvvmf2si")
 
-  (V1HF "v1si") (V2HF "v2si") (V4HF "v4si") (V8HF "v8si") (V16HF "v16si")
-  (V32HF "v32si") (V64HF "v64si") (V128HF "v128si") (V256HF "v256si")
-  (V512HF "v512si") (V1024HF "v1024si")
-
   (V1SF "v1si") (V2SF "v2si") (V4SF "v4si") (V8SF "v8si") (V16SF "v16si")
   (V32SF "v32si") (V64SF "v64si") (V128SF "v128si") (V256SF "v256si")
   (V512SF "v512si") (V1024SF "v1024si")
@@ -3268,9 +3256,6 @@ (define_mode_attr v_f2si_convert [
 ])
 
 (define_mode_iterator V_VLS_F_CONVERT_SI [
-  (RVVM4HF "TARGET_ZVFH") (RVVM2HF "TARGET_ZVFH") (RVVM1HF "TARGET_ZVFH")
-  (RVVMF2HF "TARGET_ZVFH") (RVVMF4HF "TARGET_ZVFH && TARGET_MIN_VLEN > 32")
-
   (RVVM8SF "TARGET_VECTOR_ELEN_FP_32") (RVVM4SF "TARGET_VECTOR_ELEN_FP_32")
   (RVVM2SF "TARGET_VECTOR_ELEN_FP_32") (RVVM1SF "TARGET_VECTOR_ELEN_FP_32")
   (RVVMF2SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
@@ -3280,18 +3265,6 @@ (define_mode_iterator V_VLS_F_CONVERT_SI [
   (RVVM2DF "TARGET_VECTOR_ELEN_FP_64")
   (RVVM1DF "TARGET_VECTOR_ELEN_FP_64")
 
-  (V1HF "riscv_vector::vls_mode_valid_p (V1HFmode) && TARGET_ZVFH")
-  (V2HF "riscv_vector::vls_mode_valid_p (V2HFmode) && TARGET_ZVFH")
-  (V4HF "riscv_vector::vls_mode_valid_p (V4HFmode) && TARGET_ZVFH")
-  (V8HF "riscv_vector::vls_mode_valid_p (V8HFmode) && TARGET_ZVFH")
-  (V16HF "riscv_vector::vls_mode_valid_p (V16HFmode) && TARGET_ZVFH")
-  (V32HF "riscv_vector::vls_mode_valid_p (V32HFmode) && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 64")
-  (V64HF "riscv_vector::vls_mode_valid_p (V64HFmode) && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 128")
-  (V128HF "riscv_vector::vls_mode_valid_p (V128HFmode) && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 256")
-  (V256HF "riscv_vector::vls_mode_valid_p (V256HFmode) && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 512")
-  (V512HF "riscv_vector::vls_mode_valid_p (V512HFmode) && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 1024")
-  (V1024HF "riscv_vector::vls_mode_valid_p (V1024HFmode) && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 2048")
-
   (V1SF "riscv_vector::vls_mode_valid_p (V1SFmode) && 
TARGET_VECTOR_ELEN_FP_32")
   (V2SF "riscv_vector::vls_mode_valid_p (V2SFmode) && 
TARGET_VECTOR_ELEN_FP_32")
   (V4SF "riscv_vector::vls_mode_valid_p (V4SFmode) && 
TARGET_VECTOR_ELEN_FP_32")
@@ -3317,19 +3290,12 @@ (define_mode_iterator V_VLS_F_CONVERT_SI [
 ])
 
 (define_mode_attr V_F2DI_CONVERT [
-  (RVVM2HF "RVVM8DI") (RVVM1HF "RVVM4DI") (RVVMF2HF "RVVM2DI")
-  (RVVMF4HF "RVVM1DI")
-
   (RVVM4SF "RVVM8DI") (RVVM2SF "RVVM4DI") (RVVM1SF "RVVM2DI")
   (RVVMF2SF "RVVM1DI")
 
   (RVVM8DF "RVVM8DI") (RVVM4DF "RVVM4DI") (RVVM2DF "RVVM2DI")
   (RVVM1DF "RVVM1DI")
 
-  (V1HF "V1DI") (V2HF "V2DI") (V

[PATCH v1] RISC-V: Support FP rint to i/l/ll diff size autovec

2023-11-05 Thread pan2 . li
From: Pan Li 

This patch supports auto-vectorization of the FP APIs below with
different type sizes.

+---------+-----------+----------+
| API     | RV64      | RV32     |
+---------+-----------+----------+
| irint   | DF => SI  | DF => SI |
| irintf  | -         | -        |
| lrint   | -         | DF => SI |
| lrintf  | SF => DI  | -        |
| llrint  | -         | -        |
| llrintf | SF => DI  | SF => DI |
+---------+-----------+----------+

Given below code:
void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
out[i] = __builtin_lrintf (in[i]);
}

Before this patch:
test_lrintf:
  beq a2,zero,.L8
  slli a5,a2,32
  srli a2,a5,30
  add a4,a1,a2
.L3:
  flw fa5,0(a1)
  addi a1,a1,4
  addi a0,a0,8
  fcvt.l.s a5,fa5,dyn
  sd  a5,-8(a0)
  bne a1,a4,.L3

After this patch:
test_lrintf:
  beq a2,zero,.L8
  slli a2,a2,32
  srli a2,a2,32
.L3:
  vsetvli a5,a2,e32,mf2,ta,ma
  vle32.v v2,0(a1)
  slli a3,a5,2
  slli a4,a5,3
  vfwcvt.x.f.v v1,v2
  sub a2,a2,a5
  vse64.v v1,0(a0)
  add a1,a1,a3
  add a0,a0,a4
  bne a2,zero,.L3
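
The vectorized loop above can be read as a strip-mined element-wise
conversion: each iteration picks a vector length, loads that many 4-byte
floats, widens them to 8-byte integers and stores them. The scalar C model
below is only a sketch of that shape. The names model_rint_i64,
model_lrintf_loop and MODEL_VLMAX are hypothetical, the min() stands in for
whatever vsetvli returns on real hardware, and round-to-nearest-even is
assumed as the dynamic rounding mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound on the number of elements per iteration.  */
enum { MODEL_VLMAX = 4 };

/* Round a float to int64_t with ties to even, without calling libm.
   Assumes x is finite and fits in int64_t.  */
int64_t
model_rint_i64 (float x)
{
  assert (x >= -9223372036854775808.0f && x < 9223372036854775808.0f);

  int64_t t = (int64_t) x;	/* Truncate toward zero.  */
  float frac = x - (float) t;	/* Exact: zero once x has no fraction.  */

  if (frac > 0.5f)
    return t + 1;
  if (frac < -0.5f)
    return t - 1;
  if (frac == 0.5f)
    return t + (t & 1);		/* Tie: move to the even neighbour.  */
  if (frac == -0.5f)
    return t - (t & 1);
  return t;
}

/* Scalar model of the strip-mined SF => DI loop.  */
void
model_lrintf_loop (int64_t *out, const float *in, size_t count)
{
  while (count != 0)
    {
      /* Models vsetvli: process at most MODEL_VLMAX elements.  */
      size_t vl = count < MODEL_VLMAX ? count : MODEL_VLMAX;

      /* Models vle32.v + vfwcvt.x.f.v + vse64.v.  */
      for (size_t i = 0; i < vl; i++)
	out[i] = model_rint_i64 (in[i]);

      in += vl;			/* Advances by vl * 4 bytes (slli a3,a5,2).  */
      out += vl;		/* Advances by vl * 8 bytes (slli a4,a5,3).  */
      count -= vl;
    }
}
```

The two pointer increments correspond to the two shift amounts in the
assembly, which is where the differing element sizes become visible.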

Unfortunately, the HF mode is not included because it requires
additional middle-end support in internal-fn.def.

gcc/ChangeLog:

* config/riscv/autovec.md: Remove the size check of lrint.
* config/riscv/riscv-v.cc (emit_vec_narrow_cvt_x_f): New help
emit func impl.
(emit_vec_widden_cvt_x_f): New help emit func impl.
(emit_vec_rounding_to_integer): New func impl to emit the
rounding from FP to integer.
(expand_vec_lrint): Leverage emit_vec_rounding_to_integer.
* config/riscv/vector.md: Take V_VLSF for vfncvt.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c:
* gcc.target/riscv/rvv/autovec/unop/math-irint-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-irintf-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llrintf-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llrintf-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrintf-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrintf-rv64-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-irint-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llrintf-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lrint-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lrintf-rv64-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   |  6 +-
 gcc/config/riscv/riscv-v.cc   | 46 +-
 gcc/config/riscv/vector.md|  2 +-
 .../riscv/rvv/autovec/unop/math-irint-1.c | 13 +++
 .../riscv/rvv/autovec/unop/math-irint-run-0.c | 92 +--
 .../rvv/autovec/unop/math-irintf-run-0.c  | 63 +
 .../riscv/rvv/autovec/unop/math-llrintf-0.c   | 13 +++
 .../rvv/autovec/unop/math-llrintf-run-0.c | 63 +
 .../rvv/autovec/unop/math-lrint-rv32-0.c  | 13 +++
 .../rvv/autovec/unop/math-lrint-rv32-run-0.c  | 63 +
 .../rvv/autovec/unop/math-lrintf-rv64-0.c | 13 +++
 .../rvv/autovec/unop/math-lrintf-rv64-run-0.c | 63 +
 .../riscv/rvv/autovec/vls/math-irint-1.c  | 30 ++
 .../riscv/rvv/autovec/vls/math-llrintf-0.c| 30 ++
 .../riscv/rvv/autovec/vls/math-lrint-rv32-0.c | 30 ++
 .../rvv/autovec/vls/math-lrintf-rv64-0.c  | 30 ++
 16 files changed, 514 insertions(+), 56 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irintf-run-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrintf-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrintf-run-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-run-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrintf-rv64-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lrintf-rv64-run-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-irint-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llrintf-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lrint-rv32-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lrintf-rv64-0.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index cc4c9596bbf..f1f0523d1de 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/a

[PATCH v1] RISC-V: Adjust FP rint round tests for RV32

2023-11-06 Thread pan2 . li
From: Pan Li 

The FP rint test cases need some additional adjustment of types and
data for RV32. This patch fixes that, which was missed by mistake in
the FP rint support patch.

Please note that math-llrintf-run-0.c triggers an ICE in the
vsetvl pass on RV32 only.

./riscv32-unknown-elf-gcc -march=rv32gcv -mabi=ilp32d \
  -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math \
  gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrintf-run-0.c \
  -o test.elf -lm

This produces an ICE similar to the one below; a Bugzilla report will be filed for it.

config/riscv/riscv-v.cc:4314
   65 | }
  | ^
0x1fa5223 riscv_vector::validate_change_or_fail(rtx_def*, rtx_def**, rtx_def*, bool)
        /home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-v.cc:4314
0x1fb1aa2 pre_vsetvl::remove_avl_operand()
        /home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3342
0x1fb18c1 pre_vsetvl::cleaup()
        /home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3308
0x1fb216d pass_vsetvl::lazy_vsetvl()
        /home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3480
0x1fb2214 pass_vsetvl::execute(function*)
        /home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3504

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c: Adjust
test cases.
* gcc.target/riscv/rvv/autovec/unop/math-llrintf-run-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-run-0.c: Ditto.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/unop/math-irint-run-0.c | 94 +-
 .../rvv/autovec/unop/math-llrintf-run-0.c | 98 ++-
 .../rvv/autovec/unop/math-lrint-rv32-run-0.c  | 88 -
 3 files changed, 141 insertions(+), 139 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c
index 43bc0849695..aae1d95c2b6 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c
@@ -5,59 +5,59 @@
 
 #define ARRAY_SIZE 128
 
-float in[ARRAY_SIZE];
-long out[ARRAY_SIZE];
-long ref[ARRAY_SIZE];
+double in[ARRAY_SIZE];
+int out[ARRAY_SIZE];
+int ref[ARRAY_SIZE];
 
-TEST_UNARY_CALL_CVT (float, long, __builtin_lrintf)
-TEST_ASSERT (long)
+TEST_UNARY_CALL_CVT (double, int, __builtin_irint)
+TEST_ASSERT (int)
 
-TEST_INIT_CVT (float, 1.2, long, __builtin_lrintf (1.2), 1)
-TEST_INIT_CVT (float, -1.2, long, __builtin_lrintf (-1.2), 2)
-TEST_INIT_CVT (float, 0.5, long, __builtin_lrintf (0.5), 3)
-TEST_INIT_CVT (float, -0.5, long, __builtin_lrintf (-0.5), 4)
-TEST_INIT_CVT (float, 0.1, long, __builtin_lrintf (0.1), 5)
-TEST_INIT_CVT (float, -0.1, long, __builtin_lrintf (-0.1), 6)
-TEST_INIT_CVT (float, 3.0, long, __builtin_lrintf (3.0), 7)
-TEST_INIT_CVT (float, -3.0, long, __builtin_lrintf (-3.0), 8)
-TEST_INIT_CVT (float, 4503599627370495.5, long, __builtin_lrintf 
(4503599627370495.5), 9)
-TEST_INIT_CVT (float, 4503599627370497.0, long, __builtin_lrintf 
(4503599627370497.0), 10)
-TEST_INIT_CVT (float, -4503599627370495.5, long, __builtin_lrintf 
(-4503599627370495.5), 11)
-TEST_INIT_CVT (float, -4503599627370496.0, long, __builtin_lrintf 
(-4503599627370496.0), 12)
-TEST_INIT_CVT (float, 0.0, long, __builtin_lrintf (-0.0), 13)
-TEST_INIT_CVT (float, -0.0, long, __builtin_lrintf (-0.0), 14)
-TEST_INIT_CVT (float, 9223372036854774784.0, long, __builtin_lrintf 
(9223372036854774784.0), 15)
-TEST_INIT_CVT (float, 9223372036854775808.0, long, __builtin_lrintf 
(9223372036854775808.0), 16)
-TEST_INIT_CVT (float, -9223372036854775808.0, long, __builtin_lrintf 
(-9223372036854775808.0), 17)
-TEST_INIT_CVT (float, -9223372036854777856.0, long, __builtin_lrintf 
(-9223372036854777856.0), 18)
-TEST_INIT_CVT (float, __builtin_inf (), long, __builtin_lrintf (__builtin_inf 
()), 19)
-TEST_INIT_CVT (float, -__builtin_inf (), long, __builtin_lrintf 
(-__builtin_inf ()), 20)
-TEST_INIT_CVT (float, __builtin_nan (""), long, 0x7fff, 21)
+TEST_INIT_CVT (double, 1.2, int, __builtin_irint (1.2), 1)
+TEST_INIT_CVT (double, -1.2, int, __builtin_irint (-1.2), 2)
+TEST_INIT_CVT (double, 0.5, int, __builtin_irint (0.5), 3)
+TEST_INIT_CVT (double, -0.5, int, __builtin_irint (-0.5), 4)
+TEST_INIT_CVT (double, 0.1, int, __builtin_irint (0.1), 5)
+TEST_INIT_CVT (double, -0.1, int, __builtin_irint (-0.1), 6)
+TEST_INIT_CVT (double, 3.0, int, __builtin_irint (3.0), 7)
+TEST_INIT_CVT (double, -3.0, int, __builtin_irint (-3.0), 8)
+TEST_INIT_CVT (double, 4503599627370495.5, int, __builtin_irint 
(4503599627370495.5), 9)
+TEST_INIT_CVT (double, 4503599627370497.0, int, __builtin_irint 
(4503599627370497.0), 10)
+TEST_INIT_CVT (double, -4503599

[PATCH v1] RISC-V: Support FP round to i/l/ll diff size autovec

2023-11-06 Thread pan2 . li
From: Pan Li 

This patch supports auto-vectorization of the FP APIs below with
different type sizes.

+----------+-----------+----------+
| API      | RV64      | RV32     |
+----------+-----------+----------+
| iround   | DF => SI  | DF => SI |
| iroundf  | -         | -        |
| lround   | -         | DF => SI |
| lroundf  | SF => DI  | -        |
| llround  | -         | -        |
| llroundf | SF => DI  | SF => DI |
+----------+-----------+----------+

Given below code:
void
test_lroundf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
out[i] = __builtin_lroundf (in[i]);
}

Before this patch:
.L3:
  flw  fa5,0(a1)
  addi a1,a1,4
  addi a0,a0,8
  fcvt.l.s a5,fa5,rmm
  sd   a5,-8(a0)
  bne  a4,a1,.L3

After this patch:
  fsrmi 4  // RMM rounding mode
  vsetivli zero,16,e32,m4,ta,ma
.L4:
  vle32.v  v4,0(a5)
  addi a5,a5,64
  vfwcvt.x.f.v v8,v4
  vse64.v  v8,0(a4)
  addi a4,a4,128
  bne  a3,a5,.L4
  andi a5,a2,15
  andi a4,a2,-16
  beq  a5,zero,.L16

Unfortunately, the HF mode is not included because it requires
additional middle-end support in internal-fn.def.

gcc/ChangeLog:

* config/riscv/autovec.md: Remove the size check of lround.
* config/riscv/riscv-v.cc (expand_vec_lround): Leverage
emit_vec_rounding_to_integer for round.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-iround-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-iround-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llroundf-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llroundf-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-rv32-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lroundf-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lroundf-rv64-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-iround-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llroundf-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lround-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lroundf-rv64-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   |  6 +-
 gcc/config/riscv/riscv-v.cc   |  8 +-
 .../riscv/rvv/autovec/unop/math-iround-1.c| 18 
 .../rvv/autovec/unop/math-iround-run-1.c  | 83 ++
 .../riscv/rvv/autovec/unop/math-llroundf-0.c  | 19 +
 .../rvv/autovec/unop/math-llroundf-run-0.c| 84 +++
 .../rvv/autovec/unop/math-lround-rv32-0.c | 18 
 .../rvv/autovec/unop/math-lround-rv32-run-0.c | 83 ++
 .../rvv/autovec/unop/math-lroundf-rv64-0.c| 18 
 .../autovec/unop/math-lroundf-rv64-run-0.c| 84 +++
 .../riscv/rvv/autovec/vls/math-iround-1.c | 27 ++
 .../riscv/rvv/autovec/vls/math-llroundf-0.c   | 27 ++
 .../rvv/autovec/vls/math-lround-rv32-0.c  | 27 ++
 .../rvv/autovec/vls/math-lroundf-rv64-0.c | 27 ++
 14 files changed, 520 insertions(+), 9 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iround-run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llroundf-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llroundf-run-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-rv32-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lround-rv32-run-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf-rv64-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf-rv64-run-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-iround-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llroundf-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lround-rv32-0.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lroundf-rv64-0.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index f1f0523d1de..d1804d82552 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2420,8 +2420,7 @@ (define_expand "lrint2"
 (define_expand "lround2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
-&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE 
(mode))"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
 riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
 DONE;
@@ -24

[PATCH v1] RISC-V: Cleanup unused code in riscv_v_adjust_bytesize [NFC]

2024-03-05 Thread pan2 . li
From: Pan Li 

Clean up mode_size related code which is no longer used. The tests below
pass with this patch.

* The full RVV regression tests.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_v_adjust_bytesize): Cleanup unused
mode_size related code.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 4 
 1 file changed, 4 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 56cd8d2c23f..691d967de29 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1527,10 +1527,6 @@ riscv_v_adjust_bytesize (machine_mode mode, int scale)
return BYTES_PER_RISCV_VECTOR;
 
   poly_int64 nunits = GET_MODE_NUNITS (mode);
-  poly_int64 mode_size = GET_MODE_SIZE (mode);
-
-  if (maybe_eq (mode_size, (uint16_t) -1))
-   mode_size = riscv_vector_chunks * scale;
 
   if (nunits.coeffs[0] > 8)
return exact_div (nunits, 8);
-- 
2.34.1



[PATCH v1] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-05 Thread pan2 . li
From: Pan Li 

This patch introduces a new GCC attribute for RVV.
The attribute is used to define fixed-length variants of
existing sizeless RVV types.

The attribute is valid if and only if -mrvv-vector-bits=zvl is given. Its
single argument must be an integer constant whose value is determined
by the LMUL and the vector register bits in zvl*b.  For example:

typedef vint32m2_t fixed_vint32m2_t __attribute__((riscv_rvv_vector_bits(128)));

The above type definition is valid when -march=rv64gc_zve64d_zvl64b
(aka 2(m2) * 64 = 128 for vint32m2_t), and reports an error similar to
the one below when -march=rv64gcv_zvl128b.

"error: invalid RVV vector size '128', expected size is '256' based on
LMUL of type and '-mrvv-vector-bits=zvl'"

For the vint*m*_t types, the operations below are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -

For the vfloat*m*_t types, the operations below are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, -

For the vbool*_t types, only the operations below are allowed. CMP and
ALU are excluded because these operations on vbool*_t are not
well defined currently.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.

The vint*x*m*_t tuple types are not supported in this patch,
which is consistent with clang.

This patch passed the testsuites below.
* The full riscv regression tests.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute):
New static func to take care of the RVV types decorated by
the attributes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc |  88 +-
 .../riscv/rvv/base/riscv_rvv_vector_bits-1.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-10.c |  53 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-11.c |  76 
 .../riscv/rvv/base/riscv_rvv_vector_bits-12.c |  14 +++
 .../riscv/rvv/base/riscv_rvv_vector_bits-2.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-3.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-4.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-5.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-6.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-7.c  |  76 
 .../riscv/rvv/base/riscv_rvv_vector_bits-8.c  |  75 
 .../riscv/rvv/base/riscv_rvv_vector_bits-9.c  |  76 
 .../riscv/rvv/base/riscv_rvv_vector_bits.h| 108 ++
 14 files changed, 600 insertions(+), 2 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 56cd8d2c23f..fdbaf1633ac 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/r

[PATCH v2] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-05 Thread pan2 . li
From: Pan Li 

Update in v2:
* Cleanup some unused code.
* Fix some typo of commit log.

Original log:

This patch introduces a new GCC attribute for RVV.
The attribute defines a fixed-length variant of an existing
sizeless RVV type.

The attribute is valid if and only if -mrvv-vector-bits=zvl is in effect.
Its single argument must be an integer constant whose value is determined
by the LMUL of the type and the vector register bits in zvl*b.  For example:

typedef vint32m2_t fixed_vint32m2_t __attribute__((riscv_rvv_vector_bits(128)));

The above typedef is valid with -march=rv64gc_zve64d_zvl64b
(2 (m2) * 64 = 128 for vint32m2_t), and reports an error with
-march=rv64gcv_zvl128b similar to the one below.

"error: invalid RVV vector size '128', expected size is '256' based on
LMUL of type and '-mrvv-vector-bits=zvl'"
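
A minimal sketch of the size rule described above, assuming the expected
size is simply LMUL * zvl bits as in the m2 example.  The helper names
are hypothetical and are not the functions added by this patch.

```c
#include <stdbool.h>

/* Hypothetical helper: expected size in bits of a fixed-length RVV type,
   modeled as LMUL * zvl bits (integer LMUL only).  */
static unsigned
expected_fixed_bits (unsigned lmul, unsigned zvl_bits)
{
  return lmul * zvl_bits;
}

/* True when the attribute argument matches the expected size.  */
static bool
fixed_bits_valid_p (unsigned attr_bits, unsigned lmul, unsigned zvl_bits)
{
  return attr_bits == expected_fixed_bits (lmul, zvl_bits);
}
```

With lmul = 2, the argument 128 is accepted for zvl64b and rejected for
zvl128b, where the expected size is 256.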

For the vint*m*_t types, the operations below are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -

For the vfloat*m*_t types, the operations below are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, -

For the vbool*_t types, only the operations below are allowed.  CMP and
ALU are excluded because those operations on vbool*_t are not
well defined currently.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.

The vint*x*m*_t tuple types are not supported in this patch,
which is compatible with clang.

This patch passed the testsuites below.
* The riscv full regression tests.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute):
New static func to take care of the RVV types decorated by
the attributes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc |  87 +-
 .../riscv/rvv/base/riscv_rvv_vector_bits-1.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-10.c |  53 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-11.c |  76 
 .../riscv/rvv/base/riscv_rvv_vector_bits-12.c |  14 +++
 .../riscv/rvv/base/riscv_rvv_vector_bits-2.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-3.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-4.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-5.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-6.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-7.c  |  76 
 .../riscv/rvv/base/riscv_rvv_vector_bits-8.c  |  75 
 .../riscv/rvv/base/riscv_rvv_vector_bits-9.c  |  76 
 .../riscv/rvv/base/riscv_rvv_vector_bits.h| 108 ++
 14 files changed, 599 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
i

[PATCH v1] VECT: Bugfix ICE for vectorizable_store when both len and mask

2024-03-07 Thread pan2 . li
From: Pan Li 

This patch fixes an ICE in vectorizable_store when both the loop_masks
and the loop_lens are present.  The ICE looks like below with
"-march=rv64gcv -O3".

during GIMPLE pass: vect
test.c: In function ‘d’:
test.c:6:6: internal compiler error: in vectorizable_store, at
tree-vect-stmts.cc:8691
6 | void d() {
  |  ^
0x37a6f2f vectorizable_store
.../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*,
_slp_tree*, _slp_instance*, vec*)
.../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
0x1db5dca vect_analyze_loop_operations
.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
0x1db885b vect_analyze_loop_2
.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
0x1dba029 vect_analyze_loop_1
.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
0x1e389d1 try_vectorize_loop_1
.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
0x1e38f3d try_vectorize_loop
.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
0x1e39230 execute
.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298

The masks and the lens cannot be enabled simultaneously when the loop is
using partial vectors.  Thus, we need to ensure one is disabled before we
record the other in check_load_store_for_partial_vectors.  For example,
when we try to record a loop len, we need to check that the loop mask
is disabled.
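
The mutual exclusion can be pictured with a toy model.  This is only a
sketch: the struct and function names are hypothetical stand-ins, not the
GCC data structures or the vect_record_loop_len/vect_record_loop_mask
functions themselves.

```c
#include <stdbool.h>

/* Toy stand-in for the partial-vector state of a loop.  */
struct toy_loop_state
{
  bool uses_masks;
  bool uses_lens;
};

/* Record a length only if no mask has been recorded so far.  */
static bool
toy_try_record_len (struct toy_loop_state *s)
{
  if (s->uses_masks)
    return false;
  s->uses_lens = true;
  return true;
}

/* Record a mask only if no length has been recorded so far.  */
static bool
toy_try_record_mask (struct toy_loop_state *s)
{
  if (s->uses_lens)
    return false;
  s->uses_masks = true;
  return true;
}
```

Whichever kind is recorded first wins, so the state never has both flags
set at the same time.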

The testsuites below passed for this patch:
* The x86 bootstrap tests.
* The x86 full regression tests.
* The aarch64 full regression tests.
* The riscv full regression tests.

PR target/114195

gcc/ChangeLog:

* tree-vect-stmts.cc (check_load_store_for_partial_vectors): Add
loop mask/len check before recording as they are mutually exclusive.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr114195-1.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/base/pr114195-1.c| 15 +++
 gcc/tree-vect-stmts.cc| 26 ++-
 2 files changed, 35 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
new file mode 100644
index 000..b0c9d5b81b8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
@@ -0,0 +1,15 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */
+
+long a, b;
+extern short c[];
+
+void d() {
+  for (int e = 0; e < 35; e += 2) {
+a = ({ a < 0 ? a : 0; });
+b = ({ b < 0 ? b : 0; });
+
+c[e] = 0;
+  }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 14a3ffb5f02..624947ed271 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1502,6 +1502,8 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype,
  gather_scatter_info *gs_info,
  tree scalar_mask)
 {
+  gcc_assert (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo));
+
   /* Invariant loads need no special support.  */
   if (memory_access_type == VMAT_INVARIANT)
 return;
@@ -1521,9 +1523,17 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype,
   internal_fn ifn
= (is_load ? vect_load_lanes_supported (vectype, group_size, true)
   : vect_store_lanes_supported (vectype, group_size, true));
-  if (ifn == IFN_MASK_LEN_LOAD_LANES || ifn == IFN_MASK_LEN_STORE_LANES)
+
+  /* When the loop_vinfo using partial vector,  we cannot enable both
+the fully mask and length simultaneously.  Thus, make sure the
+other one is disabled when record one of them.
+The same as other place for both the vect_record_loop_len and
+vect_record_loop_mask.  */
+  if ((ifn == IFN_MASK_LEN_LOAD_LANES || ifn == IFN_MASK_LEN_STORE_LANES)
+   && !LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
vect_record_loop_len (loop_vinfo, lens, nvectors, vectype, 1);
-  else if (ifn == IFN_MASK_LOAD_LANES || ifn == IFN_MASK_STORE_LANES)
+  else if ((ifn == IFN_MASK_LOAD_LANES || ifn == IFN_MASK_STORE_LANES)
+   && !LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype,
   scalar_mask);
   else
@@ -1549,12 +1559,14 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype,
   if (internal_gather_scatter_fn_supported_p (len_ifn, vectype,
  gs_info->memory_type,
  gs_info->offset_vectype,
- 

[PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and store are enabled

2024-03-09 Thread pan2 . li
From: Pan Li 

This patch fixes an ICE in vectorizable_store when both the
loop_masks and loop_lens are enabled.  The ICE looks like below when
building with "-march=rv64gcv -O3".

during GIMPLE pass: vect
test.c: In function ‘d’:
test.c:6:6: internal compiler error: in vectorizable_store, at
tree-vect-stmts.cc:8691
6 | void d() {
  |  ^
0x37a6f2f vectorizable_store
.../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*,
_slp_tree*, _slp_instance*, vec*)
.../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
0x1db5dca vect_analyze_loop_operations
.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
0x1db885b vect_analyze_loop_2
.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
0x1dba029 vect_analyze_loop_1
.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
0x1e389d1 try_vectorize_loop_1
.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
0x1e38f3d try_vectorize_loop
.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
0x1e39230 execute
.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298

There are two ways to reach the vectorizable LD/ST code: one is analysis
and the other is transform.  We cannot have both the lens and the masks
enabled during transform, but it is valid during analysis.  Given that
transform does not require cost_vec, we enable the assert only when
cost_vec is NULL.
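
A minimal sketch of the gated check, assuming simplified pointer
parameters.  The function name is hypothetical, and it returns a flag
instead of calling gcc_assert so that the condition can be exercised
outside GCC.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: parameter names mirror the patch, types are simplified.
   Returns true when the lens/masks state is acceptable at this point.  */
static bool
toy_lens_masks_state_ok (const void *loop_lens, const void *loop_masks,
                         const void *cost_vec)
{
  /* Analysis (cost_vec != NULL): both may still be present.  */
  if (cost_vec != NULL)
    return true;

  /* Transform (cost_vec == NULL): at most one may be in use.  */
  return loop_lens == NULL || loop_masks == NULL;
}
```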

The testsuites below passed for this patch:
* The x86 bootstrap tests.
* The x86 full regression tests.
* The aarch64 full regression tests.
* The riscv full regression tests.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Enable the assert
during transform process.
(vectorizable_load): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr114195-1.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/base/pr114195-1.c | 15 +++
 gcc/tree-vect-stmts.cc | 18 ++
 2 files changed, 29 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
new file mode 100644
index 000..a67b847112b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
@@ -0,0 +1,15 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */
+
+long a, b;
+extern short c[];
+
+void d() {
+  for (int e = 0; e < 35; e += 2) {
+a = ({ a < 0 ? a : 0; });
+b = ({ b < 0 ? b : 0; });
+
+c[e] = 0;
+  }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 14a3ffb5f02..e8617439a48 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8697,8 +8697,13 @@ vectorizable_store (vec_info *vinfo,
? &LOOP_VINFO_LENS (loop_vinfo)
: NULL);
 
-  /* Shouldn't go with length-based approach if fully masked.  */
-  gcc_assert (!loop_lens || !loop_masks);
+  /* The vect_transform_stmt and vect_analyze_stmt will go here but there
+ are some difference here.  We cannot enable both the lens and masks
+ during transform but it is allowed during analysis.
+ Shouldn't go with length-based approach if fully masked.  */
+  if (cost_vec == NULL)
+/* The cost_vec is NULL during transform.  */
+gcc_assert ((!loop_lens || !loop_masks));
 
   /* Targets with store-lane instructions must not require explicit
  realignment.  vect_supportable_dr_alignment always returns either
@@ -10577,8 +10582,13 @@ vectorizable_load (vec_info *vinfo,
? &LOOP_VINFO_LENS (loop_vinfo)
: NULL);
 
-  /* Shouldn't go with length-based approach if fully masked.  */
-  gcc_assert (!loop_lens || !loop_masks);
+  /* The vect_transform_stmt and vect_analyze_stmt will go here but there
+ are some difference here.  We cannot enable both the lens and masks
+ during transform but it is allowed during analysis.
+ Shouldn't go with length-based approach if fully masked.  */
+  if (cost_vec == NULL)
+/* The cost_vec is NULL during transform.  */
+gcc_assert ((!loop_lens || !loop_masks));
 
   /* Targets with store-lane instructions must not require explicit
  realignment.  vect_supportable_dr_alignment always returns either
-- 
2.34.1



[PATCH v3] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-11 Thread pan2 . li
From: Pan Li 

Update in v3:
* Add pre-defined __riscv_v_fixed_vlen when zvl.

Update in v2:
* Cleanup some unused code.
* Fix some typo of commit log.

Original log:

This patch introduces a new GCC attribute for RVV.
The attribute defines a fixed-length variant of an existing
sizeless RVV type.

The attribute is valid if and only if -mrvv-vector-bits=zvl is in effect.
Its single argument must be an integer constant whose value is determined
by the LMUL of the type and the vector register bits in zvl*b.  For example:

typedef vint32m2_t fixed_vint32m2_t __attribute__((riscv_rvv_vector_bits(128)));

The above typedef is valid with -march=rv64gc_zve64d_zvl64b
(2 (m2) * 64 = 128 for vint32m2_t), and reports an error with
-march=rv64gcv_zvl128b similar to the one below.

"error: invalid RVV vector size '128', expected size is '256' based on
LMUL of type and '-mrvv-vector-bits=zvl'"

Meanwhile, a predefined macro __riscv_v_fixed_vlen is introduced to
represent the fixed vlen of an RVV vector register.
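
A hedged usage sketch of the macro together with the attribute.  It
assumes __riscv_v_fixed_vlen is only defined when building for RISC-V
with -mrvv-vector-bits=zvl, and that an m1 type takes the macro value
directly as its size.  The fallback branch and the names
TOY_FIXED_VLEN_BITS and toy_fixed_vlen_bits are hypothetical and exist
only so the snippet also builds elsewhere.

```c
/* Assumption: the macro is absent unless -mrvv-vector-bits=zvl is used
   on a RISC-V target with the vector extension enabled.  */
#ifdef __riscv_v_fixed_vlen
#include <riscv_vector.h>

/* m1 type: LMUL is 1, so the size equals the fixed vlen.  */
typedef vint32m1_t fixed_vint32m1_t
  __attribute__ ((riscv_rvv_vector_bits (__riscv_v_fixed_vlen)));

#define TOY_FIXED_VLEN_BITS __riscv_v_fixed_vlen
#else
/* Hypothetical fallback for builds without a fixed vlen.  */
#define TOY_FIXED_VLEN_BITS 0
#endif

/* Report the fixed vlen in bits, or 0 when it is not available.  */
static unsigned
toy_fixed_vlen_bits (void)
{
  return TOY_FIXED_VLEN_BITS;
}
```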

For the vint*m*_t types, the operations below are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -

For the vfloat*m*_t types, the operations below are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, -

For the vbool*_t types, only the operations below are allowed.  CMP and
ALU are excluded because those operations on vbool*_t are not
well defined currently.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.

The vint*x*m*_t tuple types are not supported in this patch,
which is compatible with clang.

This patch passed the testsuites below.
* The riscv full regression tests.

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add predefined
macro __riscv_v_fixed_vlen when zvl.
* config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute):
New static func to take care of the RVV types decorated by
the attributes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-13.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-14.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-15.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-16.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-17.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-c.cc   |   3 +
 gcc/config/riscv/riscv.cc |  87 +-
 .../riscv/rvv/base/riscv_rvv_vector_bits-1.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-10.c |  53 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-11.c |  76 
 .../riscv/rvv/base/riscv_rvv_vector_bits-12.c |  14 +++
 .../riscv/rvv/base/riscv_rvv_vector_bits-13.c |  10 ++
 .../riscv/rvv/base/riscv_rvv_vector_bits-14.c |  10 ++
 .../riscv/rvv/base/riscv_rvv_vector_bits-15.c |  10 ++
 .../riscv/rvv/base/riscv_rvv_vector_bits-16.c |  11 ++
 .../riscv/rvv/base/riscv_rvv_vector_bits-17.c |  10 ++
 .../riscv/rvv/base/riscv_rvv_vector_bits-2.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-3.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-4.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-5.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-6.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-7.c  |  76 
 .../riscv/rvv/base/riscv_rvv_vector_bits-8.c  |  75 
 .../riscv/rvv/base/riscv_rvv_vector_bits-9.c  |  76 
 .../riscv/rvv/base/riscv_rvv_vector_bits.h| 108 ++
 20 files changed, 653 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c
 create mode 1006

[PATCH v1] RISC-V: Fix some code style issue(s) in riscv-c.cc [NFC]

2024-03-12 Thread pan2 . li
From: Pan Li 

Noticed some code style issues when adding __riscv_v_fixed_vlen, including:

* Meaningless empty line.
* Line longer than 80 chars.
* Indent with 3 spaces.
* Misaligned arguments.

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_ext_version_value): Fix
line longer than 80 chars.
(riscv_cpu_cpp_builtins): Fix useless empty line, indent
with 3 spaces and misaligned arguments.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-c.cc | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 3755ec0b8ef..7029ba88186 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -37,7 +37,8 @@ along with GCC; see the file COPYING3.  If not see
 static int
 riscv_ext_version_value (unsigned major, unsigned minor)
 {
-  return (major * RISCV_MAJOR_VERSION_BASE) + (minor * RISCV_MINOR_VERSION_BASE);
+  return (major * RISCV_MAJOR_VERSION_BASE)
++ (minor * RISCV_MINOR_VERSION_BASE);
 }
 
 /* Implement TARGET_CPU_CPP_BUILTINS.  */
@@ -110,7 +111,6 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
 case CM_MEDANY:
   builtin_define ("__riscv_cmodel_medany");
   break;
-
 }
 
   if (riscv_user_wants_strict_align)
@@ -142,9 +142,9 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
 riscv_ext_version_value (0, 12));
 }
 
-   if (TARGET_XTHEADVECTOR)
- builtin_define_with_int_value ("__riscv_th_v_intrinsic",
-riscv_ext_version_value (0, 11));
+  if (TARGET_XTHEADVECTOR)
+builtin_define_with_int_value ("__riscv_th_v_intrinsic",
+  riscv_ext_version_value (0, 11));
 
   /* Define architecture extension test macros.  */
   builtin_define_with_int_value ("__riscv_arch_test", 1);
-- 
2.34.1



[PATCH v1] RISC-V: Bugfix ICE for __attribute__((target("arch=+v"))

2024-03-17 Thread pan2 . li
From: Pan Li 

This patch fixes an ICE for __attribute__((target("arch=+v")))
and similar extensions.  Given the sample code below:

void __attribute__((target("arch=+v")))
test_2 (int *a, int *b, int *out, unsigned count)
{
  unsigned i;
  for (i = 0; i < count; i++)
   out[i] = a[i] + b[i];
}

It triggers an ICE when built with -march=rv64gc -O3.

test.c: In function ‘test_2’:
test.c:4:1: internal compiler error: Floating point exception
4 | {
  | ^
0x1a5891b crash_signal
.../__RISC-V_BUILD__/../gcc/toplev.cc:319
0x7f0a7884251f ???
./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x1f51ba4 riscv_hard_regno_nregs
.../__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:8143
0x1967bb9 init_reg_modes_target()
.../__RISC-V_BUILD__/../gcc/reginfo.cc:471
0x13fc029 init_emit_regs()
.../__RISC-V_BUILD__/../gcc/emit-rtl.cc:6237
0x1a5b83d target_reinit()
.../__RISC-V_BUILD__/../gcc/toplev.cc:1936
0x35e374d save_target_globals()
.../__RISC-V_BUILD__/../gcc/target-globals.cc:92
0x35e381f save_target_globals_default_opts()
.../__RISC-V_BUILD__/../gcc/target-globals.cc:122
0x1f544cc riscv_save_restore_target_globals(tree_node*)
.../__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:9138
0x1f55c36 riscv_set_current_function
...

There are two reasons for this ICE.
1. The implied extension(s) of v are not handled, so TARGET_MIN_VLEN
   stays 0 because it is not reinitialized.  Then size / TARGET_MIN_VLEN
   divides by zero.
2. The machine modes of the vector types vary after the v extension
   is introduced.
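
The first failure mode can be illustrated with a toy computation.  This
is a sketch only: the function name and the explicit guard are
hypothetical.  The patch itself does not add such a guard; it makes sure
the minimum vlen is initialized before it is used.

```c
/* Toy model of a register-count computation that divides by the minimum
   vlen.  Returns 0 when min_vlen has not been initialized.  */
static unsigned
toy_vector_nregs (unsigned size_bits, unsigned min_vlen)
{
  /* Without this guard, size_bits / 0 is undefined behavior; on x86-64
     hosts it typically raises SIGFPE, which is reported as
     "Floating point exception".  */
  if (min_vlen == 0)
    return 0;

  /* Round up to a whole number of registers.  */
  return (size_bits + min_vlen - 1) / min_vlen;
}
```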

This patch passed the testsuite below:
1. The riscv full regression test.

PR target/114352

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc
(riscv_subset_list::parse_single_ext): Add implied, combine
and conflict check after parsing a single extension.
* config/riscv/riscv.cc (riscv_set_current_function):
Reinit the machine modes when setting the current function.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr114352-1.c: New test.
* gcc.target/riscv/rvv/base/pr114352-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/common/config/riscv/riscv-common.cc   | 33 ---
 gcc/config/riscv/riscv.cc |  4 ++
 .../gcc.target/riscv/rvv/base/pr114352-1.c| 58 +++
 .../gcc.target/riscv/rvv/base/pr114352-2.c| 27 +
 4 files changed, 115 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-2.c

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 48efef40dfd..d32bf147eca 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1375,20 +1375,39 @@ riscv_subset_list::parse_single_multiletter_ext (const char *p,
 const char *
 riscv_subset_list::parse_single_ext (const char *p, bool exact_single_p)
 {
+  const char *end_of_ext;
+
   switch (p[0])
 {
 case 'x':
-  return parse_single_multiletter_ext (p, "x", "non-standard extension",
-  exact_single_p);
+  end_of_ext = parse_single_multiletter_ext (p, "x",
+"non-standard extension",
+exact_single_p);
+  break;
 case 'z':
-  return parse_single_multiletter_ext (p, "z", "sub-extension",
-  exact_single_p);
+  end_of_ext = parse_single_multiletter_ext (p, "z", "sub-extension",
+exact_single_p);
+  break;
 case 's':
-  return parse_single_multiletter_ext (p, "s", "supervisor extension",
-  exact_single_p);
+  end_of_ext = parse_single_multiletter_ext (p, "s", "supervisor extension",
+exact_single_p);
+  break;
 default:
-  return parse_single_std_ext (p, exact_single_p);
+  end_of_ext = parse_single_std_ext (p, exact_single_p);
+  break;
 }
+
+  /* Make sure the implied or combined extension is included after add
+ a new std extension to subset list.  For example as below,
+
+ void __attribute__((target("arch=+v"))) func () with -march=rv64gc.
+
+ The implied zvl128b and zve64d of the std v should be included.  */
+  handle_implied_ext (p);
+  handle_combine_ext ();
+  check_conflict_ext ();
+
+  return end_of_ext;
 }
 
 /* Parsing arch string to subset list, return NULL if parsing failed.  */
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 680c4a728e9..89acb94af10 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -9474,6 +9474,10 @@ riscv_set_current_function (tree decl)
   cl_target_option_restore (&global_op

[PATCH v1] RISC-V: Bugfix function target attribute pollution

2024-03-19 Thread pan2 . li
From: Pan Li 

This patch depends on below ICE fix.

https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647915.html

The function target attribute should apply on a per-function basis.
For example, we have 4 functions as below:

void test_1 () {}

void __attribute__((target("arch=+v"))) test_2 () {}

void __attribute__((target("arch=+zfh"))) test_3 () {}

void test_4 () {}

The scope of the target attribute should not extend beyond the function
body.  That is, test_3 cannot have the 'v' extension, and test_4 can have
neither the 'v' nor the 'zfh' extension.

Unfortunately, test_4 is currently able to leverage both the 'v' and
the 'zfh' extensions, which is incorrect.  This patch fixes the
sticky attribute by introducing the command-line subset_list.
In parse_arch, we always clone from the cmdline_subset_list instead
of the current_subset_list.
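
The difference between the two cloning sources can be sketched with a
toy model in which extensions are bits in a mask.  All names here are
hypothetical and only illustrate the idea; they are not the
riscv_subset_list API.

```c
/* Toy extension bits.  */
enum { TOY_EXT_V = 1, TOY_EXT_ZFH = 2 };

static unsigned toy_cmdline_exts; /* Set once from -march.  */
static unsigned toy_current_exts; /* Updated per function.  */

/* Model the command line being parsed.  */
static void
toy_set_cmdline (unsigned exts)
{
  toy_cmdline_exts = exts;
  toy_current_exts = exts;
}

/* Old behaviour: start from whatever the previous function left behind.  */
static unsigned
toy_parse_arch_sticky (unsigned added)
{
  toy_current_exts |= added;
  return toy_current_exts;
}

/* Patched behaviour: always start from the command-line set.  */
static unsigned
toy_parse_arch_from_cmdline (unsigned added)
{
  toy_current_exts = toy_cmdline_exts | added;
  return toy_current_exts;
}
```

Running the test_2, test_3, test_4 sequence through the sticky variant
leaves test_4 with both bits set, while the command-line variant leaves
it with none.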

Meanwhile, we correct the printed arch information, like below.

.option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zbb1p0

The riscv_declare_function_name hook always runs after the
riscv_process_target_attr hook.  Thus, we introduce a hash_map to record
the 1:1 mapping from fndecl to its subset_list in advance.  Later,
riscv_declare_function_name is able to get the right information
about the arch.

The tests below passed for this patch:
* The riscv full regression test.
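
The record-then-consume flow can be sketched with a small fixed-size
table keyed by the declaration pointer.  This is a toy stand-in under
stated assumptions: the patch uses GCC's hash_table, whereas the linear
table, its size and the toy_* names below are hypothetical.

```c
#include <stddef.h>

#define TOY_MAX_FUNCS 16

/* One entry per function: the declaration and its arch string.  */
struct toy_func_target
{
  const void *fn_decl;
  const char *arch;
};

static struct toy_func_target toy_table[TOY_MAX_FUNCS];
static size_t toy_table_len;

/* Record (or update) the arch string of FN_DECL.  Returns 0 if full.  */
static int
toy_func_target_put (const void *fn_decl, const char *arch)
{
  for (size_t i = 0; i < toy_table_len; i++)
    if (toy_table[i].fn_decl == fn_decl)
      {
        toy_table[i].arch = arch;
        return 1;
      }

  if (toy_table_len == TOY_MAX_FUNCS)
    return 0;

  toy_table[toy_table_len].fn_decl = fn_decl;
  toy_table[toy_table_len].arch = arch;
  toy_table_len++;
  return 1;
}

/* Look up the arch string of FN_DECL, or NULL if none was recorded.  */
static const char *
toy_func_target_get (const void *fn_decl)
{
  for (size_t i = 0; i < toy_table_len; i++)
    if (toy_table[i].fn_decl == fn_decl)
      return toy_table[i].arch;
  return NULL;
}
```

The attribute-processing step would call the put function, and the later
name-declaring step would call the get function with the same key.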

PR target/114352

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (struct riscv_func_target_info):
New struct for func decl and target name.
(struct riscv_func_target_hasher): New hasher for hash table mapping
from the fn_decl to fn_target_name.
(riscv_func_decl_hash): New func to compute the hash for fn_decl.
(riscv_func_target_hasher::hash): New func to impl hash interface.
(riscv_func_target_hasher::equal): New func to impl equal interface.
(riscv_cmdline_subset_list): New static var for cmdline subset list.
(riscv_func_target_table_lazy_init): New func to lazy init the func
target hash table.
(riscv_func_target_get): New func to get target name from hash table.
(riscv_func_target_put): New func to put target name into hash table.
(riscv_func_target_remove_and_destory): New func to remove target
info from the hash table and destroy it.
(riscv_parse_arch_string): Set the static var cmdline_subset_list.
* config/riscv/riscv-subset.h (riscv_cmdline_subset_list): New static
var for cmdline subset list.
(riscv_func_target_get): New func decl.
(riscv_func_target_put): Ditto.
(riscv_func_target_remove_and_destory): Ditto.
* config/riscv/riscv-target-attr.cc 
(riscv_target_attr_parser::parse_arch):
Take cmdline_subset_list instead of current_subset_list when clone.
(riscv_process_target_attr): Record the func target info to hash table.
(riscv_option_valid_attribute_p): Add new arg tree fndecl.
* config/riscv/riscv.cc (riscv_declare_function_name): Consume the
func target info and print the arch message.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr114352-3.c: New test.

Signed-off-by: Pan Li 
---
 gcc/common/config/riscv/riscv-common.cc   | 105 +++-
 gcc/config/riscv/riscv-subset.h   |   4 +
 gcc/config/riscv/riscv-target-attr.cc |  18 ++-
 gcc/config/riscv/riscv.cc |   7 +-
 .../gcc.target/riscv/rvv/base/pr114352-3.c| 113 ++
 5 files changed, 240 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-3.c

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index d32bf147eca..76ec9bf846c 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -425,11 +425,108 @@ bool riscv_subset_list::parse_failed = false;
 
 static riscv_subset_list *current_subset_list = NULL;
 
+static riscv_subset_list *cmdline_subset_list = NULL;
+
+struct riscv_func_target_info
+{
+  tree fn_decl;
+  std::string fn_target_name;
+
+  riscv_func_target_info (const tree &decl, const std::string &target_name)
+: fn_decl (decl), fn_target_name (target_name)
+  {
+  }
+};
+
+struct riscv_func_target_hasher : nofree_ptr_hash<riscv_func_target_info>
+{
+  typedef tree compare_type;
+
+  static hashval_t hash (value_type);
+  static bool equal (value_type, const compare_type &);
+};
+
+static hash_table<riscv_func_target_hasher> *func_target_table = NULL;
+
+static inline hashval_t riscv_func_decl_hash (tree fn_decl)
+{
+  inchash::hash h;
+
+  h.add_ptr (fn_decl);
+
+  return h.end ();
+}
+
+inline hashval_t
+riscv_func_target_hasher::hash (value_type value)
+{
+  return riscv_func_decl_hash (value->fn_decl);
+}
+
+inline bool
+riscv_func_target_hasher::equal (value_type value, const compare_type &key)
+{
+  return value->fn_decl == key;
+}
+
 const riscv_subset_list *riscv_current_subset_list ()
 {
   return c

[PATCH v2] RISC-V: Bugfix ICE for __attribute__((target("arch=+v"))

2024-03-21 Thread pan2 . li
From: Pan Li 

This patch fixes an ICE for __attribute__((target("arch=+v")))
and similar extensions.  Given the sample code below:

void __attribute__((target("arch=+v")))
test_2 (int *a, int *b, int *out, unsigned count)
{
  unsigned i;
  for (i = 0; i < count; i++)
   out[i] = a[i] + b[i];
}

It triggers an ICE when built with -march=rv64gc -O3.

test.c: In function ‘test_2’:
test.c:4:1: internal compiler error: Floating point exception
4 | {
  | ^
0x1a5891b crash_signal
.../__RISC-V_BUILD__/../gcc/toplev.cc:319
0x7f0a7884251f ???
./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x1f51ba4 riscv_hard_regno_nregs
.../__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:8143
0x1967bb9 init_reg_modes_target()
.../__RISC-V_BUILD__/../gcc/reginfo.cc:471
0x13fc029 init_emit_regs()
.../__RISC-V_BUILD__/../gcc/emit-rtl.cc:6237
0x1a5b83d target_reinit()
.../__RISC-V_BUILD__/../gcc/toplev.cc:1936
0x35e374d save_target_globals()
.../__RISC-V_BUILD__/../gcc/target-globals.cc:92
0x35e381f save_target_globals_default_opts()
.../__RISC-V_BUILD__/../gcc/target-globals.cc:122
0x1f544cc riscv_save_restore_target_globals(tree_node*)
.../__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:9138
0x1f55c36 riscv_set_current_function
...

There are two reasons for this ICE.
1. The implied extension(s) of v are not handled, so TARGET_MIN_VLEN
   stays 0 because it is not reinitialized.  Then size / TARGET_MIN_VLEN
   divides by zero.
2. The machine modes of the vector types vary after the v extension
   is introduced.

This patch passed the testsuite below:
1. The riscv full regression test.
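
A minimal sketch of why finalizing the implied extensions matters,
assuming a toy bitmask model.  Only the implication named in this patch
(v implies zve64d and zvl128b) is modeled, and the toy_* names are
hypothetical rather than the riscv_subset_list interface.

```c
/* Toy extension bits.  */
enum { TOY_V = 1, TOY_ZVE64D = 2, TOY_ZVL128B = 4 };

/* Add the extensions implied by those already present.  */
static unsigned
toy_finalize_exts (unsigned exts)
{
  if (exts & TOY_V)
    exts |= TOY_ZVE64D | TOY_ZVL128B;
  return exts;
}

/* Minimum vlen derived from the extension set; 0 when no zvl*b
   extension is present.  */
static unsigned
toy_min_vlen (unsigned exts)
{
  return (exts & TOY_ZVL128B) ? 128 : 0;
}
```

Without the finalize step, adding only the v bit leaves the minimum vlen
at 0; after it, the minimum vlen is 128.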

PR target/114352

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_subset_list::parse):
Replace the implied, combine and conflict handling with func finalize.
(riscv_subset_list::finalize): New func impl to take care of
implied, combine ext and related checks.
* config/riscv/riscv-subset.h: Add func decl for finalize.
* config/riscv/riscv-target-attr.cc 
(riscv_target_attr_parser::parse_arch):
Finalize the extensions before returning success.
* config/riscv/riscv.cc (riscv_set_current_function): Reinit the
machine modes when setting the current function.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr114352-1.c: New test.
* gcc.target/riscv/rvv/base/pr114352-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/common/config/riscv/riscv-common.cc   | 31 ++
 gcc/config/riscv/riscv-subset.h   |  2 +
 gcc/config/riscv/riscv-target-attr.cc |  2 +
 gcc/config/riscv/riscv.cc |  4 ++
 .../gcc.target/riscv/rvv/base/pr114352-1.c| 58 +++
 .../gcc.target/riscv/rvv/base/pr114352-2.c| 27 +
 6 files changed, 114 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-2.c

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 440127a2af0..15d44245b3c 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1428,16 +1428,7 @@ riscv_subset_list::parse (const char *arch, location_t loc)
   if (p == NULL)
 goto fail;
 
-  for (itr = subset_list->m_head; itr != NULL; itr = itr->next)
-{
-  subset_list->handle_implied_ext (itr->name.c_str ());
-}
-
-  /* Make sure all implied extensions are included. */
-  gcc_assert (subset_list->check_implied_ext ());
-
-  subset_list->handle_combine_ext ();
-  subset_list->check_conflict_ext ();
+  subset_list->finalize ();
 
   return subset_list;
 
@@ -1467,6 +1458,26 @@ riscv_subset_list::set_loc (location_t loc)
   m_loc = loc;
 }
 
+/* Make sure the implied or combined extension is included after add
+   a new std extension to subset list or likewise.  For exmaple as below,
+
+   void __attribute__((target("arch=+v"))) func () with -march=rv64gc.
+
+   The implied zvl128b and zve64d of the std v should be included.  */
+void
+riscv_subset_list::finalize ()
+{
+  riscv_subset_t *subset;
+
+  for (subset = m_head; subset != NULL; subset = subset->next)
+handle_implied_ext (subset->name.c_str ());
+
+  gcc_assert (check_implied_ext ());
+
+  handle_combine_ext ();
+  check_conflict_ext ();
+}
+
 /* Return the current arch string.  */
 
 std::string
diff --git a/gcc/config/riscv/riscv-subset.h b/gcc/config/riscv/riscv-subset.h
index ae849e2a302..ec979040e8c 100644
--- a/gcc/config/riscv/riscv-subset.h
+++ b/gcc/config/riscv/riscv-subset.h
@@ -105,6 +105,8 @@ public:
   int match_score (riscv_subset_list *) const;
 
   void set_loc (location_t);
+
+  void finalize ();
 };
 
 extern const riscv_subset_list *riscv_current_subset_list (void);
diff --git a/gcc/config/riscv/riscv-target-attr.cc b/gcc/config/riscv/riscv-target-attr.cc
index 

[PATCH v4] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-21 Thread pan2 . li
From: Pan Li 

This patch would like to introduce one new gcc attribute for RVV.
This attribute is used to define fixed-length variants of the
existing sizeless RVV types.

This attribute is valid if and only if -mrvv-vector-bits=zvl is given.
Its only argument must be an integer constant whose value is determined
by the LMUL of the type and the vector register bits in zvl*b.  For example:

typedef vint32m2_t fixed_vint32m2_t __attribute__((riscv_rvv_vector_bits(128)));

The above type definition is valid with -march=rv64gc_zve64d_zvl64b
(i.e. 2 (m2) * 64 = 128 for vint32m2_t), and reports an error with
-march=rv64gcv_zvl128b similar to the one below.

"error: invalid RVV vector size '128', expected size is '256' based on
LMUL of type and '-mrvv-vector-bits=zvl'"
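
The size rule above can be sketched in plain C as below.  This is only an
illustration under stated assumptions: the helper names
expected_rvv_vector_bits and rvv_vector_bits_valid_p are hypothetical (they
do not come from the patch), and only integer LMUL (m1, m2, m4, m8) is
modelled.

```c
/* Hypothetical helpers for illustration only; not part of the patch.  */

/* Expected attribute argument: the zvl*b register bits scaled by an
   integer LMUL.  */
unsigned
expected_rvv_vector_bits (unsigned zvl_bits, unsigned lmul)
{
  return zvl_bits * lmul;
}

/* Nonzero when ATTR_BITS matches the size implied by ZVL_BITS and LMUL.  */
int
rvv_vector_bits_valid_p (unsigned attr_bits, unsigned zvl_bits, unsigned lmul)
{
  return attr_bits == expected_rvv_vector_bits (zvl_bits, lmul);
}
```

With these helpers, the m2 type from the example gives 128 for zvl64b and
256 for zvl128b, which matches the expected size in the error message.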

Meanwhile, a predefined macro __riscv_v_fixed_vlen is introduced to
represent the fixed vlen of an RVV vector register.

For the vint*m*_t types, the operations below are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -

The CMP returns vint*m*_t, the same as aarch64 sve.  For example:
typedef vint32m1_t fixed_vint32m1_t __attribute__((riscv_rvv_vector_bits(128)));
fixed_vint32m1_t less_than (fixed_vint32m1_t a, fixed_vint32m1_t b)
{
  return a < b;
}

For the vfloat*m*_t types, the operations below are allowed.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.
* CMP: >, <, ==, !=, <=, >=
* ALU: +, -, *, /, -

The CMP returns vfloat*m*_t, the same as aarch64 sve.  For example:
typedef vfloat32m1_t fixed_vfloat32m1_t __attribute__((riscv_rvv_vector_bits(128)));
fixed_vfloat32m1_t less_than (fixed_vfloat32m1_t a, fixed_vfloat32m1_t b)
{
  return a < b;
}

For the vbool*_t types, only the operations below are allowed.  CMP and
ALU are excluded because these operations on vbool*_t are not well
defined currently.
* The sizeof.
* The global variable(s).
* The element of union and struct.
* The cast to other equalities.

The vint*x*m*_t tuple types are not supported in this patch, which is
compatible with clang.

This patch passed the below test suites.
* The riscv full regression tests.

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add pre-define
macro __riscv_v_fixed_vlen when zvl.
* config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute):
New static func to take care of the RVV types decorated by
the attributes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-13.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-14.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-15.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-16.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-17.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-18.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test.
* gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-c.cc   |   3 +
 gcc/config/riscv/riscv.cc |  87 +-
 .../riscv/rvv/base/riscv_rvv_vector_bits-1.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-10.c |  53 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-11.c |  76 
 .../riscv/rvv/base/riscv_rvv_vector_bits-12.c |  14 +++
 .../riscv/rvv/base/riscv_rvv_vector_bits-13.c |  10 ++
 .../riscv/rvv/base/riscv_rvv_vector_bits-14.c |  10 ++
 .../riscv/rvv/base/riscv_rvv_vector_bits-15.c |  10 ++
 .../riscv/rvv/base/riscv_rvv_vector_bits-16.c |  11 ++
 .../riscv/rvv/base/riscv_rvv_vector_bits-17.c |  10 ++
 .../riscv/rvv/base/riscv_rvv_vector_bits-18.c |  45 
 .../riscv/rvv/base/riscv_rvv_vector_bits-2.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-3.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-4.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-5.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-6.c  |   6 +
 .../riscv/rvv/base/riscv_rvv_vector_bits-7

[PATCH v1] RISC-V: Allow RVV intrinsic when function target("arch=+v")

2024-03-25 Thread pan2 . li
From: Pan Li 

This patch would like to allow the RVV intrinsics when a function is
attributed with target("arch=+v") and built with rv64gc.  For example:

vint32m1_t
__attribute__((target("arch=+v")))
test_1 (vint32m1_t a, vint32m1_t b, size_t vl)
{
  return __riscv_vadd_vv_i32m1 (a, b, vl);
}

When built with -march=rv64gc -mabi=lp64d -O3, we will have asm like below:
test_1:
  .option push
  .option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_\
zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0
  vsetvli zero,a0,e32,m1,ta,ma
  vadd.vv v8,v8,v9
  ret

The riscv_vector.h must be included when leveraging intrinsic type(s) and
API(s).  And the scope of this attribute should not exceed the function
body.  Meanwhile, to make the rvv types and API(s) available for this
attribute, including riscv_vector.h will not report an error for now if v
is not present in march.

The below tests are passed for this patch:
* The riscv full regression test.

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_pragma_intrinsic): Remove error
when V is disabled and init the RVV types and intrinic APIs.
* config/riscv/riscv-vector-builtins.cc (expand_builtin): Report
error if V ext is disabled.
* config/riscv/riscv.cc (riscv_return_value_is_vector_type_p):
Ditto.
(riscv_arguments_is_vector_type_p): Ditto.
(riscv_vector_cc_function_p): Ditto.
* config/riscv/riscv_vector.h: Remove error if V is disable.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pragma-1.c: Remove.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-1.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-2.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-4.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-5.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-6.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-c.cc   | 18 +++
 gcc/config/riscv/riscv-vector-builtins.cc |  5 
 gcc/config/riscv/riscv.cc | 30 ---
 gcc/config/riscv/riscv_vector.h   |  4 ---
 .../gcc.target/riscv/rvv/base/pragma-1.c  |  4 ---
 .../target_attribute_v_with_intrinsic-1.c |  5 
 .../target_attribute_v_with_intrinsic-2.c | 18 +++
 .../target_attribute_v_with_intrinsic-3.c | 13 
 .../target_attribute_v_with_intrinsic-4.c | 10 +++
 .../target_attribute_v_with_intrinsic-5.c | 12 
 .../target_attribute_v_with_intrinsic-6.c | 12 
 .../target_attribute_v_with_intrinsic-7.c |  9 ++
 .../target_attribute_v_with_intrinsic-8.c | 23 ++
 13 files changed, 145 insertions(+), 18 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pragma-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index edb866d51e4..01314037461 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -201,14 +201,20 @@ riscv_pragma_intrinsic (cpp_reader *)
   if (strcmp (name, "vector") == 0
   || strcmp (name, "xtheadvector") == 0)
 {
-  if (!TARGET_VECTOR)
+  if (TARGET_VECTOR)
+   riscv_vector::handle_pragma_vector ();
+  else /* Indicates riscv_vector.h is included but v is missing in arch  */
{
- error ("%<#pragma riscv intrinsic%> option %qs needs 'V' or "
-"'XTHEADVECTOR' extension enabled",
-name);
- return;
+ /* To make the the rvv types and intrinsic API available for the
+target("arch=+v") attribute,  we need to temporally enable the
+TARGET_VECTOR, and disable it after all initialized.  */
+ target_flags |= MASK_VECTOR;
+
+ riscv_vector::init_builtins ();

[PATCH v1] RISC-V: Allow RVV intrinsic for more function target

2024-03-26 Thread pan2 . li
From: Pan Li 

Previously, we allowed target("arch=+v") for a function built with
rv64gc.  This patch would like to support more arch options as
below:
* zve32x
* zve32f
* zve64x
* zve64f
* zve64d
* zvfhmin
* zvfh

For example, we have sample code as below.
vfloat32m1_t
__attribute__((target("arch=+zve64f")))
test_9 (vfloat32m1_t a, vfloat32m1_t b, size_t vl)
{
  return __riscv_vfadd_vv_f32m1 (a, b, vl);
}

It will generate the asm code below when built with -O3 -march=rv64gc:
test_9:
vsetvli zero,a0,e32,m1,ta,ma
vfadd.vv v8,v8,v9
ret

Meanwhile, this patch introduces more error handling for the target
attribute.  Using arch=+zve32x with vfloat32m1_t gives the error message
"'vfloat32m1_t' requires the zve32f, zve64f or zve64d ISA extension".
And using arch=+zve32f with vfloat16m1_t gives the error message
"'vfloat16m1_t' requires the zvfhmin or zvfh ISA extension".

The below tests are passed for this patch:
* The riscv full regression test.
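
The error handling for float vector types described above can be sketched
as a small standalone check.  This is a rough model under assumptions, not
the code in riscv_validate_vector_type: the enum rvv_ext_flag and the helper
missing_float_ext are hypothetical names, implied extensions are not
modelled, and only the two cases quoted in this log (32-bit and 16-bit float
elements) are covered.

```c
#include <stddef.h>

/* Hypothetical extension flags; the real patch works on the GCC target
   flags instead.  */
enum rvv_ext_flag
{
  EXT_ZVE32F  = 1 << 0,
  EXT_ZVE64F  = 1 << 1,
  EXT_ZVE64D  = 1 << 2,
  EXT_ZVFHMIN = 1 << 3,
  EXT_ZVFH    = 1 << 4
};

/* Return the extension list named in the error message for a float vector
   element of ELEM_BITS bits, or NULL when FLAGS already enables one of
   them (or the case is not modelled here).  */
const char *
missing_float_ext (unsigned elem_bits, unsigned flags)
{
  if (elem_bits == 32
      && !(flags & (EXT_ZVE32F | EXT_ZVE64F | EXT_ZVE64D)))
    return "zve32f, zve64f or zve64d";

  if (elem_bits == 16 && !(flags & (EXT_ZVFHMIN | EXT_ZVFH)))
    return "zvfhmin or zvfh";

  return NULL;
}
```

For instance, a 32-bit float element with no float extension enabled (the
arch=+zve32x case) yields the first list, and a 16-bit float element with
only EXT_ZVE32F set yields the second.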

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_pragma_intrinsic): Add INT and
FP vector element flags, invoke override option and mode adjust.
* config/riscv/riscv-protos.h (riscv_option_override): New extern
func decl.
* config/riscv/riscv-vector-builtins.cc (expand_builtin): Return
target rtx after error_at.
* config/riscv/riscv.cc (riscv_vector_int_type_p): New predicate
func to tell one tree type is integer or not.
(riscv_vector_float_type_p): New predicate func to tell one tree
type is float or not.
(riscv_vector_element_bitsize): New func to get the element bitsize
of a vector tree type.
(riscv_validate_vector_type): New func to validate the tree type
is valid on flags.
(riscv_return_value_is_vector_type_p): Leverage the func
riscv_validate_vector_type to do the tree type validation.
(riscv_arguments_is_vector_type_p): Ditto.
(riscv_override_options_internal): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-10.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-11.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-12.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-13.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-14.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-15.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-16.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-17.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-18.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-19.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-20.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-21.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-22.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-23.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-24.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-25.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-26.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-27.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-28.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-29.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-9.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-c.cc   |  30 +-
 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-vector-builtins.cc |   7 +-
 gcc/config/riscv/riscv.cc | 101 --
 .../target_attribute_v_with_intrinsic-10.c|  12 +++
 .../target_attribute_v_with_intrinsic-11.c|  26 +
 .../target_attribute_v_with_intrinsic-12.c|  33 ++
 .../target_attribute_v_with_intrinsic-13.c|  33 ++
 .../target_attribute_v_with_intrinsic-14.c|  40 +++
 .../target_attribute_v_with_intrinsic-15.c|  47 
 .../target_attribute_v_with_intrinsic-16.c|  12 +++
 .../target_attribute_v_with_intrinsic-17.c|  13 +++
 .../target_attribute_v_with_intrinsic-18.c|  13 +++
 .../target_attribute_v_with_intrinsic-19.c|  13 +++
 .../target_attribute_v_with_intrinsic-20.c|  13 +++
 .../target_attribute_v_with_intrinsic-21.c|  13 +++
 .../target_attribute_v_with_intrinsic-22.c|  13 +++
 .../target_attribute_v_with_intrinsic-23.c|  13 +++
 .../target_attribute_v_with_intrinsic-24.c  

[PATCH] RISC-V: Fix misspelled term builtin in error message

2024-03-30 Thread pan2 . li
From: Pan Li 

This patch would like to fix the below misspelled term in the error message.

../../gcc/config/riscv/riscv-vector-builtins.cc:4592:16: error:
misspelled term 'builtin function' in format; use 'built-in function' instead [-Werror=format-diag]
 4592 |   "builtin function %qE requires the V ISA extension", exp);

The below tests are passed for this patch.
* The riscv regression test on rvv.exp and riscv.exp.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc (expand_builtin): Take
the term built-in over builtin.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c:
Adjust test dg-error.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c:
Ditto.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-vector-builtins.cc   | 2 +-
 .../riscv/rvv/base/target_attribute_v_with_intrinsic-7.c| 2 +-
 .../riscv/rvv/base/target_attribute_v_with_intrinsic-8.c| 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index e07373d8b57..db9246eed2d 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4589,7 +4589,7 @@ expand_builtin (unsigned int code, tree exp, rtx target)
 
   if (!TARGET_VECTOR)
 error_at (EXPR_LOCATION (exp),
- "builtin function %qE requires the V ISA extension", exp);
+ "built-in function %qE requires the V ISA extension", exp);
 
   return function_expander (rfn.instance, rfn.decl, exp, target).expand ();
 }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c
index 520b2e59fae..a4cd67f4f95 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c
@@ -5,5 +5,5 @@
 
 size_t test_1 (size_t vl)
 {
-  return __riscv_vsetvl_e8m4 (vl); /* { dg-error {builtin function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */
+  return __riscv_vsetvl_e8m4 (vl); /* { dg-error {built-in function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */
 }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c
index 9032d9d0b43..06ed9a9eddc 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c
@@ -19,5 +19,5 @@ test_2 ()
 size_t
 test_3 (size_t vl)
 {
-  return __riscv_vsetvl_e8m4 (vl); /* { dg-error {builtin function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */
+  return __riscv_vsetvl_e8m4 (vl); /* { dg-error {built-in function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */
 }
-- 
2.34.1



[PATCH] RISC-V: Fix one unused variable in riscv_subset_list::parse

2024-03-30 Thread pan2 . li
From: Pan Li 

This patch would like to fix one unused variable as below:

../../gcc/common/config/riscv/riscv-common.cc: In static member function
'static riscv_subset_list* riscv_subset_list::parse(const char*, location_t)':
../../gcc/common/config/riscv/riscv-common.cc:1501:19: error: unused variable 'itr' [-Werror=unused-variable]
 1501 |   riscv_subset_t *itr;

The code consuming the variable was removed previously, but the variable
declaration itself was left behind.  Thus, we have an unused variable here.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_subset_list::parse):
Remove unused var decl.

Signed-off-by: Pan Li 
---
 gcc/common/config/riscv/riscv-common.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 7095f303cbb..43b7549e3ec 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1498,7 +1498,6 @@ riscv_subset_list::parse (const char *arch, location_t loc)
 return NULL;
 
   riscv_subset_list *subset_list = new riscv_subset_list (arch, loc);
-  riscv_subset_t *itr;
   const char *p = arch;
   p = subset_list->parse_base_ext (p);
   if (p == NULL)
-- 
2.34.1



[PATCH v1] Internal-fn: Introduce new internal function SAT_ADD

2024-04-06 Thread pan2 . li
From: Pan Li 

This patch would like to add the middle-end representation for the
saturation add, i.e. set the result of the add to the max when it
overflows.  It will take the pattern similar as below.

SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))

Take uint8_t as example, we will have:

* SAT_ADD (1, 254)   => 255.
* SAT_ADD (1, 255)   => 255.
* SAT_ADD (2, 255)   => 255.
* SAT_ADD (255, 255) => 255.
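
As a minimal sketch of the pattern above for uint8_t (the function name
sat_add_u8 is chosen here for illustration and does not appear in the
patch), the branchless form can be written as:

```c
#include <stdint.h>

/* Unsigned saturating add for uint8_t, following the pattern
   (x + y) | (-(TYPE)((TYPE)(x + y) < x)).  */
uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  /* The truncated sum wraps modulo 256 on overflow.  */
  uint8_t sum = (uint8_t) (x + y);

  /* The wrapped sum is smaller than x exactly when the add overflowed; the
     negated comparison then becomes an all-ones mask (255).  */
  uint8_t mask = (uint8_t) -(uint8_t) (sum < x);

  return sum | mask;
}
```

The explicit casts to uint8_t matter because C promotes the operands to int
before the addition; without the truncation the comparison against x would
never see the wrapped value.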

The patch also implements SAT_ADD in the riscv backend as
the sample for both the scalar and vector.  Given the below example:

uint64_t sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
}

Before this patch:
uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  long unsigned int _1;
  _Bool _2;
  long unsigned int _3;
  long unsigned int _4;
  uint64_t _7;
  long unsigned int _10;
  __complex__ long unsigned int _11;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
  _1 = REALPART_EXPR <_11>;
  _10 = IMAGPART_EXPR <_11>;
  _2 = _10 != 0;
  _3 = (long unsigned int) _2;
  _4 = -_3;
  _7 = _1 | _4;
  return _7;
;;succ:   EXIT

}

After this patch:
uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  uint64_t _7;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
  return _7;
;;succ:   EXIT
}

For vectorization, we leverage the existing vect pattern recog to find
the pattern similar to scalar and let the vectorizer perform
the rest for the standard name usadd3 in vector mode.
The riscv vector backend has the insn "Vector Single-Width Saturating
Add and Subtract" which can be leveraged when expanding usadd3
in vector mode.  For example:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
    out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
}

Before this patch:
void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]);
  ivtmp_58 = _80 * 8;
  vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0);
  vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0);
  vect__7.11_66 = vect__4.7_61 + vect__6.10_65;
  mask__8.12_67 = vect__4.7_61 > vect__7.11_66;
  vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, vect__7.11_66);
  .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72);
  vectp_x.5_60 = vectp_x.5_59 + ivtmp_58;
  vectp_y.8_64 = vectp_y.8_63 + ivtmp_58;
  vectp_out.16_75 = vectp_out.16_74 + ivtmp_58;
  ivtmp_79 = ivtmp_78 - _80;
  ...
}

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v0,0(a1)
  vle64.v v1,0(a2)
  slli    a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vadd.vv v1,v0,v1
  vmsgtu.vv   v0,v0,v1
  vmerge.vim  v1,v1,-1,v0
  vse64.v v1,0(a0)
  ...

After this patch:
void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]);
  ivtmp_46 = _62 * 8;
  vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0);
  vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0);
  vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53);
  .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54);
  ...
}

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v1,0(a1)
  vle64.v v2,0(a2)
  slli    a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vsaddu.vv   v1,v1,v2
  vse64.v v1,0(a0)
  ...

To limit the patch size for review, only the unsigned version of
usadd3 is involved here.  The signed version will be covered
in the follow-up patch(es).

The below test suites are passed for this patch.
* The riscv full regression tests.
* The aarch64 full regression tests.
* The x86 full regression tests.

PR target/51492
PR target/112600

gcc/ChangeLog:

* config/riscv/autovec.md (usadd3): New pattern expand
for unsigned SAT_ADD vector.
* config/riscv/riscv-protos.h (riscv_expand_usadd): New func
decl to expand usadd3 pattern.
(expand_vec_usadd): Ditto but for vector.
* config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to
emit the vsadd insn.
(expand_vec_usadd): New func impl to expand usadd3 for
vector.
* config/riscv/riscv.cc (riscv_expand_usadd): New func impl
to expand usadd3 for scalar.
* config/riscv/riscv.md (usadd3): New pattern expand
for unsigned SAT_ADD scalar.
* config/riscv/vector.md: Allow VLS mode for vsaddu.
* internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD.
* internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD.
* match.pd: Add unsigned SAT_ADD match and simplify.
* optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd.
   

[PATCH v2] Internal-fn: Introduce new internal function SAT_ADD

2024-04-07 Thread pan2 . li
From: Pan Li 

Update in v2:
* Fix one failure for x86 bootstrap.

Original log:

This patch would like to add the middle-end representation for the
saturation add, i.e. set the result of the add to the max when it
overflows.  It will take the pattern similar as below.

SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))

Take uint8_t as example, we will have:

* SAT_ADD (1, 254)   => 255.
* SAT_ADD (1, 255)   => 255.
* SAT_ADD (2, 255)   => 255.
* SAT_ADD (255, 255) => 255.

The patch also implements SAT_ADD in the riscv backend as
the sample for both the scalar and vector.  Given the below example:

uint64_t sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
}

Before this patch:
uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  long unsigned int _1;
  _Bool _2;
  long unsigned int _3;
  long unsigned int _4;
  uint64_t _7;
  long unsigned int _10;
  __complex__ long unsigned int _11;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
  _1 = REALPART_EXPR <_11>;
  _10 = IMAGPART_EXPR <_11>;
  _2 = _10 != 0;
  _3 = (long unsigned int) _2;
  _4 = -_3;
  _7 = _1 | _4;
  return _7;
;;succ:   EXIT

}

After this patch:
uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  uint64_t _7;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
  return _7;
;;succ:   EXIT
}
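
The .ADD_OVERFLOW form in the "before" dump corresponds roughly to the
__builtin_add_overflow built-in of GCC and Clang.  The sketch below puts
that form next to the branchless idiom so the two can be compared; the name
sat_add_u64_overflow is hypothetical and only used for illustration.

```c
#include <stdint.h>

/* Branchless idiom from the patch description.  */
uint64_t
sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
}

/* Roughly what the "before" GIMPLE computes: take the wrapped sum and the
   overflow flag, then OR the sum with the negated flag.  */
uint64_t
sat_add_u64_overflow (uint64_t x, uint64_t y)
{
  uint64_t sum;
  uint64_t overflow = __builtin_add_overflow (x, y, &sum);

  /* overflow is 0 or 1, so -overflow is 0 or all ones.  */
  return sum | -overflow;
}
```

Both functions are expected to return the same value for every input, which
is why a single .SAT_ADD call can stand in for the longer sequence.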

For vectorization, we leverage the existing vect pattern recog to find
the pattern similar to scalar and let the vectorizer perform
the rest for the standard name usadd3 in vector mode.
The riscv vector backend has the insn "Vector Single-Width Saturating
Add and Subtract" which can be leveraged when expanding usadd3
in vector mode.  For example:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
    out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
}

Before this patch:
void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]);
  ivtmp_58 = _80 * 8;
  vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0);
  vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0);
  vect__7.11_66 = vect__4.7_61 + vect__6.10_65;
  mask__8.12_67 = vect__4.7_61 > vect__7.11_66;
  vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, vect__7.11_66);
  .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72);
  vectp_x.5_60 = vectp_x.5_59 + ivtmp_58;
  vectp_y.8_64 = vectp_y.8_63 + ivtmp_58;
  vectp_out.16_75 = vectp_out.16_74 + ivtmp_58;
  ivtmp_79 = ivtmp_78 - _80;
  ...
}

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v0,0(a1)
  vle64.v v1,0(a2)
  slli    a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vadd.vv v1,v0,v1
  vmsgtu.vv   v0,v0,v1
  vmerge.vim  v1,v1,-1,v0
  vse64.v v1,0(a0)
  ...

After this patch:
void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]);
  ivtmp_46 = _62 * 8;
  vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0);
  vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0);
  vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53);
  .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54);
  ...
}

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v1,0(a1)
  vle64.v v2,0(a2)
  slli    a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vsaddu.vv   v1,v1,v2
  vse64.v v1,0(a0)
  ...

To limit the patch size for review, only the unsigned version of
usadd3 is involved here.  The signed version will be covered
in the follow-up patch(es).

The below test suites are passed for this patch.
* The riscv full regression tests.
* The aarch64 full regression tests.
* The x86 bootstrap tests.
* The x86 full regression tests.

PR target/51492
PR target/112600

gcc/ChangeLog:

* config/riscv/autovec.md (usadd3): New pattern expand
for unsigned SAT_ADD vector.
* config/riscv/riscv-protos.h (riscv_expand_usadd): New func
decl to expand usadd3 pattern.
(expand_vec_usadd): Ditto but for vector.
* config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to
emit the vsadd insn.
(expand_vec_usadd): New func impl to expand usadd3 for
vector.
* config/riscv/riscv.cc (riscv_expand_usadd): New func impl
to expand usadd3 for scalar.
* config/riscv/riscv.md (usadd3): New pattern expand
for unsigned SAT_ADD scalar.
* config/riscv/vector.md: Allow VLS mode for vsaddu.
* internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD.
* internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD.
* match.pd: Add unsigned SAT_ADD matc

[PATCH v1] RISC-V: Refine the error msg for RVV intrinsic required ext

2024-04-08 Thread pan2 . li
From: Pan Li 

The RVV intrinsic API has sorts of required extensions from either
the march or the target attribute.  It will have an error message similar
to below:

built-in function '__riscv_vsetvl_e8m4(vl)' requires the V ISA extension

However, it is not accurate as we have many additional sub extensions
besides the v extension.  For example, zvbb, zvbk, zvbc, etc.  This patch
would like to refine the error message with a friendly hint for the
required extension.  For example as below:

vuint64m1_t
__attribute__((target("arch=+v")))
test_1 (vuint64m1_t op_1, vuint64m1_t op_2, size_t vl)
{
  return __riscv_vclmul_vv_u64m1 (op_1, op_2, vl);
}

When compiled with -march=rv64gc and target arch=+v, we will have the
error message as below:

error: built-in function '__riscv_vclmul_vv_u64m1(op_1,  op_2,  vl)'
  requires the 'zvbc' ISA extension

Then the end-user will easily see that the *zvbc* extension is missing
for the intrinsic API.
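
The refined message can be sketched as a lookup from a required-extension
enum to its ISA name followed by formatting.  This is only a rough model:
the enum required_ext_sketch and both helper names are hypothetical, the
enum lists just three extensions, and the real patch reports the error
through the GCC diagnostic machinery rather than snprintf.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of the required extensions.  */
enum required_ext_sketch
{
  REQ_V,
  REQ_ZVBB,
  REQ_ZVBC
};

/* Map the enum value to the extension name used in the message.  */
const char *
required_ext_name (enum required_ext_sketch ext)
{
  switch (ext)
    {
    case REQ_V:
      return "v";
    case REQ_ZVBB:
      return "zvbb";
    case REQ_ZVBC:
      return "zvbc";
    }
  return "unknown";
}

/* Format the refined message for call text FN into BUF of LEN bytes.  */
int
format_required_ext_error (char *buf, size_t len, const char *fn,
			   enum required_ext_sketch ext)
{
  return snprintf (buf, len,
		   "built-in function '%s' requires the '%s' ISA extension",
		   fn, required_ext_name (ext));
}
```

Keeping the name lookup separate from the formatting means a new extension
only needs one more case in the switch.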

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-shapes.cc (build_one): Pass
required_ext arg when invoke add function.
(build_th_loadstore): Ditto.
(struct vcreate_def): Ditto.
(struct read_vl_def): Ditto.
(struct vlenb_def): Ditto.
* config/riscv/riscv-vector-builtins.cc (function_builder::add_function):
Introduce new arg required_ext to fill in the register func.
(function_builder::add_unique_function): Ditto.
(function_builder::add_overloaded_function): Ditto.
(expand_builtin): Leverage required_extensions_specified to
check if the required extension is provided.
* config/riscv/riscv-vector-builtins.h (reqired_ext_to_isa_name): New
func impl to convert the required_ext enum to the extension name.
(required_extensions_specified): New func impl to predicate if
the required extension is provided.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c:
Adjust the error message for v extension.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c:
Ditto.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-1.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-10.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-2.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-3.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-4.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-5.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-6.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-7.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-8.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-9.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/riscv-vector-builtins-shapes.cc | 18 +++--
 gcc/config/riscv/riscv-vector-builtins.cc | 23 --
 gcc/config/riscv/riscv-vector-builtins.h  | 75 ++-
 .../riscv/rvv/base/intrinsic_required_ext-1.c | 10 +++
 .../rvv/base/intrinsic_required_ext-10.c  | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-2.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-3.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-4.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-5.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-6.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-7.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-8.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-9.c | 11 +++
 .../target_attribute_v_with_intrinsic-7.c |  2 +-
 .../target_attribute_v_with_intrinsic-8.c |  2 +-
 15 files changed, 210 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-9.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index c5ffcc1f2c4..7f983e82370 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -72,9 +72,10 @@ build_one (function_builder

[PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch

2024-04-10 Thread pan2 . li
From: Pan Li 

This patch fixes an ICE in the mode switching pass (mode_sw) for the
example code below.

during RTL pass: mode_sw
test.c: In function ‘vbool16_t j(vuint64m4_t)’:
test.c:15:1: internal compiler error: in create_pre_exit, at
mode-switching.cc:451
   15 | }
  | ^
0x3978f12 create_pre_exit
__RISCV_BUILD__/../gcc/mode-switching.cc:451
0x3979e9e optimize_mode_switching
__RISCV_BUILD__/../gcc/mode-switching.cc:849
0x397b9bc execute
__RISCV_BUILD__/../gcc/mode-switching.cc:1324

extern size_t get_vl ();

vbool16_t
test (vuint64m4_t a)
{
  unsigned long b;
  return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ());
}

create_pre_exit expects to find a copy of the return value.  When it
finds none, it asserts that one of a few known reasons applies, and
none of them holds for the sample code above when the vector calling
convention is enabled by default.  This patch overrides
TARGET_FUNCTION_VALUE_REGNO_P to accept the vector return register, so
that hard_regno_nregs provides copy_num, i.e. a return value copy is
found.

As a side effect of allowing a vector register in
TARGET_FUNCTION_VALUE_REGNO_P, TARGET_GET_RAW_RESULT_MODE will see a
vector mode, which is sizeless and cannot be converted to
fixed_size_mode.  Thus also override the hook TARGET_GET_RAW_RESULT_MODE
and return VOIDmode when the raw mode of the regno is not a
fixed_size_mode.

The tests below pass with this patch.
* The full riscv regression tests.
* The reproducing test in bugzilla PR114639.

PR target/114639

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_function_value_regno_p): New func
impl for hook TARGET_FUNCTION_VALUE_REGNO_P.
(riscv_get_raw_result_mode): New func impl for hook
TARGET_GET_RAW_RESULT_MODE.
(TARGET_FUNCTION_VALUE_REGNO_P): Impl the hook.
(TARGET_GET_RAW_RESULT_MODE): Ditto.
* config/riscv/riscv.h (V_RETURN): New macro for vector return.
(GP_RETURN_FIRST): New macro for the first GPR in return.
(GP_RETURN_LAST): New macro for the last GPR in return.
(FP_RETURN_FIRST): Ditto but for FPR.
(FP_RETURN_LAST): Ditto.
(FUNCTION_VALUE_REGNO_P): Remove as deprecated and replace by
TARGET_FUNCTION_VALUE_REGNO_P.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/pr114639-1.C: New test.
* gcc.target/riscv/rvv/base/pr114639-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 34 +++
 gcc/config/riscv/riscv.h  |  8 +++--
 .../g++.target/riscv/rvv/base/pr114639-1.C| 25 ++
 .../gcc.target/riscv/rvv/base/pr114639-1.c| 14 
 4 files changed, 79 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/pr114639-1.C
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114639-1.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 00defa69fd8..91f017dd52a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -10997,6 +10997,34 @@ riscv_vector_mode_supported_any_target_p (machine_mode)
   return true;
 }
 
+/* Implements hook TARGET_FUNCTION_VALUE_REGNO_P.  */
+
+static bool
+riscv_function_value_regno_p (const unsigned regno)
+{
+  if (GP_RETURN_FIRST <= regno && regno <= GP_RETURN_LAST)
+return true;
+
+  if (FP_RETURN_FIRST <= regno && regno <= FP_RETURN_LAST)
+return true;
+
+  if (regno == V_RETURN)
+return true;
+
+  return false;
+}
+
+/* Implements hook TARGET_GET_RAW_RESULT_MODE.  */
+
+static fixed_size_mode
+riscv_get_raw_result_mode (int regno)
+{
+  if (!is_a <fixed_size_mode> (reg_raw_mode[regno]))
+return as_a <fixed_size_mode> (VOIDmode);
+
+  return default_get_reg_raw_mode (regno);
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -11343,6 +11371,12 @@ riscv_vector_mode_supported_any_target_p (machine_mode)
 #undef TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P
 #define TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P 
riscv_vector_mode_supported_any_target_p
 
+#undef TARGET_FUNCTION_VALUE_REGNO_P
+#define TARGET_FUNCTION_VALUE_REGNO_P riscv_function_value_regno_p
+
+#undef TARGET_GET_RAW_RESULT_MODE
+#define TARGET_GET_RAW_RESULT_MODE riscv_get_raw_result_mode
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-riscv.h"
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 269b8c1f076..7797e67317a 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -683,6 +683,12 @@ enum reg_class
 
 #define GP_RETURN GP_ARG_FIRST
 #define FP_RETURN (UNITS_PER_FP_ARG == 0 ? GP_RETURN : FP_ARG_FIRST)
+#define V_RETURN  V_REG_FIRST
+
+#define GP_RETURN_FIRST GP_ARG_FIRST
+#define GP_RETURN_LAST  GP_ARG_FIRST + 1
+#define FP_RETURN_FIRST FP_RETURN
+#define FP_RETURN_LAST  FP_RETURN + 1
 
 #define MAX_ARGS_IN_REGISTERS \
   (riscv_abi == ABI_ILP32E || riscv_abi == ABI_LP64E \
@@ -714,8 +720,6 @@ enum reg_class
 #define FUNCTION_VALUE(VALTYPE, FUNC) \
   riscv_function_value (VALTY

[PATCH v1] RISC-V: Remove -Wno-psabi for test build option [NFC]

2024-04-10 Thread pan2 . li
From: Pan Li 

Just noticed that some test cases still pass the -Wno-psabi option,
which is deprecated now.  Remove it from all riscv test cases.

The test below passes with this patch.
* The riscv rvv regression test.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/pr109244.C: Remove deprecated
-Wno-psabi option.
* g++.target/riscv/rvv/base/pr109535.C: Ditto.
* gcc.target/riscv/rvv/autovec/fixed-vlmax-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c: Ditto.

Signed-off-by: Pan Li 
---
 gcc/testsuite/g++.target/riscv/rvv/base/pr109244.C  | 2 +-
 gcc/testsuite/g++.target/riscv/rvv/base/pr109535.C  | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/fixed-vlmax-1.c  | 2 +-
 .../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-1.c | 2 +-
 .../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-2.

[PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when empty args

2024-02-06 Thread pan2 . li
From: Pan Li 

There is one corner case, similar to the example below:

void test (void)
{
  __riscv_vfredosum_tu ();
}

It triggers an ICE because of the implementation details of overloaded
functions in gcc.  According to the rvv intrinsic doc, there is no such
overloaded function with empty args.  Unfortunately, we register the
empty args function as overloaded to avoid conflicts.  Thus, a
registered function still exists after NULL_TREE is returned to the
middle-end, which finally results in an ICE when expanding.  For example:

1. First we register void __riscv_vfredmax () as the overloaded function.
2. Then resolve_overloaded_builtin (this func) returns NULL_TREE.
3. The function registered in step 1 bypasses the args check as it has
   empty args.
4. Finally, we fall into expand_builtin with empty args and hit the ICE.

Here we report an error when an overloaded function is called with
empty args.  For example:

test.c: In function 'foo':
test.c:8:3: error: no matching function call to '__riscv_vfredosum_tu' with empty args
8 |   __riscv_vfredosum_tu();
  |   ^~~~
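
The check described above can be modelled in a few lines of standalone
C.  This is only a sketch under stated assumptions: struct
sketch_builtin and sketch_reject_empty_call are hypothetical names, and
the real resolver works on GCC trees and argument vectors rather than
on a plain argument count.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record for a registered builtin.  */
struct sketch_builtin
{
  const char *name;
  int overloaded_p;
};

/* Return 1 and print a diagnostic to ERR when an overloaded builtin is
   called with no arguments.  Return 0 when the call may proceed to the
   normal overload resolution.  */
int
sketch_reject_empty_call (const struct sketch_builtin *fn, size_t nargs,
			  FILE *err)
{
  if (fn->overloaded_p && nargs == 0)
    {
      fprintf (err,
	       "error: no matching function call to '%s' with empty args\n",
	       fn->name);
      return 1;
    }

  return 0;
}
```

Rejecting the call early means the empty-args placeholder registered
for the overloaded name is diagnosed instead of silently passing the
argument check.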

The tests below pass with this patch.

* The riscv regression tests.

PR target/113766

gcc/ChangeLog:

* config/riscv/riscv-protos.h (resolve_overloaded_builtin): Adjust
the signature of func.
* config/riscv/riscv-c.cc (riscv_resolve_overloaded_builtin): Ditto.
* config/riscv/riscv-vector-builtins.cc (resolve_overloaded_builtin):
Make overloaded func with empty args error.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr113766-1.c: New test.
* gcc.target/riscv/rvv/base/pr113766-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-c.cc   |  3 +-
 gcc/config/riscv/riscv-protos.h   |  2 +-
 gcc/config/riscv/riscv-vector-builtins.cc | 23 -
 .../gcc.target/riscv/rvv/base/pr113766-1.c| 85 +++
 .../gcc.target/riscv/rvv/base/pr113766-2.c| 48 +++
 5 files changed, 155 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-2.c

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 2e306057347..94c3871c760 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -250,7 +250,8 @@ riscv_resolve_overloaded_builtin (unsigned int 
uncast_location, tree fndecl,
 case RISCV_BUILTIN_GENERAL:
   break;
 case RISCV_BUILTIN_VECTOR:
-  new_fndecl = riscv_vector::resolve_overloaded_builtin (subcode, arglist);
+  new_fndecl = riscv_vector::resolve_overloaded_builtin (loc, subcode,
+fndecl, arglist);
   break;
 default:
   gcc_unreachable ();
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index b3f0bdb9924..ae1685850ac 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -560,7 +560,7 @@ gimple *gimple_fold_builtin (unsigned int, 
gimple_stmt_iterator *, gcall *);
 rtx expand_builtin (unsigned int, tree, rtx);
 bool check_builtin_call (location_t, vec, unsigned int,
   tree, unsigned int, tree *);
-tree resolve_overloaded_builtin (unsigned int, vec *);
+tree resolve_overloaded_builtin (location_t, unsigned int, tree, vec *);
 bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 bool legitimize_move (rtx, rtx *);
 void emit_vlmax_vsetvl (machine_mode, rtx);
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 403e1021fd1..efcdc8f1767 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4606,7 +4606,8 @@ check_builtin_call (location_t location, vec, 
unsigned int code,
 }
 
 tree
-resolve_overloaded_builtin (unsigned int code, vec *arglist)
+resolve_overloaded_builtin (location_t loc, unsigned int code, tree fndecl,
+   vec *arglist)
 {
   if (code >= vec_safe_length (registered_functions))
 return NULL_TREE;
@@ -4616,12 +4617,26 @@ resolve_overloaded_builtin (unsigned int code, 
vec *arglist)
   if (!rfun || !rfun->overloaded_p)
 return NULL_TREE;
 
+  /* According to the rvv intrinisc doc, we have no such overloaded function
+ with empty args.  Unfortunately, we register the empty args function as
+ overloaded for avoiding conflict.  Thus, there will actual one register
+ function after return NULL_TREE back to the middle-end, and finally result
+ in ICE when expanding.  For example:
+
+ 1. First we registered void __riscv_vfredmax () as the overloaded 
function.
+ 2. Then resolve_overloaded_builtin (this func) return NULL_TREE.
+ 3. The functions register in step 1 bypass the args check as empty args.
+ 4. Finally, fall into expand_builtin with empty args and meet ICE.
+
+ Here we report error whe

[PATCH v1] RISC-V: Bugfix for RVV overloaded intrinsic ICE in function checker

2024-02-07 Thread pan2 . li
From: Pan Li 

There is another corner case, similar to the example below:

void test (void)
{
  __riscv_vaadd ();
}

We report an error when an overloaded function is called with empty
args.  For example:

test.c: In function 'foo':
test.c:8:3: error: no matching function call to '__riscv_vaadd' with empty args
8 |   __riscv_vaadd ();
  |   ^~~~

Unfortunately, another ICE similar to the one below follows the above
message.  The underlying function checker is built with zero args,
which breaks some of its assumptions, for example that the count of
args is not less than 2.

ice.c: In function ‘foo’:
ice.c:8:3: internal compiler error: in require_immediate, at
config/riscv/riscv-vector-builtins.cc:4252
8 |   __riscv_vaadd ();
  |   ^
0x20b36ac riscv_vector::function_checker::require_immediate(unsigned int, long, long) const
.../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins.cc:4252
0x20b890c riscv_vector::alu_def::check(riscv_vector::function_checker&) const
.../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins-shapes.cc:387
0x20b38d7 riscv_vector::function_checker::check()
.../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins.cc:4315
0x20b4876 riscv_vector::check_builtin_call(unsigned int, vec,
.../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins.cc:4605
0x2069393 riscv_check_builtin_call
.../__RISC-V_BUILD__/../gcc/config/riscv/riscv-c.cc:227
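
The broken assumption can be reproduced in isolation.  The sketch below
assumes the argument count is an unsigned value, as the
"unsigned int frm_num = c.arg_num () - 2" line in the diff suggests.
Both function names are hypothetical and exist only for illustration.

```c
#include <limits.h>

/* Unguarded index of the rounding-mode operand.  With fewer than two
   args the unsigned subtraction wraps around to a huge index.  */
unsigned int
sketch_frm_index_unguarded (unsigned int arg_num)
{
  return arg_num - 2;
}

/* Guarded variant.  Store the index in *INDEX and return 1 only when
   there are enough args, otherwise return 0 and leave *INDEX alone.  */
int
sketch_frm_index (unsigned int arg_num, unsigned int *index)
{
  if (arg_num < 2)
    return 0;

  *index = arg_num - 2;
  return 1;
}
```

For an empty call, sketch_frm_index_unguarded (0) yields UINT_MAX - 1,
which is the kind of out-of-range operand index that an "arg_num >= 2"
guard rules out.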

The tests below pass with this patch.

* The riscv regression tests.

PR target/113766

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-shapes.cc (struct alu_def): Make
sure the c.arg_num is >= 2 before checking.
(struct build_frm_base): Ditto.
(struct narrow_alu_def): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr113766-1.c: Add new cases.

Signed-off-by: Pan Li 
---
 .../riscv/riscv-vector-builtins-shapes.cc   | 17 +
 .../gcc.target/riscv/rvv/base/pr113766-1.c  | 16 
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index 8e90b17a94b..c5ffcc1f2c4 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -383,7 +383,10 @@ struct alu_def : public build_base
 /* Check whether rounding mode argument is a valid immediate.  */
 if (c.base->has_rounding_mode_operand_p ())
   {
-   if (!c.any_type_float_p ())
+   /* Some invalid overload intrinsic like below will have zero for
+  c.arg_num ().  Thus, make sure arg_num is big enough here.
+  __riscv_vaadd () will make c.arg_num () == 0.  */
+   if (!c.any_type_float_p () && c.arg_num () >= 2)
  return c.require_immediate (c.arg_num () - 2, VXRM_RNU, VXRM_ROD);
/* TODO: We will support floating-point intrinsic modeling
   rounding mode in the future.  */
@@ -411,8 +414,11 @@ struct build_frm_base : public build_base
   {
 gcc_assert (c.any_type_float_p ());
 
-/* Check whether rounding mode argument is a valid immediate.  */
-if (c.base->has_rounding_mode_operand_p ())
+/* Check whether rounding mode argument is a valid immediate.
+   Some invalid overload intrinsic like below will have zero for
+   c.arg_num ().  Thus, make sure arg_num is big enough here.
+   __riscv_vaadd () will make c.arg_num () == 0.  */
+if (c.base->has_rounding_mode_operand_p () && c.arg_num () >= 2)
   {
unsigned int frm_num = c.arg_num () - 2;
 
@@ -679,7 +685,10 @@ struct narrow_alu_def : public build_base
 /* Check whether rounding mode argument is a valid immediate.  */
 if (c.base->has_rounding_mode_operand_p ())
   {
-   if (!c.any_type_float_p ())
+   /* Some invalid overload intrinsic like below will have zero for
+  c.arg_num ().  Thus, make sure arg_num is big enough here.
+  __riscv_vaadd () will make c.arg_num () == 0.  */
+   if (!c.any_type_float_p () && c.arg_num () >= 2)
  return c.require_immediate (c.arg_num () - 2, VXRM_RNU, VXRM_ROD);
/* TODO: We will support floating-point intrinsic modeling
   rounding mode in the future.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c
index bd4943b0b7e..fd674a8895c 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c
@@ -82,4 +82,20 @@ test ()
 
   __riscv_vfredosum (); /* { dg-error {no matching function call to 
'__riscv_vfredosum' with empty args} } */
   __riscv_vfredosum_tu ();  /* { dg-error {no matching function call to 
'__riscv_vfredosum_tu' with empty args} } */
+
+  __riscv_vaadd (); /* { dg-error {no matching function call to 
'__riscv_vaadd' 

[PATCH v1] RISC-V: Fix misspelled term args in error_at message

2024-02-10 Thread pan2 . li
From: Pan Li 

When building with "-Werror=format-diag", the misspelled term args is
reported as below.  This patch fixes it by using the term arguments
instead.

../../gcc/config/riscv/riscv-vector-builtins.cc: In function 'tree_node*
riscv_vector::resolve_overloaded_builtin(location_t, unsigned int, tree,
vec*)':
../../gcc/config/riscv/riscv-vector-builtins.cc:4633:65: error:
misspelled term 'args' in format; use 'arguments' instead
[-Werror=format-diag]
 4633 | error_at (loc, "no matching function call to %qE with empty
  args", fndecl);

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc (resolve_overloaded_builtin):
Replace args with arguments for the misspelled term.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr113766-1.c: Adjust the test cases.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-vector-builtins.cc |   3 +-
 .../gcc.target/riscv/rvv/base/pr113766-1.c| 126 +-
 2 files changed, 65 insertions(+), 64 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index efcdc8f1767..c5881a501d1 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4630,7 +4630,8 @@ resolve_overloaded_builtin (location_t loc, unsigned int 
code, tree fndecl,
 
  Here we report error when overloaded function with empty args.  */
   if (rfun->overloaded_p && arglist->length () == 0)
-error_at (loc, "no matching function call to %qE with empty args", fndecl);
+error_at (loc, "no matching function call to %qE with empty arguments",
+ fndecl);
 
   hashval_t hash = rfun->overloaded_hash (*arglist);
   registered_function *rfn
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c
index fd674a8895c..9e911e31117 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c
@@ -6,96 +6,96 @@
 void
 test ()
 {
-  __riscv_vand ();  /* { dg-error {no matching function call to 
'__riscv_vand' with empty args} } */
-  __riscv_vand_tu ();   /* { dg-error {no matching function call to 
'__riscv_vand_tu' with empty args} } */
-  __riscv_vand_tumu (); /* { dg-error {no matching function call to 
'__riscv_vand_tumu' with empty args} } */
+  __riscv_vand ();  /* { dg-error {no matching function call to 
'__riscv_vand' with empty arguments} } */
+  __riscv_vand_tu ();   /* { dg-error {no matching function call to 
'__riscv_vand_tu' with empty arguments} } */
+  __riscv_vand_tumu (); /* { dg-error {no matching function call to 
'__riscv_vand_tumu' with empty arguments} } */
 
-  __riscv_vcompress (); /* { dg-error {no matching function call to 
'__riscv_vcompress' with empty args} } */
-  __riscv_vcompress_tu ();  /* { dg-error {no matching function call to 
'__riscv_vcompress_tu' with empty args} } */
+  __riscv_vcompress (); /* { dg-error {no matching function call to 
'__riscv_vcompress' with empty arguments} } */
+  __riscv_vcompress_tu ();  /* { dg-error {no matching function call to 
'__riscv_vcompress_tu' with empty arguments} } */
 
-  __riscv_vcpop (); /* { dg-error {no matching function call to 
'__riscv_vcpop' with empty args} } */
+  __riscv_vcpop (); /* { dg-error {no matching function call to 
'__riscv_vcpop' with empty arguments} } */
 
-  __riscv_vdiv ();  /* { dg-error {no matching function call to 
'__riscv_vdiv' with empty args} } */
-  __riscv_vdiv_tu ();   /* { dg-error {no matching function call to 
'__riscv_vdiv_tu' with empty args} } */
-  __riscv_vdiv_tumu (); /* { dg-error {no matching function call to 
'__riscv_vdiv_tumu' with empty args} } */
+  __riscv_vdiv ();  /* { dg-error {no matching function call to 
'__riscv_vdiv' with empty arguments} } */
+  __riscv_vdiv_tu ();   /* { dg-error {no matching function call to 
'__riscv_vdiv_tu' with empty arguments} } */
+  __riscv_vdiv_tumu (); /* { dg-error {no matching function call to 
'__riscv_vdiv_tumu' with empty arguments} } */
 
-  __riscv_vfabs (); /* { dg-error {no matching function call to 
'__riscv_vfabs' with empty args} } */
-  __riscv_vfabs_tu ();  /* { dg-error {no matching function call to 
'__riscv_vfabs_tu' with empty args} } */
-  __riscv_vfabs_tumu ();/* { dg-error {no matching function call to 
'__riscv_vfabs_tumu' with empty args} } */
+  __riscv_vfabs (); /* { dg-error {no matching function call to 
'__riscv_vfabs' with empty arguments} } */
+  __riscv_vfabs_tu ();  /* { dg-error {no matching function call to 
'__riscv_vfabs_tu' with empty arguments} } */
+  __riscv_vfabs_tumu ();/* { dg-error {no matching function call to 
'__riscv_vfabs_tumu' with empty arguments} } */
 
-  __riscv_vfadd ();

[PATCH v1] Internal-fn: Add new internal function SAT_ADDU

2024-02-17 Thread pan2 . li
From: Pan Li 

This patch adds the middle-end representation for the unsigned
saturation add, i.e. the result of the add is set to the max value
when it overflows.  It matches a pattern similar to the one below.

SAT_ADDU (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))

Take uint8_t as example, we will have:

* SAT_ADDU (1, 254)   => 255.
* SAT_ADDU (1, 255)   => 255.
* SAT_ADDU (2, 255)   => 255.
* SAT_ADDU (255, 255) => 255.
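
The uint8_t cases above can be checked with a direct C transcription of
the pattern.  This is a minimal sketch, not the code generated by the
patch, and sat_addu_u8 is a name chosen here for illustration.

```c
#include <stdint.h>

/* Branchless unsigned saturating add for uint8_t, following
   (x + y) | (-(TYPE)((TYPE)(x + y) < x)).  */
uint8_t
sat_addu_u8 (uint8_t x, uint8_t y)
{
  /* The truncated sum wraps modulo 256 on overflow.  */
  uint8_t sum = (uint8_t) (x + y);

  /* On overflow the wrapped sum is smaller than x, so the comparison
     yields 1 and its negation truncates to the all-ones mask 0xff.
     Without overflow the mask is 0 and the sum passes through.  */
  uint8_t mask = (uint8_t) -(sum < x);

  return (uint8_t) (sum | mask);
}
```

For example, sat_addu_u8 (1, 255) wraps the sum to 0, the comparison
0 < 1 sets the mask to 0xff, and the result saturates to 255.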

The patch also implements SAT_ADDU in the riscv backend as a sample.
Given the example below:

uint64_t sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
}

Before this patch:

uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  long unsigned int _1;
  _Bool _2;
  long unsigned int _3;
  long unsigned int _4;
  uint64_t _7;
  long unsigned int _10;
  __complex__ long unsigned int _11;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
  _1 = REALPART_EXPR <_11>;
  _10 = IMAGPART_EXPR <_11>;
  _2 = _10 != 0;
  _3 = (long unsigned int) _2;
  _4 = -_3;
  _7 = _1 | _4;
  return _7;
;;succ:   EXIT

}

After this patch:

uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  uint64_t _7;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _7 = .SAT_ADDU (x_5(D), y_6(D)); [tail call]
  return _7;
;;succ:   EXIT

}

Then we will have the middle-end representation like .SAT_ADDU after
this patch.

PR target/51492
PR target/112600

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_expand_saturation_addu):
New func decl for the SAT_ADDU expand.
* config/riscv/riscv.cc (riscv_expand_saturation_addu): New func
impl for the SAT_ADDU expand.
* config/riscv/riscv.md (sat_addu_3): New pattern to impl
the standard name SAT_ADDU.
* doc/md.texi: Add doc for SAT_ADDU.
* internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADDU.
* internal-fn.def (SAT_ADDU): Add SAT_ADDU.
* match.pd: Add simplify pattern patch for SAT_ADDU.
* optabs.def (OPTAB_D): Add sat_addu_optab.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_addu-1.c: New test.
* gcc.target/riscv/sat_addu-2.c: New test.
* gcc.target/riscv/sat_addu-3.c: New test.
* gcc.target/riscv/sat_addu-4.c: New test.
* gcc.target/riscv/sat_addu-run-1.c: New test.
* gcc.target/riscv/sat_addu-run-2.c: New test.
* gcc.target/riscv/sat_addu-run-3.c: New test.
* gcc.target/riscv/sat_addu-run-4.c: New test.
* gcc.target/riscv/sat_arith.h: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-protos.h   |  1 +
 gcc/config/riscv/riscv.cc | 46 +
 gcc/config/riscv/riscv.md | 11 +
 gcc/doc/md.texi   | 11 +
 gcc/internal-fn.cc|  1 +
 gcc/internal-fn.def   |  1 +
 gcc/match.pd  | 22 +
 gcc/optabs.def|  2 +
 gcc/testsuite/gcc.target/riscv/sat_addu-1.c   | 18 +++
 gcc/testsuite/gcc.target/riscv/sat_addu-2.c   | 20 
 gcc/testsuite/gcc.target/riscv/sat_addu-3.c   | 17 +++
 gcc/testsuite/gcc.target/riscv/sat_addu-4.c   | 16 ++
 .../gcc.target/riscv/sat_addu-run-1.c | 42 
 .../gcc.target/riscv/sat_addu-run-2.c | 42 
 .../gcc.target/riscv/sat_addu-run-3.c | 42 
 .../gcc.target/riscv/sat_addu-run-4.c | 49 +++
 gcc/testsuite/gcc.target/riscv/sat_arith.h| 15 ++
 17 files changed, 356 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sat_arith.h

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index ae1685850ac..f201b2384f9 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -132,6 +132,7 @@ extern void riscv_asm_output_external (FILE *, const tree, 
const char *);
 extern bool
 riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
 extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
+extern void riscv_expand_saturation_addu (rtx, rtx, rtx);
 
 #ifdef RTX_CODE
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool 
*invert_ptr = 0);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 

[PATCH v1] RISC-V: Upgrade RVV intrinsic version to 0.12

2024-02-20 Thread pan2 . li
From: Pan Li 

Upgrade the version of RVV intrinsic from 0.11 to 0.12.
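
The values checked by the tests in this patch (11000 for v0.11 and
12000 for v0.12) are consistent with an encoding of
major * 1000000 + minor * 1000.  The sketch below assumes that
encoding.  sketch_ext_version_value is a hypothetical stand-in and not
the GCC helper itself.

```c
/* Hypothetical version encoding, assumed to be
   major * 1000000 + minor * 1000.  */
int
sketch_ext_version_value (int major, int minor)
{
  return major * 1000000 + minor * 1000;
}
```

Under this assumption, user code can compare __riscv_v_intrinsic
against 12000 to select the v0.12 form of an intrinsic, as the new
test does for __riscv_vnclipu.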

PR target/114017

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Upgrade
the version to 0.12.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/predef-__riscv_v_intrinsic.c: Update the
version to 0.12.
* gcc.target/riscv/rvv/base/pr114017-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-c.cc   |  2 +-
 .../riscv/predef-__riscv_v_intrinsic.c|  2 +-
 .../gcc.target/riscv/rvv/base/pr114017-1.c| 19 +++
 3 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 3ef06dcfd2d..3755ec0b8ef 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -139,7 +139,7 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
 {
   builtin_define ("__riscv_vector");
   builtin_define_with_int_value ("__riscv_v_intrinsic",
-riscv_ext_version_value (0, 11));
+riscv_ext_version_value (0, 12));
 }
 
if (TARGET_XTHEADVECTOR)
diff --git a/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c 
b/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c
index dbbedf54f87..07f1f159a8f 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c
@@ -3,7 +3,7 @@
 
 int main () {
 
-#if __riscv_v_intrinsic != 11000
+#if __riscv_v_intrinsic != 12000
 #error "__riscv_v_intrinsic"
 #endif
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c
new file mode 100644
index 000..8eee7c68f71
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+vuint8mf2_t
+test (vuint16m1_t val, size_t shift, size_t vl)
+{
+#if __riscv_v_intrinsic == 11000
+  #warning "RVV Intrinsics v0.11"
+  return __riscv_vnclipu (val, shift, vl);
+#endif
+
+#if __riscv_v_intrinsic == 12000
+  #warning "RVV Intrinsics v0.12" /* { dg-warning "RVV Intrinsics v0.12" } */
+  return __riscv_vnclipu (val, shift, 0, vl);
+#endif
+}
+
-- 
2.34.1



[PATCH v1] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-23 Thread pan2 . li
From: Pan Li 

This patch introduces a new GCC option for RVV that specifies the
size in bits of one RVV vector register. Valid arguments to
'-mrvv-vector-bits=' are:

* 64
* 128
* 256
* 512
* 1024
* 2048
* 4096
* 8192
* 16384
* 32768
* 65536
* scalable
* zvl

1. scalable is the default value; it takes min_vlen for the
   riscv_vector_chunks.
2. zvl picks up the zvl*b from the march option. For example,
   mrvv-vector-bits will be 1024 when march=rv64gcv_zvl1024b.
3. Otherwise, the given value is used, and an error is reported if it
   is not one of the valid values above.
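
The three rules above can be sketched in plain C. This is only an illustration of the option semantics, not the code in riscv.cc; the names vector_bits_opt and resolve_vector_bits are hypothetical, and returning -1 stands in for the error report:

```c
#include <assert.h>

/* Illustrative option values: scalable and zvl are symbolic, explicit
   sizes are passed as their numeric value (e.g. 256).  */
enum vector_bits_opt
{
  VB_SCALABLE = 0,
  VB_ZVL = 1
};

/* Return the vector register size in bits for OPT, given the minimal
   vlen MIN_VLEN derived from zvl*b in -march.  Return -1 when OPT is
   not a valid argument.  */
int
resolve_vector_bits (int opt, int min_vlen)
{
  assert (min_vlen > 0);

  /* Rules 1 and 2: both take the value from -march.  */
  if (opt == VB_SCALABLE || opt == VB_ZVL)
    return min_vlen;

  /* Rule 3: an explicit size must be a power of two in [64, 65536].  */
  if (opt >= 64 && opt <= 65536 && (opt & (opt - 1)) == 0)
    return opt;

  return -1;
}
```

For example, resolve_vector_bits (256, 128) yields 256, which corresponds to the -march=rv64gcv_zvl128b -mrvv-vector-bits=256 build.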

This option may influence code generation during auto-vectorization.
For example,

void test_rvv_vector_bits (int *a, int *b, int *out)
{
  for (int i = 0; i < 8; i++)
    out[i] = a[i] + b[i];
}

It generates code similar to the following when built with
  -march=rv64gcv_zvl128b -mabi=lp64 -mrvv-vector-bits=zvl

test_rvv_vector_bits:
  ...
  vsetivli  zero,4,e32,m1,ta,ma
  vle32.v   v1,0(a0)
  vle32.v   v2,0(a1)
  vadd.vv   v1,v1,v2
  vse32.v   v1,0(a2)
  ...
  vle32.v   v1,0(a0)
  vle32.v   v2,0(a1)
  vadd.vv   v1,v1,v2
  vse32.v   v1,0(a2)

The code becomes simpler, similar to the following, when built with
  -march=rv64gcv_zvl128b -mabi=lp64 -mrvv-vector-bits=256

test_rvv_vector_bits:
  ...
  vsetivli  zero,8,e32,m2,ta,ma
  vle32.v   v2,0(a0)
  vle32.v   v4,0(a1)
  vadd.vv   v2,v2,v4
  vse32.v   v2,0(a2)

Passed the regression test of rvv.

gcc/ChangeLog:

* config/riscv/riscv-opts.h (enum rvv_vector_bits_enum): New enum for
different RVV vector bits.
* config/riscv/riscv.cc (riscv_convert_vector_bits): New func to
get the RVV vector bits, with given min_vlen.
(riscv_convert_vector_chunks): Combine the mrvv-vector-bits
option with min_vlen to RVV vector chunks.
(riscv_override_options_internal): Update comments and rename the
vector chunks.
* config/riscv/riscv.opt: Add option mrvv-vector-bits.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/rvv-vector-bits-1.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-2.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-3.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-4.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-opts.h | 16 ++
 gcc/config/riscv/riscv.cc | 49 ---
 gcc/config/riscv/riscv.opt| 47 ++
 .../riscv/rvv/base/rvv-vector-bits-1.c|  6 +++
 .../riscv/rvv/base/rvv-vector-bits-2.c| 20 
 .../riscv/rvv/base/rvv-vector-bits-3.c| 25 ++
 .../riscv/rvv/base/rvv-vector-bits-4.c|  6 +++
 7 files changed, 163 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-4.c

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 4edddbadc37..b2141190731 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -129,6 +129,22 @@ enum vsetvl_strategy_enum {
   VSETVL_OPT_NO_FUSION,
 };
 
+enum rvv_vector_bits_enum {
+  RVV_VECTOR_BITS_SCALABLE,
+  RVV_VECTOR_BITS_ZVL,
+  RVV_VECTOR_BITS_64 = 64,
+  RVV_VECTOR_BITS_128 = 128,
+  RVV_VECTOR_BITS_256 = 256,
+  RVV_VECTOR_BITS_512 = 512,
+  RVV_VECTOR_BITS_1024 = 1024,
+  RVV_VECTOR_BITS_2048 = 2048,
+  RVV_VECTOR_BITS_4096 = 4096,
+  RVV_VECTOR_BITS_8192 = 8192,
+  RVV_VECTOR_BITS_16384 = 16384,
+  RVV_VECTOR_BITS_32768 = 32768,
+  RVV_VECTOR_BITS_65536 = 65536,
+};
+
 #define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && 
TARGET_64BIT))
 
 /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5e984ee2a55..366d7ece383 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8801,13 +8801,50 @@ riscv_init_machine_status (void)
   return ggc_cleared_alloc ();
 }
 
-/* Return the VLEN value associated with -march.
+static int
+riscv_convert_vector_bits (int min_vlen)
+{
+  int rvv_bits = 0;
+
+  switch (rvv_vector_bits)
+{
+  case RVV_VECTOR_BITS_SCALABLE:
+  case RVV_VECTOR_BITS_ZVL:
+   rvv_bits = min_vlen;
+   break;
+  case RVV_VECTOR_BITS_64:
+  case RVV_VECTOR_BITS_128:
+  case RVV_VECTOR_BITS_256:
+  case RVV_VECTOR_BITS_512:
+  case RVV_VECTOR_BITS_1024:
+  case RVV_VECTOR_BITS_2048:
+  case RVV_VECTOR_BITS_4096:
+  case RVV_VECTOR_BITS_8192:
+  case RVV_VECTOR_BITS_16384:
+  case RVV_VECTOR_BITS_32768:
+  case RVV_VECTOR_BITS_65536:
+   rvv_bits = rvv_vector_bits;
+   

[PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-02-24 Thread pan2 . li
From: Pan Li 

Hi Richard & Tamar,

I tried DEF_INTERNAL_INT_EXT_FN as you suggested, mapping us_plus$a3
to the RTL representation (us_plus:m x y) in optabs.def and then
adding expand_US_PLUS in internal-fn.cc.  I am not sure whether my
understanding of DEF_INTERNAL_INT_EXT_FN is correct.

I am also not sure whether we still need DEF_INTERNAL_SIGNED_OPTAB_FN
here, given that the RTL representation already has (ss_plus:m x y)
and (us_plus:m x y).

Note this patch is a draft for validation; no tests are involved here.
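
For reference, the unsigned saturating add that us_plus stands for can be written in the branchless form quoted in the riscv.cc comment of this patch, (x + y) | -((x + y) < x). A minimal sketch in plain C; the function names are illustrative and not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned saturating add: the wrapped sum is OR-ed with an all-ones
   mask when the addition overflowed (detected by sum < x).  */
uint32_t
sat_addu_u32 (uint32_t x, uint32_t y)
{
  uint32_t sum = x + y;			/* wraps modulo 2^32 */
  return sum | -(uint32_t) (sum < x);	/* mask is 0 or 0xffffffff */
}

/* Same idea for 8 bits; the sum is truncated explicitly because the
   addition itself happens in int after integer promotion.  */
uint8_t
sat_addu_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);
  return (uint8_t) (sum | -(sum < x));
}

/* Small usage example.  */
void
sat_addu_selfcheck (void)
{
  assert (sat_addu_u32 (UINT32_MAX, 1) == UINT32_MAX);
  assert (sat_addu_u8 (200, 100) == UINT8_MAX);
}
```

The explicit truncation in the 8-bit variant mirrors the "truncate sum for HI and QI" step mentioned in the expander.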

gcc/ChangeLog:

* builtins.def (BUILT_IN_US_PLUS): Add builtin def.
(BUILT_IN_US_PLUSIMAX): Ditto.
(BUILT_IN_US_PLUSL): Ditto.
(BUILT_IN_US_PLUSLL): Ditto.
(BUILT_IN_US_PLUSG): Ditto.
* config/riscv/riscv-protos.h (riscv_expand_us_plus): Add new
func decl for expanding us_plus.
* config/riscv/riscv.cc (riscv_expand_us_plus): Add new func
impl for expanding us_plus.
* config/riscv/riscv.md (us_plus3): Add new pattern impl
us_plus3.
* internal-fn.cc (expand_US_PLUS): Add new func impl to expand
US_PLUS.
* internal-fn.def (US_PLUS): Add new INT_EXT_FN.
* internal-fn.h (expand_US_PLUS): Add new func decl.
* match.pd: Add new simplify pattern for us_plus.
* optabs.def (OPTAB_NL): Add new OPTAB_NL to US_PLUS rtl.

Signed-off-by: Pan Li 
---
 gcc/builtins.def|  7 +
 gcc/config/riscv/riscv-protos.h |  1 +
 gcc/config/riscv/riscv.cc   | 46 +
 gcc/config/riscv/riscv.md   | 11 
 gcc/internal-fn.cc  | 26 +++
 gcc/internal-fn.def |  3 +++
 gcc/internal-fn.h   |  1 +
 gcc/match.pd| 17 
 gcc/optabs.def  |  2 ++
 9 files changed, 114 insertions(+)

diff --git a/gcc/builtins.def b/gcc/builtins.def
index f6f3e104f6a..0777b912cfa 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -1055,6 +1055,13 @@ DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTIMAX, 
"popcountimax", BT_FN_INT_UINTMAX
 DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTL, "popcountl", BT_FN_INT_ULONG, 
ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTLL, "popcountll", 
BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTG, "popcountg", BT_FN_INT_VAR, 
ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
+
+DEF_GCC_BUILTIN(BUILT_IN_US_PLUS, "us_plus", BT_FN_INT_UINT, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_US_PLUSIMAX, "us_plusimax", 
BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_US_PLUSL, "us_plusl", BT_FN_INT_ULONG, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_US_PLUSLL, "us_plusll", BT_FN_INT_ULONGLONG, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_US_PLUSG, "us_plusg", BT_FN_INT_VAR, 
ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
+
 DEF_EXT_LIB_BUILTIN(BUILT_IN_POSIX_MEMALIGN, "posix_memalign", 
BT_FN_INT_PTRPTR_SIZE_SIZE, ATTR_NOTHROW_NONNULL_LEAF)
 DEF_GCC_BUILTIN(BUILT_IN_PREFETCH, "prefetch", 
BT_FN_VOID_CONST_PTR_VAR, ATTR_NOVOPS_LEAF_LIST)
 DEF_LIB_BUILTIN(BUILT_IN_REALLOC, "realloc", BT_FN_PTR_PTR_SIZE, 
ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 80efdf2b7e5..ba6086f1f25 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -132,6 +132,7 @@ extern void riscv_asm_output_external (FILE *, const tree, 
const char *);
 extern bool
 riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
 extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
+extern void riscv_expand_us_plus (rtx, rtx, rtx);
 
 #ifdef RTX_CODE
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool 
*invert_ptr = 0);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 4100abc9dd1..23f08974f07 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -10657,6 +10657,52 @@ riscv_vector_mode_supported_any_target_p (machine_mode)
   return true;
 }
 
+/* Emit insn for the saturation addu, aka (x + y) | - ((x + y) < x).  */
+void
+riscv_expand_us_plus (rtx dest, rtx x, rtx y)
+{
+  machine_mode mode = GET_MODE (dest);
+  rtx pmode_sum = gen_reg_rtx (Pmode);
+  rtx pmode_lt = gen_reg_rtx (Pmode);
+  rtx pmode_x = gen_lowpart (Pmode, x);
+  rtx pmode_y = gen_lowpart (Pmode, y);
+  rtx pmode_dest = gen_reg_rtx (Pmode);
+
+  /* Step-1: sum = x + y  */
+  if (mode == SImode && mode != Pmode)
+{ /* Take addw to avoid the sum truncate.  */
+  rtx simode_sum = gen_reg_rtx (SImode);
+  riscv_emit_binary (PLUS, simode_sum, x, y);
+  emit_move_insn (pmode_sum, gen_lowpart (Pmode, simode_sum));
+}
+  else
+riscv_emit_binary (PLUS, pmode_sum, pmode_x, pmode_y);
+
+  /* Step-1.1: truncate sum for HI and QI as we have no insn for add QI/HI.  

[PATCH v1] RTL: Bugfix ICE after allow vector type in DSE

2024-02-25 Thread pan2 . li
From: Pan Li 

We previously allowed vector types in get_stored_val when the read is
less than or equal to the store.  Unfortunately, we missed adjusting
the validate_subreg part accordingly.  For vector types, we don't need
to require that the mode size be at least the vector register size.

Thus, for example, gen_lowpart from E_V2SFmode to E_V4QImode returns
NULL_RTX (and of course ICEs after that) because the mode size is less
than the vector register size.  That also explains why gen_lowpart
from E_V8SFmode to E_V16QImode is valid here.

This patch removes the restriction for vector modes, to get rid of the
ICE in gen_lowpart caused by the validate_subreg failure.
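
The effect on the one size clause touched by this patch can be sketched with plain integers. This is a simplification under stated assumptions: sizes are bytes rather than poly_int values, the register size is taken to be 16 bytes, and the function names are hypothetical; validate_subreg has many other rules that are not modelled here:

```c
#include <assert.h>
#include <stdbool.h>

/* Clause before the patch: the outer mode must fill a whole register
   and must not be larger than the inner mode.  */
bool
size_clause_old (unsigned osize, unsigned isize, unsigned regsize)
{
  assert (regsize > 0);
  return osize >= regsize && isize >= osize;
}

/* Clause after the patch: the register size check is bypassed when
   either mode is a vector mode.  */
bool
size_clause_new (unsigned osize, unsigned isize, unsigned regsize,
		 bool imode_is_vector, bool omode_is_vector)
{
  assert (regsize > 0);
  return isize >= osize
	 && (osize >= regsize || imode_is_vector || omode_is_vector);
}
```

With V2SF = 8 bytes and V4QI = 4 bytes the old clause rejects the pair, while V8SF = 32 bytes and V16QI = 16 bytes pass, which matches the two cases described above.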

The tests below passed for this patch:

* The x86 bootstrap test.
* The full riscv regression tests.

gcc/ChangeLog:

* emit-rtl.cc (validate_subreg): Bypass register size check
if the mode is vector.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/ssa-fre-44.c: Add ftree-vectorize to trigger
the ICE.
* gcc.target/riscv/rvv/base/bug-6.c: New test.

Signed-off-by: Pan Li 
---
 gcc/emit-rtl.cc   |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c|  2 +-
 .../gcc.target/riscv/rvv/base/bug-6.c | 22 +++
 3 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 1856fa4884f..45c6301b487 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -934,7 +934,8 @@ validate_subreg (machine_mode omode, machine_mode imode,
 ;
   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
  is the culprit here, and not the backends.  */
-  else if (known_ge (osize, regsize) && known_ge (isize, osize))
+  else if (known_ge (isize, osize) && (known_ge (osize, regsize)
+|| (VECTOR_MODE_P (imode) || VECTOR_MODE_P (omode))))
 ;
   /* Allow component subregs of complex and vector.  Though given the below
  extraction rules, it's not always clear what that means.  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
index f79b4c142ae..624a00a4f32 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre1" } */
+/* { dg-options "-O -fdump-tree-fre1 -O3 -ftree-vectorize" } */
 
 struct A { float x, y; };
 struct B { struct A u; };
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c
new file mode 100644
index 000..5bb00b8f587
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c
@@ -0,0 +1,22 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */
+
+struct A { float x, y; };
+struct B { struct A u; };
+
+extern void bar (struct A *);
+
+float
+f3 (struct B *x, int y)
+{
+  struct A p = {1.0f, 2.0f};
+  struct A *q = &x[y].u;
+
+  __builtin_memcpy (&q->x, &p.x, sizeof (float));
+  __builtin_memcpy (&q->y, &p.y, sizeof (float));
+
+  bar (&p);
+
+  return x[y].u.x + x[y].u.y;
+}
-- 
2.34.1



[PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-02-26 Thread pan2 . li
From: Pan Li 

We previously allowed vector types in get_stored_val when the read is
less than or equal to the store.  Unfortunately, we missed adjusting
the validate_subreg part accordingly.  When the vector type's size is
less than the vector register size, validate_subreg considers it
invalid.

Since validate_subreg is something of a can of worms and we are in
stage 4, we fix the issue on the DSE side instead, making sure the
subreg is valid for both the read_mode and the store_mode before
performing the actual gen_lowpart.

The tests below passed for this patch:

* The x86 bootstrap test.
* The x86 regression test.
* The riscv regression test.
* The aarch64 regression test.

gcc/ChangeLog:

* dse.cc (get_stored_val): Add validate_subreg check before
perform the gen_lowpart for rtl.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/ssa-fre-44.c: Add compile option to trigger
the ICE.
* gcc.target/riscv/rvv/base/bug-6.c: New test.

Signed-off-by: Pan Li 
---
 gcc/dse.cc|  4 +++-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c|  2 +-
 .../gcc.target/riscv/rvv/base/bug-6.c | 22 +++
 3 files changed, 26 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c

diff --git a/gcc/dse.cc b/gcc/dse.cc
index edc7a1dfecf..1596da91da0 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode 
read_mode,
 copy_rtx (store_info->const_rhs));
   else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
 && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
-&& targetm.modes_tieable_p (read_mode, store_mode))
+&& targetm.modes_tieable_p (read_mode, store_mode)
+&& validate_subreg (read_mode, store_mode, copy_rtx (store_info->rhs),
+   subreg_lowpart_offset (read_mode, store_mode)))
 read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
   else
 read_reg = extract_low_bits (read_mode, store_mode,
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
index f79b4c142ae..624a00a4f32 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre1" } */
+/* { dg-options "-O -fdump-tree-fre1 -O3 -ftree-vectorize" } */
 
 struct A { float x, y; };
 struct B { struct A u; };
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c
new file mode 100644
index 000..5bb00b8f587
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c
@@ -0,0 +1,22 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */
+
+struct A { float x, y; };
+struct B { struct A u; };
+
+extern void bar (struct A *);
+
+float
+f3 (struct B *x, int y)
+{
+  struct A p = {1.0f, 2.0f};
+  struct A *q = &x[y].u;
+
+  __builtin_memcpy (&q->x, &p.x, sizeof (float));
+  __builtin_memcpy (&q->y, &p.y, sizeof (float));
+
+  bar (&p);
+
+  return x[y].u.x + x[y].u.y;
+}
-- 
2.34.1



[PATCH v2] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-27 Thread pan2 . li
From: Pan Li 

This patch introduces a new GCC option for RVV that specifies the
size in bits of one RVV vector register. Valid arguments to
'-mrvv-vector-bits=' are:

* zvl

zvl picks up the zvl*b from the march option. For example,
mrvv-vector-bits will be 1024 when march=rv64gcv_zvl1024b.

The tests below passed for this patch.

* The full riscv regression test.

gcc/ChangeLog:

* config/riscv/riscv-opts.h (enum rvv_vector_bits_enum): New enum for
different RVV vector bits.
* config/riscv/riscv.cc (riscv_convert_vector_bits): New func to
get the RVV vector bits, with given min_vlen.
(riscv_convert_vector_chunks): Combine the mrvv-vector-bits
option with min_vlen to RVV vector chunks.
(riscv_override_options_internal): Update comments and rename the
vector chunks.
* config/riscv/riscv.opt: Add option mrvv-vector-bits.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/rvv-vector-bits-1.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-2.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-3.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-opts.h |  7 +
 gcc/config/riscv/riscv.cc | 31 +++
 gcc/config/riscv/riscv.opt| 11 +++
 .../riscv/rvv/base/rvv-vector-bits-1.c|  7 +
 .../riscv/rvv/base/rvv-vector-bits-2.c|  7 +
 .../riscv/rvv/base/rvv-vector-bits-3.c| 25 +++
 6 files changed, 82 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-3.c

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 4edddbadc37..0162e00515b 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -129,6 +129,13 @@ enum vsetvl_strategy_enum {
   VSETVL_OPT_NO_FUSION,
 };
 
+/* RVV vector bits for option -mrvv-vector-bits
+   zvl indicates take the bits of zvl*b provided by march as vector bits.
+ */
+enum rvv_vector_bits_enum {
+  RVV_VECTOR_BITS_ZVL,
+};
+
 #define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && 
TARGET_64BIT))
 
 /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5e984ee2a55..d18e5226bce 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8801,13 +8801,32 @@ riscv_init_machine_status (void)
   return ggc_cleared_alloc ();
 }
 
-/* Return the VLEN value associated with -march.
+static int
+riscv_convert_vector_bits (int min_vlen)
+{
+  int rvv_bits = 0;
+
+  switch (rvv_vector_bits)
+{
+  case RVV_VECTOR_BITS_ZVL:
+   rvv_bits = min_vlen;
+   break;
+  default:
+   gcc_unreachable ();
+}
+
+  return rvv_bits;
+}
+
+/* Return the VLEN value associated with -march and -mrvv-vector-bits.
TODO: So far we only support length-agnostic value. */
 static poly_uint16
-riscv_convert_vector_bits (struct gcc_options *opts)
+riscv_convert_vector_chunks (struct gcc_options *opts)
 {
   int chunk_num;
   int min_vlen = TARGET_MIN_VLEN_OPTS (opts);
+  int rvv_bits = riscv_convert_vector_bits (min_vlen);
+
   if (min_vlen > 32)
 {
   /* When targetting minimum VLEN > 32, we should use 64-bit chunk size.
@@ -8826,7 +8845,7 @@ riscv_convert_vector_bits (struct gcc_options *opts)
   - TARGET_MIN_VLEN = 2048bit: [256,256]
   - TARGET_MIN_VLEN = 4096bit: [512,512]
   FIXME: We currently DON'T support TARGET_MIN_VLEN > 4096bit.  */
-  chunk_num = min_vlen / 64;
+  chunk_num = rvv_bits / 64;
 }
   else
 {
@@ -8848,7 +8867,7 @@ riscv_convert_vector_bits (struct gcc_options *opts)
   if (TARGET_VECTOR_OPTS_P (opts))
 {
   if (opts->x_riscv_autovec_preference == RVV_FIXED_VLMAX)
-   return (int) min_vlen / (riscv_bytes_per_vector_chunk * 8);
+   return (int) rvv_bits / (riscv_bytes_per_vector_chunk * 8);
   else
return poly_uint16 (chunk_num, chunk_num);
 }
@@ -8920,8 +8939,8 @@ riscv_override_options_internal (struct gcc_options *opts)
   if (TARGET_VECTOR && TARGET_BIG_ENDIAN)
 sorry ("Current RISC-V GCC does not support RVV in big-endian mode");
 
-  /* Convert -march to a chunks count.  */
-  riscv_vector_chunks = riscv_convert_vector_bits (opts);
+  /* Convert -march and -mrvv-vector-bits to a chunks count.  */
+  riscv_vector_chunks = riscv_convert_vector_chunks (opts);
 }
 
 /* Implement TARGET_OPTION_OVERRIDE.  */
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 20685c42aed..42ea8efd05d 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -607,3 +607,14 @@ Enum(stringop_strategy) String(vector) 
Value(ST

[PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread pan2 . li
From: Pan Li 

This patch introduces a new GCC option for RVV that specifies the
size in bits of one RVV vector register. Valid arguments to
'-mrvv-vector-bits=' are:

* scalable
* zvl

scalable takes the zvl*b in the march as the minimal vlen.
For example, the minimal vlen will be 512 when
march=rv64gcv_zvl512b and mrvv-vector-bits=scalable.

zvl takes the zvl*b in the march as the exact vlen.
For example, the vlen will be exactly 1024 when
march=rv64gcv_zvl1024b and mrvv-vector-bits=zvl.
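
The difference between the two arguments can be stated as a predicate on the hardware vlen. A minimal sketch; the enum and function names are hypothetical and only illustrate the "minimal" versus "exact" wording above:

```c
#include <assert.h>
#include <stdbool.h>

enum vector_bits_mode
{
  VB_SCALABLE,	/* zvl*b is the minimal vlen */
  VB_ZVL	/* zvl*b is the exact vlen */
};

/* Return true when code built for ZVL_BITS under MODE may assume it
   runs on hardware whose vector register size is HW_VLEN bits.  */
bool
vlen_matches (enum vector_bits_mode mode, unsigned zvl_bits,
	      unsigned hw_vlen)
{
  assert (zvl_bits > 0);
  return mode == VB_ZVL ? hw_vlen == zvl_bits : hw_vlen >= zvl_bits;
}
```

For instance, code built with zvl512b and scalable is still valid on a 1024-bit implementation, while code built with zvl1024b and zvl assumes exactly 1024 bits.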

Given below sample:

void test_rvv_vector_bits ()
{
  vint32m1_t x;
  asm volatile ("def %0": "=vr"(x));
  asm volatile (""::: "v0",   "v1",  "v2",  "v3",  "v4",  "v5",  "v6",  "v7",
  "v8",   "v9", "v10", "v11", "v12", "v13", "v14", "v15",
  "v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23",
  "v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31");
  asm volatile ("use %0": : "vr"(x));
}

With -march=rv64gcv_zvl128b -mrvv-vector-bits=scalable we have (for min_vlen >= 128)
  csrr        t0,vlenb
  sub         sp,sp,t0
  def         v1
  vs1r.v      v1,0(sp)
  vl1re32.v   v1,0(sp)
  use         v1
  csrr        t0,vlenb
  add         sp,sp,t0
  jr          ra

With -march=rv64gcv_zvl128b -mrvv-vector-bits=zvl we have (for vlen = 128)
  addi        sp,sp,-16
  def         v1
  vs1r.v      v1,0(sp)
  vl1re32.v   v1,0(sp)
  use         v1
  addi        sp,sp,16
  jr          ra

The tests below passed for this patch.

* The full riscv regression test.

gcc/ChangeLog:

* config/riscv/riscv-opts.h (enum rvv_vector_bits_enum): New enum for
different RVV vector bits.
* config/riscv/riscv.cc (riscv_convert_vector_bits): New func to
get the RVV vector bits, with given min_vlen.
(riscv_convert_vector_chunks): Combine the mrvv-vector-bits
option with min_vlen to RVV vector chunks.
(riscv_override_options_internal): Update comments and rename the
vector chunks.
* config/riscv/riscv.opt: Add option mrvv-vector-bits.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/rvv-vector-bits-1.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-2.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-3.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-4.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-5.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-6.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-opts.h |  8 +
 gcc/config/riscv/riscv.cc | 35 +++
 gcc/config/riscv/riscv.opt| 14 
 .../riscv/rvv/base/rvv-vector-bits-1.c|  7 
 .../riscv/rvv/base/rvv-vector-bits-2.c|  7 
 .../riscv/rvv/base/rvv-vector-bits-3.c|  9 +
 .../riscv/rvv/base/rvv-vector-bits-4.c|  9 +
 .../riscv/rvv/base/rvv-vector-bits-5.c| 17 +
 .../riscv/rvv/base/rvv-vector-bits-6.c| 17 +
 9 files changed, 116 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-6.c

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 4edddbadc37..eefd2f9e01c 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -129,6 +129,14 @@ enum vsetvl_strategy_enum {
   VSETVL_OPT_NO_FUSION,
 };
 
+/* RVV vector bits for option -mrvv-vector-bits
+   zvl indicates take the bits of zvl*b provided by march as vector bits.
+ */
+enum rvv_vector_bits_enum {
+  RVV_VECTOR_BITS_SCALABLE,
+  RVV_VECTOR_BITS_ZVL,
+};
+
 #define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && 
TARGET_64BIT))
 
 /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5e984ee2a55..b6b133210ff 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8801,13 +8801,33 @@ riscv_init_machine_status (void)
   return ggc_cleared_alloc ();
 }
 
-/* Return the VLEN value associated with -march.
+static int
+riscv_convert_vector_bits (int min_vlen)
+{
+  int rvv_bits = 0;
+
+  switch (rvv_vector_bits)
+{
+  case RVV_VECTOR_BITS_ZVL:
+  case RVV_VECTOR_BITS_SCALABLE:
+   rvv_bits = min_vlen;
+   break;
+  default:
+   gcc_unreachable ();
+}
+
+  return rvv_bits;
+}
+
+/* Return the VLEN value associated with -march and -mrvv-vector-bits.
TODO: So far we only support length-agnostic value. */
 static poly_uint16
-riscv_conv

[PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread pan2 . li
From: Pan Li 

This patch introduces a new GCC option for RVV that specifies the
size in bits of one RVV vector register. Valid arguments to
'-mrvv-vector-bits=' are:

* scalable
* zvl

scalable takes the zvl*b in the march as the minimal vlen.
For example, the minimal vlen will be 512 when
march=rv64gcv_zvl512b and mrvv-vector-bits=scalable.

zvl takes the zvl*b in the march as the exact vlen.
For example, the vlen will be exactly 1024 when
march=rv64gcv_zvl1024b and mrvv-vector-bits=zvl.

Given below sample:

void test_rvv_vector_bits ()
{
  vint32m1_t x;
  asm volatile ("def %0": "=vr"(x));
  asm volatile (""::: "v0",   "v1",  "v2",  "v3",  "v4",  "v5",  "v6",  "v7",
  "v8",   "v9", "v10", "v11", "v12", "v13", "v14", "v15",
  "v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23",
  "v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31");
  asm volatile ("use %0": : "vr"(x));
}

With -march=rv64gcv_zvl128b -mrvv-vector-bits=scalable we have (for min_vlen >= 128)
  csrr        t0,vlenb
  sub         sp,sp,t0
  def         v1
  vs1r.v      v1,0(sp)
  vl1re32.v   v1,0(sp)
  use         v1
  csrr        t0,vlenb
  add         sp,sp,t0
  jr          ra

With -march=rv64gcv_zvl128b -mrvv-vector-bits=zvl we have (for vlen = 128)
  addi        sp,sp,-16
  def         v1
  vs1r.v      v1,0(sp)
  vl1re32.v   v1,0(sp)
  use         v1
  addi        sp,sp,16
  jr          ra
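
The two sequences differ only in how the spill slot for one LMUL=1 register is sized. A minimal sketch of that arithmetic, assuming the slot is exactly one vector register; the names are hypothetical, and runtime_vlenb stands for the value read by csrr t0,vlenb:

```c
#include <assert.h>

enum vector_bits_mode
{
  VB_SCALABLE,
  VB_ZVL
};

/* Bytes reserved on the stack for one LMUL=1 vector register.  */
unsigned
m1_spill_bytes (enum vector_bits_mode mode, unsigned zvl_bits,
		unsigned runtime_vlenb)
{
  assert (zvl_bits % 8 == 0);

  /* zvl: the size is a compile-time constant, e.g. 128 / 8 = 16,
     which is the immediate in "addi sp,sp,-16".  */
  if (mode == VB_ZVL)
    return zvl_bits / 8;

  /* scalable: the size is only known at run time from vlenb.  */
  return runtime_vlenb;
}
```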

The tests below passed for this patch.

* The full riscv regression test.

gcc/ChangeLog:

* config/riscv/riscv-opts.h (enum rvv_vector_bits_enum): New enum for
different RVV vector bits.
* config/riscv/riscv.cc (riscv_convert_vector_bits): New func to
get the RVV vector bits, with given min_vlen.
(riscv_convert_vector_chunks): Combine the mrvv-vector-bits
option with min_vlen to RVV vector chunks.
(riscv_override_options_internal): Update comments and rename the
vector chunks.
* config/riscv/riscv.opt: Add option mrvv-vector-bits.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/rvv-vector-bits-1.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-2.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-3.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-4.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-5.c: New test.
* gcc.target/riscv/rvv/base/rvv-vector-bits-6.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-opts.h |  8 +
 gcc/config/riscv/riscv.cc | 35 +++
 gcc/config/riscv/riscv.opt| 14 
 .../riscv/rvv/base/rvv-vector-bits-1.c|  7 
 .../riscv/rvv/base/rvv-vector-bits-2.c|  7 
 .../riscv/rvv/base/rvv-vector-bits-3.c|  9 +
 .../riscv/rvv/base/rvv-vector-bits-4.c|  9 +
 .../riscv/rvv/base/rvv-vector-bits-5.c| 17 +
 .../riscv/rvv/base/rvv-vector-bits-6.c| 17 +
 9 files changed, 116 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/rvv-vector-bits-6.c

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 4edddbadc37..2a311c9d2a3 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -129,6 +129,14 @@ enum vsetvl_strategy_enum {
   VSETVL_OPT_NO_FUSION,
 };
 
+/* RVV vector bits for option -mrvv-vector-bits, default is scalable.  */
+enum rvv_vector_bits_enum {
+  /* scalable indicates taking the value of zvl*b as the minimal vlen.  */
+  RVV_VECTOR_BITS_SCALABLE,
+  /* zvl indicates taking the value of zvl*b as the exactly vlen.  */
+  RVV_VECTOR_BITS_ZVL,
+};
+
 #define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && 
TARGET_64BIT))
 
 /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5e984ee2a55..b6b133210ff 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8801,13 +8801,33 @@ riscv_init_machine_status (void)
   return ggc_cleared_alloc ();
 }
 
-/* Return the VLEN value associated with -march.
+static int
+riscv_convert_vector_bits (int min_vlen)
+{
+  int rvv_bits = 0;
+
+  switch (rvv_vector_bits)
+{
+  case RVV_VECTOR_BITS_ZVL:
+  case RVV_VECTOR_BITS_SCALABLE:
+   rvv_bits = min_vlen;
+   break;
+  default:
+   gcc_unreachable ();
+}
+
+  return rvv_bits;
+}
+
+/* Return the VLEN value associated with -march and -mwrvv-vector-bit

[PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion

2024-01-10 Thread pan2 . li
From: Pan Li 

The insert_var_expansion_initialization function depends on
HONOR_SIGNED_ZEROS and initializes the unrolling variables to
+0.0f instead of -0.0f when -fno-signed-zeros is in effect.
However, we should always keep -0.0f here because:

* -0.0f is always the correct initial value.
* We need to support targets that always honor signed zeros.

Thus, we need to check MODE_HAS_SIGNED_ZEROS instead of
HONOR_SIGNED_ZEROS when initializing.  The target/backend can
then decide whether to honor -fno-signed-zeros or not.
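
A minimal C sketch of why -0.0f, not +0.0f, is the neutral start value
for an accumulator.  It assumes IEEE 754 arithmetic in the default
rounding mode and a build without -ffast-math.  The helper names
(is_minus_zero, accumulate_once, check_zero_identity) are illustrative
only and are not part of the patch:

```c
#include <assert.h>
#include <math.h>

/* Return 1 if X is a negative zero, 0 otherwise.  */
static int
is_minus_zero (float x)
{
  return x == 0.0f && signbit (x) != 0;
}

/* Add X to an accumulator that starts at INIT.  */
static float
accumulate_once (float init, float x)
{
  volatile float acc = init; /* volatile keeps the addition at run time.  */
  acc += x;
  return acc;
}

/* Illustrative self-check of the zero identities.  */
static void
check_zero_identity (void)
{
  /* +0.0f is not neutral: it turns -0.0f into +0.0f.  */
  assert (!is_minus_zero (accumulate_once (+0.0f, -0.0f)));
  /* -0.0f is neutral for both zeros and for ordinary values.  */
  assert (is_minus_zero (accumulate_once (-0.0f, -0.0f)));
  assert (!is_minus_zero (accumulate_once (-0.0f, +0.0f)));
  assert (accumulate_once (-0.0f, 1.5f) == 1.5f);
}
```

Starting from -0.0f therefore never changes the sign of the final sum,
which is why it is a safe initial value whenever the mode has signed
zeros at all.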

The following tests passed for this patch:

* The riscv regression tests.
* The aarch64 regression tests.
* The x86 bootstrap and regression tests.

gcc/ChangeLog:

* loop-unroll.cc (insert_var_expansion_initialization): Leverage
MODE_HAS_SIGNED_ZEROS for expansion variable initialization.

gcc/testsuite/ChangeLog:

* gcc.dg/pr30957-1.c: Adjust test cases for different scenarios.

Signed-off-by: Pan Li 
---
 gcc/loop-unroll.cc   |  4 +--
 gcc/testsuite/gcc.dg/pr30957-1.c | 48 
 2 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc
index 4176a21e308..bfdfe6c2bb7 100644
--- a/gcc/loop-unroll.cc
+++ b/gcc/loop-unroll.cc
@@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
   rtx var, zero_init;
   unsigned i;
   machine_mode mode = GET_MODE (ve->reg);
-  bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode);
+  bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode);
 
   if (ve->var_expansions.length () == 0)
 return;
@@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
 case MINUS:
   FOR_EACH_VEC_ELT (ve->var_expansions, i, var)
 {
- if (honor_signed_zero_p)
+ if (has_signed_zero_p)
zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode);
  else
zero_init = CONST0_RTX (mode);
diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
index 564410913ab..6a9d3d87932 100644
--- a/gcc/testsuite/gcc.dg/pr30957-1.c
+++ b/gcc/testsuite/gcc.dg/pr30957-1.c
@@ -20,16 +20,52 @@ foo (float d, int n)
   return accum;
 }
 
+float __attribute__((noinline))
+get_minus_zero()
+{
+  return 0.0 / -5.0;
+}
+
 int
 main ()
 {
-  /* When compiling standard compliant we expect foo to return -0.0.  But the
- variable expansion during unrolling optimization (for this testcase enabled
- by non-compliant -fassociative-math) instantiates copy(s) of the
- accumulator which it initializes with +0.0.  Hence we expect that foo
- returns +0.0.  */
-  if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
+  /* The variable expansion in unroll requires option unsafe-math-optimizations
+ (aka -fno-signed-zeros, -fno-trapping-math, -fassociative-math
+ and -freciprocal-math).
+
+ A loop like the one above is expanded after unrolling as below:
+
+ accum_1 += d_1;
+ accum_2 += d_2;
+ accum_3 += d_3;
+ ...
+
+ The accum_1, accum_2 and accum_3 need to be initialized.  For
+ floating-point we have
+ +0.0f + -0.0f = +0.0f.
+
+ Thus, we should initialize the accum_* to -0.0 for correctness.  But
+ things become more complicated with no-signed-zeros, as well as with the
+ VLA vectorizer mode, which doesn't trigger variable expansion.  Then we have:
+
+ Case 1: Trigger variable expansion but target doesn't honor no-signed-zero.
+   minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f.
+ Case 2: Trigger variable expansion but target does honor no-signed-zero.
+   minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f.
+ Case 3: No variable expansion but target doesn't honor no-signed-zero.
+   minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f.
+ Case 4: No variable expansion but target does honor no-signed-zero.
+   minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f.
+
+ The test case covers above 4 cases for running.
+ */
+  float minus_zero = get_minus_zero ();
+  float a = __builtin_copysignf (1.0, minus_zero);
+  float b = __builtin_copysignf (1.0, foo (minus_zero, 10));
+
+  if (a != b)
 abort ();
+
   exit (0);
 }
 
-- 
2.34.1



[PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion

2024-01-11 Thread pan2 . li
From: Pan Li 

The insert_var_expansion_initialization function depends on
HONOR_SIGNED_ZEROS and initializes the unrolling variables to
+0.0f instead of -0.0f when -fno-signed-zeros is in effect.
However, we should always keep -0.0f here because:

* -0.0f is always the correct initial value.
* We need to support targets that always honor signed zeros.

Thus, we need to check MODE_HAS_SIGNED_ZEROS instead of
HONOR_SIGNED_ZEROS when initializing.  The target/backend can
then decide whether to honor -fno-signed-zeros or not.
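
The effect of the initial value can be sketched in plain C by splitting
one accumulator into two by hand, roughly the way variable expansion
does during unrolling.  This is an illustration under the assumption of
IEEE 754 arithmetic without -ffast-math; sum_expanded and
check_expansion are hypothetical names, and the real transformation
happens on RTL, not in source code:

```c
#include <assert.h>
#include <math.h>

/* Compute D + N * D with two partial accumulators.  EXTRA_INIT is the
   start value of the accumulator copy that the expansion introduces.  */
static float
sum_expanded (float d, unsigned n, float extra_init)
{
  volatile float acc0 = d;          /* Original accumulator.  */
  volatile float acc1 = extra_init; /* Copy added by the expansion.  */
  unsigned i = 0;

  for (; i + 1 < n; i += 2)
    {
      acc0 += d;
      acc1 += d;
    }
  if (i < n)
    acc0 += d;

  float a = acc0;
  float b = acc1;
  return a + b; /* Final reduction of the partial sums.  */
}

/* Illustrative self-check.  */
static void
check_expansion (void)
{
  /* With -0.0f as the start value, the sign of the result is kept.  */
  assert (signbit (sum_expanded (-0.0f, 10, -0.0f)) != 0);
  /* With +0.0f as the start value, the result becomes +0.0f.  */
  assert (signbit (sum_expanded (-0.0f, 10, +0.0f)) == 0);
  /* Ordinary values are not affected by the choice.  */
  assert (sum_expanded (1.0f, 10, -0.0f) == 11.0f);
}
```

Only the all-negative-zero input distinguishes the two start values,
which matches the scenario the old pr30957-1.c test exercised.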

We also remove the test case pr30957-1.c, because whether its return
value is positive or negative is not well defined.

The following tests passed for this patch:

* The riscv regression tests.
* The aarch64 regression tests.
* The x86 bootstrap and regression tests.

gcc/ChangeLog:

* loop-unroll.cc (insert_var_expansion_initialization): Leverage
MODE_HAS_SIGNED_ZEROS for expansion variable initialization.

gcc/testsuite/ChangeLog:

* gcc.dg/pr30957-1.c: Remove.

Signed-off-by: Pan Li 
---
 gcc/loop-unroll.cc   |  4 ++--
 gcc/testsuite/gcc.dg/pr30957-1.c | 36 
 2 files changed, 2 insertions(+), 38 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.dg/pr30957-1.c

diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc
index 4176a21e308..bfdfe6c2bb7 100644
--- a/gcc/loop-unroll.cc
+++ b/gcc/loop-unroll.cc
@@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
   rtx var, zero_init;
   unsigned i;
   machine_mode mode = GET_MODE (ve->reg);
-  bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode);
+  bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode);
 
   if (ve->var_expansions.length () == 0)
 return;
@@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
 case MINUS:
   FOR_EACH_VEC_ELT (ve->var_expansions, i, var)
 {
- if (honor_signed_zero_p)
+ if (has_signed_zero_p)
zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode);
  else
zero_init = CONST0_RTX (mode);
diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
deleted file mode 100644
index 564410913ab..000
--- a/gcc/testsuite/gcc.dg/pr30957-1.c
+++ /dev/null
@@ -1,36 +0,0 @@
-/* { dg-do run { xfail { mmix-*-* } } } */
-/* We don't (and don't want to) perform this optimisation on soft-float targets,
-   where each addition is a library call.  */
-/* { dg-require-effective-target hard_float } */
-/* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */
-/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */
-
-extern void abort (void);
-extern void exit (int);
-
-float __attribute__((noinline))
-foo (float d, int n)
-{
-  unsigned i;
-  float accum = d;
-
-  for (i = 0; i < n; i++)
-accum += d;
-
-  return accum;
-}
-
-int
-main ()
-{
-  /* When compiling standard compliant we expect foo to return -0.0.  But the
- variable expansion during unrolling optimization (for this testcase enabled
- by non-compliant -fassociative-math) instantiates copy(s) of the
- accumulator which it initializes with +0.0.  Hence we expect that foo
- returns +0.0.  */
-  if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
-abort ();
-  exit (0);
-}
-
-/* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */
-- 
2.34.1



[PATCH v1] RISC-V: Update the comments of riscv_v_ext_mode_p [NFC]

2024-01-11 Thread pan2 . li
From: Pan Li 

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_v_ext_mode_p): Update the
comments of predicate func riscv_v_ext_mode_p.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index df9799d9c5e..f829014a589 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1361,7 +1361,10 @@ riscv_v_ext_vls_mode_p (machine_mode mode)
   return false;
 }
 
-/* Return true if it is either RVV vector mode or RVV tuple mode.  */
+/* Return true if it is either of below modes.
+   1. RVV vector mode.
+   2. RVV tuple mode.
+   3. RVV vls mode.  */
 
 static bool
 riscv_v_ext_mode_p (machine_mode mode)
-- 
2.34.1



[PATCH v1] RISC-V: Fix asm checks regression due to recent middle-end change

2024-01-17 Thread pan2 . li
From: Pan Li 

The recent middle-end change results in some asm check failures.
This patch fixes the asm checks by adjusting the expected counts.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/shift-1.c: Fix asm check
count.
* gcc.target/riscv/rvv/autovec/vls/shift-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/shift-3.c: Ditto.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-1.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-2.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-1.c
index e57a0b6bdf3..cb5a1dbc9ff 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-1.c
@@ -53,5 +53,5 @@ DEF_OP_VV (shift, 128, int64_t, >>)
 DEF_OP_VV (shift, 256, int64_t, >>)
 DEF_OP_VV (shift, 512, int64_t, >>)
 
-/* { dg-final { scan-assembler-times {vsra\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 39 } } */
+/* { dg-final { scan-assembler-times {vsra\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 42 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-2.c
index 9d1fa64232c..e626a52c2d8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-2.c
@@ -53,5 +53,5 @@ DEF_OP_VV (shift, 128, uint64_t, >>)
 DEF_OP_VV (shift, 256, uint64_t, >>)
 DEF_OP_VV (shift, 512, uint64_t, >>)
 
-/* { dg-final { scan-assembler-times {vsrl\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 39 } } */
+/* { dg-final { scan-assembler-times {vsrl\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 42 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
index 8de1b9c0c41..244bee02e55 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
@@ -53,5 +53,5 @@ DEF_OP_VV (shift, 128, int64_t, <<)
 DEF_OP_VV (shift, 256, int64_t, <<)
 DEF_OP_VV (shift, 512, int64_t, <<)
 
-/* { dg-final { scan-assembler-times {vsll\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 46 } } */
+/* { dg-final { scan-assembler-times {vsll\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 47 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
-- 
2.34.1



[PATCH v1] RISC-V: Bugfix for vls integer mode calling convention

2024-01-23 Thread pan2 . li
From: Pan Li 

According to the issue below:

https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/416

when the size of a vls integer mode does not exceed 2 * XLEN, we will
pass both the args and the return values in gpr/fpr instead of by
reference.  For example, given the code below:

typedef short v8hi __attribute__ ((vector_size (16)));

v8hi __attribute__((noinline))
add (v8hi a, v8hi b)
{
  v8hi r = a + b;
  return r;
}

Before this patch:
add:
  vsetivli zero,8,e16,m1,ta,ma
  vle16.v  v1,0(a1) <== arg by reference
  vle16.v  v2,0(a2) <== arg by reference
  vadd.vv  v1,v1,v2
  vse16.v  v1,0(a0) <== return by reference
  ret

After this patch:
add:
  addi sp,sp,-32
  sd   a0,0(sp)  <== arg by register a0 - a3
  sd   a1,8(sp)
  sd   a2,16(sp)
  sd   a3,24(sp)
  addi a5,sp,16
  vsetivli zero,8,e16,m1,ta,ma
  vle16.v  v2,0(sp)
  vle16.v  v1,0(a5)
  vadd.vv  v1,v1,v2
  vse16.v  v1,0(sp)
  ld   a0,0(sp)  <== return by a0 - a1.
  ld   a1,8(sp)
  addi sp,sp,32
  jr   ra

For vls floating point, things get more complicated.  We follow
the rules below.

1. Vls element count <= 2 and vls size <= 2 * xlen, go fpr.
2. Vls size <= 2 * xlen, go gpr.
3. Vls size > 2 * xlen, go reference.

One exception is V2DF mode: we treat the vls mode as an aggregate and
would have TFmode here.  Unfortunately, emit_move_multi_word cannot take
care of TFmode elegantly, so we go to gpr for V2DF mode.
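
The three rules and the V2DF exception can be sketched as a small
classifier.  This is only a reading of the rules as stated above, not
the code in riscv.cc; the enum, the function classify_vls_float and its
parameters are hypothetical names chosen for the illustration:

```c
#include <assert.h>

enum vls_pass_kind
{
  VLS_PASS_FPR,
  VLS_PASS_GPR,
  VLS_PASS_REFERENCE
};

/* Classify a floating-point vls mode with ELT_COUNT elements of
   ELT_BITS bits each on a target whose XLEN is XLEN_BITS.  */
static enum vls_pass_kind
classify_vls_float (unsigned elt_count, unsigned elt_bits,
                    unsigned xlen_bits)
{
  unsigned size_bits = elt_count * elt_bits;

  /* Rule 3: too large for two registers, pass by reference.  */
  if (size_bits > 2 * xlen_bits)
    return VLS_PASS_REFERENCE;

  /* Exception: V2DF (two 64-bit doubles) goes to gpr.  */
  if (elt_count == 2 && elt_bits == 64)
    return VLS_PASS_GPR;

  /* Rule 1: at most two elements, go fpr.  */
  if (elt_count <= 2)
    return VLS_PASS_FPR;

  /* Rule 2: everything else that fits, go gpr.  */
  return VLS_PASS_GPR;
}

/* Illustrative self-check for XLEN = 64.  */
static void
check_classify (void)
{
  assert (classify_vls_float (2, 32, 64) == VLS_PASS_FPR);       /* V2SF */
  assert (classify_vls_float (4, 16, 64) == VLS_PASS_GPR);       /* V4HF */
  assert (classify_vls_float (2, 64, 64) == VLS_PASS_GPR);       /* V2DF */
  assert (classify_vls_float (4, 64, 64) == VLS_PASS_REFERENCE); /* V4DF */
}
```

The order of the checks matters: the size limit is tested first, so the
V2DF exception only applies when the value fits in two registers.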

The riscv regression passed for this patch.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_v_ext_vector_or_tuple_mode_p):
New predicate function for vector or tuple vector.
(riscv_v_vls_mode_aggregate_reg_count): New function to
calculate the gpr/fpr count required by vls mode.
(riscv_gpr_unit_size): New function to get gpr in bytes.
(riscv_fpr_unit_size): New function to get fpr in bytes.
(riscv_v_vls_to_gpr_mode): New function to convert vls mode to gpr mode.
(riscv_v_vls_to_fpr_mode): New function to convert vls mode to fpr mode.
(riscv_pass_vls_aggregate_in_gpr_or_fpr): New function to return
the rtx of gpr/fpr for vls mode.
(riscv_mode_pass_by_reference_p): New predicate function to
indicate the mode will be passed by reference or not.
(riscv_get_arg_info): Add vls mode handling.
(riscv_pass_by_reference): Return false if arg info has a non-zero
gpr count.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add helper macros.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-9.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-6.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 185 +-
 .../rvv/autovec/vls/calling-convention-1.c| 154 +++
 .../rvv/autovec/vls/calling-convention-10.c   |  51 +
 .../rvv/autovec/vls/calling-convention-2.c| 142 ++
 .../rvv/autovec/vls/calling-convention-3.c| 130 
 .../rvv/autovec/vls/calling-convention-4.c| 118 +++
 .../rvv/autovec/vls/calling-convention-5.c| 141 +
 .../rvv/autovec/vls/calling-convention-6.c| 129 
 .../rvv/autovec/vls/calling-convention-7.c| 120 
 .../rvv/autovec/vls/calling-convention-8.c|  43 
 .../rvv/autovec/vls/calling-convention-9.c|  51 +
 .../autovec/vls/calling-convention-run-1.c|  55 ++
 .../autovec/vls/calling-convention-run-2.c|  55 ++
 .../autovec/vls/calling-convention-run-3.c|  55 ++
 .../autovec/vls/calling-convention-run-4.c|  55 ++
 .../autovec/vls/calling-convention-run-5.c|  55 ++
 .../autovec/vls/calling-convention-run-6.c|  55 ++
 .../gcc.target/riscv/rvv/autovec/vls/def.h|  74 +++
 18 files changed, 1665 insertions(+), 3 de

[PATCH v2] RISC-V: Bugfix for vls mode aggregated in GPR calling convention

2024-01-30 Thread pan2 . li
From: Pan Li 

According to the issue below:

https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/416

when the size of a vls integer mode does not exceed 2 * XLEN, we will
pass both the args and the return values in gpr instead of by
reference.  For example, given the code below:

typedef short v8hi __attribute__ ((vector_size (16)));

v8hi __attribute__((noinline))
add (v8hi a, v8hi b)
{
  v8hi r = a + b;
  return r;
}

Before this patch:
add:
  vsetivli zero,8,e16,m1,ta,ma
  vle16.v  v1,0(a1) <== arg by reference
  vle16.v  v2,0(a2) <== arg by reference
  vadd.vv  v1,v1,v2
  vse16.v  v1,0(a0) <== return by reference
  ret

After this patch:
add:
  addi sp,sp,-32
  sd   a0,0(sp)  <== arg by register a0 - a3
  sd   a1,8(sp)
  sd   a2,16(sp)
  sd   a3,24(sp)
  addi a5,sp,16
  vsetivli zero,8,e16,m1,ta,ma
  vle16.v  v2,0(sp)
  vle16.v  v1,0(a5)
  vadd.vv  v1,v1,v2
  vse16.v  v1,0(sp)
  ld   a0,0(sp)  <== return by a0 - a1.
  ld   a1,8(sp)
  addi sp,sp,32
  jr   ra

For vls floating point, we apply the same rules as for integer, so
values are passed in gpr or by reference.  The above code could be
simplified with vmv to avoid the stack reads and writes.  We will
prepare another patch for that, as it is outside the scope of this bugfix.
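
As a rough illustration of the rule above, the number of gprs a vls
value occupies can be derived from its size.  The sketch below assumes
the limit is inclusive (a 16-byte v8hi on rv64 uses two gprs, as in the
assembly above); vls_gpr_count is a hypothetical helper and not the
function added to riscv.cc:

```c
#include <assert.h>

/* Return the number of gprs used to pass a vls value of SIZE_BYTES
   on a target with XLEN_BYTES-wide gprs, or 0 when the value is
   passed by reference.  */
static unsigned
vls_gpr_count (unsigned size_bytes, unsigned xlen_bytes)
{
  if (size_bytes > 2 * xlen_bytes)
    return 0; /* Too large: pass by reference.  */

  /* Round up to whole registers.  */
  return (size_bytes + xlen_bytes - 1) / xlen_bytes;
}

/* Illustrative self-check for rv64 (8-byte gprs).  */
static void
check_gpr_count (void)
{
  assert (vls_gpr_count (16, 8) == 2); /* v8hi: a0-a1 or a2-a3.  */
  assert (vls_gpr_count (8, 8) == 1);
  assert (vls_gpr_count (4, 8) == 1);
  assert (vls_gpr_count (32, 8) == 0); /* By reference.  */
}
```

For the add example, each v8hi argument takes two gprs (a0-a1 and
a2-a3) and the result comes back in a0-a1.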

The riscv regression passed for this patch.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_v_vls_mode_aggregate_gpr_count): New
function to calculate the gpr count required by vls mode.
(riscv_v_vls_to_gpr_mode): New function to convert vls mode to gpr mode.
(riscv_pass_vls_aggregate_in_gpr): New function to return the rtx of gpr
for vls mode.
(riscv_get_arg_info): Add vls mode handling.
(riscv_pass_by_reference): Return false if arg info has a non-zero gpr
count.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add new helper macro.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-9.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-6.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc |  75 +
 .../rvv/autovec/vls/calling-convention-1.c| 154 ++
 .../rvv/autovec/vls/calling-convention-10.c   |  51 ++
 .../rvv/autovec/vls/calling-convention-2.c| 142 
 .../rvv/autovec/vls/calling-convention-3.c| 130 +++
 .../rvv/autovec/vls/calling-convention-4.c| 118 ++
 .../rvv/autovec/vls/calling-convention-5.c| 141 
 .../rvv/autovec/vls/calling-convention-6.c| 129 +++
 .../rvv/autovec/vls/calling-convention-7.c| 118 ++
 .../rvv/autovec/vls/calling-convention-8.c|  43 +
 .../rvv/autovec/vls/calling-convention-9.c|  51 ++
 .../autovec/vls/calling-convention-run-1.c|  55 +++
 .../autovec/vls/calling-convention-run-2.c|  55 +++
 .../autovec/vls/calling-convention-run-3.c|  55 +++
 .../autovec/vls/calling-convention-run-4.c|  55 +++
 .../autovec/vls/calling-convention-run-5.c|  55 +++
 .../autovec/vls/calling-convention-run-6.c|  55 +++
 .../gcc.target/riscv/rvv/autovec/vls/def.h|  74 +
 18 files changed, 1556 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autove

[PATCH v1] RISC-V: Cleanup the comments for the psabi

2024-01-30 Thread pan2 . li
From: Pan Li 

This patch cleans up some comments which are out of date or
incorrect.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_get_arg_info): Cleanup comments.
(riscv_pass_by_reference): Ditto.
(riscv_fntype_abi): Ditto.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 529ef5e84b7..7713ad26c8d 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5067,8 +5067,7 @@ riscv_get_arg_info (struct riscv_arg_info *info, const CUMULATIVE_ARGS *cum,
   info->gpr_offset = cum->num_gprs;
   info->fpr_offset = cum->num_fprs;
 
-  /* When disable vector_abi or scalable vector argument is anonymous, this
- argument is passed by reference.  */
+  /* Passed by reference when the scalable vector argument is anonymous.  */
   if (riscv_v_ext_mode_p (mode) && !named)
 return NULL_RTX;
 
@@ -5265,8 +5264,9 @@ riscv_pass_by_reference (cumulative_args_t cum_v, const function_arg_info &arg)
  so we can avoid the call to riscv_get_arg_info in this case.  */
   if (cum != NULL)
 {
-  /* Don't pass by reference if we can use a floating-point register.  */
   riscv_get_arg_info (&info, cum, arg.mode, arg.type, arg.named, false);
+
+  /* Don't pass by reference if we can use a floating-point register.  */
   if (info.num_fprs)
return false;
 
@@ -5279,9 +5279,9 @@ riscv_pass_by_reference (cumulative_args_t cum_v, const function_arg_info &arg)
return false;
 }
 
-  /* When vector abi disabled(without --param=riscv-vector-abi option) or
- scalable vector argument is anonymous or cannot be passed through vector
- registers, this argument is passed by reference. */
+  /* Passed by reference when:
+ 1. The scalable vector argument is anonymous.
+ 2. Args cannot be passed through vector registers.  */
   if (riscv_v_ext_mode_p (arg.mode))
 return true;
 
@@ -5392,12 +5392,9 @@ riscv_arguments_is_vector_type_p (const_tree fntype)
 static const predefined_function_abi &
 riscv_fntype_abi (const_tree fntype)
 {
-  /* Implementing an experimental vector calling convention, the proposal
- can be viewed at the bellow link:
-   https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/389
-
- You can enable this feature via the `--param=riscv-vector-abi` compiler
- option.  */
+  /* Implement the vector calling convention.  For more details please
+ reference the below link.
+ https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/389  */
   if (riscv_return_value_is_vector_type_p (fntype)
  || riscv_arguments_is_vector_type_p (fntype))
 return riscv_v_abi ();
-- 
2.34.1



[PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread pan2 . li
From: Pan Li 

Refine the test cases for:

* Name convention.
* Add run case.

PR target/112929
PR target/112988

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr112929.c: Moved to...
* gcc.target/riscv/rvv/vsetvl/pr112929-1.c: ...here.
* gcc.target/riscv/rvv/vsetvl/pr112988.c: Moved to...
* gcc.target/riscv/rvv/vsetvl/pr112988-1.c: ...here.
* gcc.target/riscv/rvv/vsetvl/pr112929-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr112988-2.c: New test.

Signed-off-by: Pan Li 
---
 .../rvv/vsetvl/{pr112929.c => pr112929-1.c}   |  0
 .../gcc.target/riscv/rvv/vsetvl/pr112929-2.c  | 57 +++
 .../rvv/vsetvl/{pr112988.c => pr112988-1.c}   |  0
 .../gcc.target/riscv/rvv/vsetvl/pr112988-2.c  | 53 +
 4 files changed, 110 insertions(+)
 rename gcc/testsuite/gcc.target/riscv/rvv/vsetvl/{pr112929.c => pr112929-1.c} (100%)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-2.c
 rename gcc/testsuite/gcc.target/riscv/rvv/vsetvl/{pr112988.c => pr112988-1.c} (100%)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-2.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-1.c
similarity index 100%
rename from gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929.c
rename to gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-1.c
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-2.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-2.c
new file mode 100644
index 000..f2022026639
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112929-2.c
@@ -0,0 +1,57 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -fno-vect-cost-model" } */
+
+int printf(char *, ...);
+int a, l, i, p, q, t, n, o;
+int *volatile c;
+static int j;
+static struct pack_1_struct d;
+long e;
+char m = 5;
+short s;
+
+#pragma pack(1)
+struct pack_1_struct {
+  long c;
+  int d;
+  int e;
+  int f;
+  int g;
+  int h;
+  int i;
+} h, r = {1}, *f = &h, *volatile g;
+
+void add_em_up(int count, ...) {
+  __builtin_va_list ap;
+  __builtin_va_start(ap, count);
+  __builtin_va_end(ap);
+}
+
+int main() {
+  int u;
+  j = 0;
+
+  for (; j < 9; ++j) {
+u = ++t ? a : 0;
+if (u) {
+  int *v = &d.d;
+  *v = g || e;
+  *c = 0;
+  *f = h;
+}
+s = l && c;
+o = i;
+d.f || (p = 0);
+q |= n;
+  }
+
+  r = *f;
+
+  add_em_up(1, 1);
+  printf("%d\n", m);
+
+  if (m != 5)
+__builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-1.c
similarity index 100%
rename from gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988.c
rename to gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-1.c
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-2.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-2.c
new file mode 100644
index 000..e952b85b630
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112988-2.c
@@ -0,0 +1,53 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -fno-vect-cost-model" } */
+
+int a = 0;
+int p, q, r, x = 230;
+short d;
+int e[256];
+static struct f w;
+int *c = &r;
+
+short y(short z) {
+  return z * d;
+}
+
+#pragma pack(1)
+struct f {
+  int g;
+  short h;
+  int j;
+  char k;
+  char l;
+  long m;
+  long n;
+  int o;
+} s = {1}, v, t, *u = &v, *b = &s;
+
+void add_em_up(int count, ...) {
+  __builtin_va_list ap;
+  __builtin_va_start(ap, count);
+  __builtin_va_end(ap);
+}
+
+int main() {
+  int i = 0;
+  for (; i < 256; i++)
+e[i] = i;
+
+  p = 0;
+  for (; p <= 0; p++) {
+*c = 4;
+*u = t;
+x |= y(6 >= q);
+  }
+
+  *b = w;
+
+  add_em_up(1, 1);
+
+  if (a != 0 || q != 0 || p != 1 || r != 4 || x != 0xE6 || d != 0)
+__builtin_abort ();
+
+  return 0;
+}
-- 
2.34.1



[PATCH v1] RISC-V: Fix POLY INT handle bug

2023-12-17 Thread pan2 . li
From: Pan Li 

This patch fixes the following FAIL:
Running target
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
FAIL: gcc.dg/vect/fast-math-vect-complex-3.c execution test

The root cause is that we generate incorrect code for
(const_poly_int:DI [549755813888, 549755813888]).

Before this patch:

li      a7,0
vmv.v.x v0,a7

After this patch:

csrr    a2,vlenb
slli    a2,a2,33
vmv.v.x v0,a2
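
The arithmetic behind the two sequences can be checked with a short C
sketch.  It assumes a 32-bit int with the usual wrap-around on
narrowing, and a vlenb coefficient of 64 bytes (zvl512b, as in the new
test's options); truncate_to_32 and shift_for_factor are hypothetical
helpers, not functions from riscv.cc:

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of V, which is what storing the
   coefficient in a 32-bit int effectively did.  */
static int32_t
truncate_to_32 (int64_t v)
{
  return (int32_t) (uint32_t) ((uint64_t) v & 0xffffffffu);
}

/* Return K such that (VLENB << K) == FACTOR, or -1 if there is
   no such shift count.  */
static int
shift_for_factor (int64_t factor, int64_t vlenb)
{
  if (vlenb <= 0 || factor <= 0 || factor % vlenb != 0)
    return -1;

  int64_t q = factor / vlenb;
  int k = 0;
  while ((q & 1) == 0)
    {
      q >>= 1;
      k++;
    }
  return q == 1 ? k : -1;
}

/* Illustrative self-check.  */
static void
check_poly_factor (void)
{
  /* 549755813888 is 2^39, so its low 32 bits are all zero: the
     old code saw a factor of 0 and emitted "li a7,0".  */
  assert (truncate_to_32 (549755813888LL) == 0);
  /* With the full 64-bit value, 2^39 / 64 = 2^33: "slli a2,a2,33".  */
  assert (shift_for_factor (549755813888LL, 64) == 33);
}
```

This is why widening offset and factor to HOST_WIDE_INT is enough to
turn the zero constant into the csrr/slli pair.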

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_expand_mult_with_const_int):
Change int into HOST_WIDE_INT.
(riscv_legitimize_poly_move): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/bug-3.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 10 +++--
 .../gcc.target/riscv/rvv/autovec/bug-3.c  | 39 +++
 2 files changed, 45 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index f60726711e8..3fef1ab1514 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2371,7 +2371,7 @@ riscv_expand_op (enum rtx_code code, machine_mode mode, rtx op0, rtx op1,
 
 static void
 riscv_expand_mult_with_const_int (machine_mode mode, rtx dest, rtx multiplicand,
- int multiplier)
+ HOST_WIDE_INT multiplier)
 {
   if (multiplier == 0)
 {
@@ -2380,7 +2380,7 @@ riscv_expand_mult_with_const_int (machine_mode mode, rtx dest, rtx multiplicand,
 }
 
   bool neg_p = multiplier < 0;
-  int multiplier_abs = abs (multiplier);
+  unsigned HOST_WIDE_INT multiplier_abs = abs (multiplier);
 
   if (multiplier_abs == 1)
 {
@@ -2475,8 +2475,10 @@ void
 riscv_legitimize_poly_move (machine_mode mode, rtx dest, rtx tmp, rtx src)
 {
   poly_int64 value = rtx_to_poly_int64 (src);
-  int offset = value.coeffs[0];
-  int factor = value.coeffs[1];
+  /* Use HOST_WIDE_INT instead of int since a 32-bit type is not enough
+ for e.g. (const_poly_int:DI [549755813888, 549755813888]).  */
+  HOST_WIDE_INT offset = value.coeffs[0];
+  HOST_WIDE_INT factor = value.coeffs[1];
   int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1];
   int div_factor = 0;
   /* Calculate (const_poly_int:MODE [m, n]) using scalar instructions.
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c
new file mode 100644
index 000..643e91b918e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=scalable -fno-vect-cost-model -O2 -ffast-math" } */
+
+#define N 16
+
+_Complex float a[N] =
+{ 10.0F + 20.0iF, 11.0F + 21.0iF, 12.0F + 22.0iF, 13.0F + 23.0iF,
+  14.0F + 24.0iF, 15.0F + 25.0iF, 16.0F + 26.0iF, 17.0F + 27.0iF,
+  18.0F + 28.0iF, 19.0F + 29.0iF, 20.0F + 30.0iF, 21.0F + 31.0iF,
+  22.0F + 32.0iF, 23.0F + 33.0iF, 24.0F + 34.0iF, 25.0F + 35.0iF };
+_Complex float b[N] =
+{ 30.0F + 40.0iF, 31.0F + 41.0iF, 32.0F + 42.0iF, 33.0F + 43.0iF,
+  34.0F + 44.0iF, 35.0F + 45.0iF, 36.0F + 46.0iF, 37.0F + 47.0iF,
+  38.0F + 48.0iF, 39.0F + 49.0iF, 40.0F + 50.0iF, 41.0F + 51.0iF,
+  42.0F + 52.0iF, 43.0F + 53.0iF, 44.0F + 54.0iF, 45.0F + 55.0iF };
+
+_Complex float c[N];
+_Complex float res[N] =
+{ -500.0F + 1000.0iF, -520.0F + 1102.0iF,
+  -540.0F + 1208.0iF, -560.0F + 1318.0iF,
+  -580.0F + 1432.0iF, -600.0F + 1550.0iF,
+  -620.0F + 1672.0iF, -640.0F + 1798.0iF,
+  -660.0F + 1928.0iF, -680.0F + 2062.0iF,
+  -700.0F + 2200.0iF, -720.0F + 2342.0iF,
+  -740.0F + 2488.0iF, -760.0F + 2638.0iF,
+  -780.0F + 2792.0iF, -800.0F + 2950.0iF };
+
+
+void
+foo (void)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+c[i] = a[i] * b[i];
+}
+
+/* { dg-final { scan-assembler-not {li\s+[a-x0-9]+,\s*0} } } */
+/* { dg-final { scan-assembler-times {slli\s+[a-x0-9]+,\s*[a-x0-9]+,\s*33} 1 } } */
-- 
2.34.1



[PATCH v1] RISC-V: Bugfix for the RVV const vector

2023-12-17 Thread pan2 . li
From: Pan Li 

This patch fixes a bug in the const vector expansion for interleave.
Assume we need to generate an interleaved const vector like below.

V = {4, -4, 3, -3, 2, -2, 1, -1,}

Before this patch:
vsetvli a3, zero, e64, m8, ta, ma
vid.v   v8                v8 =  {0, 1, 2, 3, 4}
li      a6, -1
vmul.vx v8, v8, a6        v8 =  {-0, -1, -2, -3, -4}
vadd.vi v24, v8, 4        v24 = { 4,  3,  2,  1,  0}
vadd.vi v8, v8, -4        v8 =  {-4, -5, -6, -7, -8}
li      a6, 32
vsll.vx v8, v8, a6        v8 =  {0, -4, 0, -5, 0, -6, 0, -7,} for e32
vor.vv  v24, v24, v8      v24 = {4, -4, 3, -5, 2, -6, 1, -7,} for e32

After this patch:
vsetvli a6,zero,e64,m8,ta,ma
vid.v   v8                v8 =  {0, 1, 2, 3, 4}
li      a7,-1
vmul.vx v16,v8,a7         v16 = {-0, -1, -2, -3, -4}
vadd.vi v16,v16,4         v16 = { 4,  3,  2,  1, 0}
vadd.vi v8,v8,-4          v8 =  {-4, -3, -2, -1, 0}
li      a7,32
vsll.vx v8,v8,a7          v8 =  {0, -4, 0, -3, 0, -2,} for e32
vor.vv  v16,v16,v8        v16 = {4, -4, 3, -3, 2, -2,} for e32
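
The interleave can be modelled in scalar C to see the effect of using
step1 for the second series.  This is a sketch, not the code in
riscv-v.cc: build_interleaved, low_half and high_half are hypothetical
helpers, the masking of the low half is added for safety and does not
appear in the assembly above, and the narrowing casts assume the usual
two's complement wrap-around:

```c
#include <assert.h>
#include <stdint.h>

#define NELT 4

/* Build NELT 64-bit elements.  The low 32 bits hold the series
   BASE1 + STEP1 * i, the high 32 bits hold BASE2 + STEP2 * i.  */
static void
build_interleaved (int64_t base1, int64_t step1,
                   int64_t base2, int64_t step2, uint64_t out[NELT])
{
  for (int i = 0; i < NELT; i++)
    {
      int64_t lo = base1 + step1 * i; /* vid.v, vmul.vx, vadd.vi 4.  */
      int64_t hi = base2 + step2 * i; /* vid.v, vadd.vi -4.  */
      /* vsll.vx by 32, then vor.vv.  */
      out[i] = ((uint64_t) hi << 32) | ((uint64_t) lo & 0xffffffffu);
    }
}

static int32_t
low_half (uint64_t v)
{
  return (int32_t) (uint32_t) v;
}

static int32_t
high_half (uint64_t v)
{
  return (int32_t) (uint32_t) (v >> 32);
}

/* Illustrative self-check.  */
static void
check_interleave (void)
{
  uint64_t fixed[NELT], buggy[NELT];

  build_interleaved (4, -1, -4, 1, fixed);  /* step2 for series 2.  */
  build_interleaved (4, -1, -4, -1, buggy); /* step1 reused by mistake.  */

  assert (low_half (fixed[1]) == 3 && high_half (fixed[1]) == -3);
  assert (low_half (buggy[1]) == 3 && high_half (buggy[1]) == -5);
}
```

Read as 32-bit pairs, the fixed result is {4, -4, 3, -3, 2, -2, 1, -1},
while reusing step1 gives {4, -4, 3, -5, 2, -6, 1, -7}, matching the
two listings above.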

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Take step2
instead of step1 for second series.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/const-vector-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc   |  2 +-
 .../riscv/rvv/autovec/const-vector-0.c| 39 +++
 2 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index eade8db4cf1..d1eb7a0a9a5 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1331,7 +1331,7 @@ expand_const_vector (rtx target, rtx src)
  rtx tmp2 = gen_reg_rtx (new_mode);
  base2 = gen_int_mode (rtx_to_poly_int64 (base2), new_smode);
  expand_vec_series (tmp2, base2,
-gen_int_mode (step1, new_smode));
+gen_int_mode (step2, new_smode));
  rtx shifted_tmp2 = expand_simple_binop (
new_mode, ASHIFT, tmp2,
gen_int_mode (builder.inner_bits_size (), Pmode), NULL_RTX,
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c
new file mode 100644
index 000..4f83121c663
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d --param=riscv-autovec-lmul=m8 -ftree-vectorize -fno-vect-cost-model -O3 -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#define N 4
+struct C { int r, i; };
+
+/*
+** init_struct_data:
+** ...
+** vsetivli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m8,\s*ta,\s*ma
+** vid\.v\s+v8
+** li\s+[atx][0-9]+,\s*-1
+** vmul\.vx\s+v16,\s*v8,\s*[atx][0-9]+
+** vadd\.vi\s+v16,\s*v16,\s*4
+** vadd\.vi\s+v8,\s*v8,\s*-4
+** li\s+[axt][0-9]+,32
+** vsll\.vx\s+v8,\s*v8,\s*[atx][0-9]+
+** vor\.vv\s+v16,\s*v16,\s*v8
+** ...
+*/
+void
+init_struct_data (struct C * __restrict a, struct C * __restrict b,
+ struct C * __restrict c)
+{
+  int i;
+
+  for (i = 0; i < N; ++i)
+{
+  a[i].r = N - i;
+  a[i].i = i - N;
+
+  b[i].r = i - N;
+  b[i].i = i + N;
+
+  c[i].r = -1 - i;
+  c[i].i = 2 * N - 1 - i;
+}
+}
-- 
2.34.1



[PATCH v2] RISC-V: Bugfix for the RVV const vector

2023-12-17 Thread pan2 . li
From: Pan Li 

This patch fixes a bug in const vector generation for the interleave case.
Assume we need to generate an interleaved const vector like the one below.

 V = {4, -4, 3, -3, 2, -2, 1, -1}

Before this patch:
vsetvli a3, zero, e64, m8, ta, ma
vid.v   v8              v8 =  {0, 1, 2, 3, 4}
li      a6, -1
vmul.vx v8, v8, a6      v8 =  {-0, -1, -2, -3, -4}
vadd.vi v24, v8, 4      v24 = { 4,  3,  2,  1,  0}
vadd.vi v8, v8, -4      v8 =  {-4, -5, -6, -7, -8}
li      a6, 32
vsll.vx v8, v8, a6      v8 =  {0, -4, 0, -5, 0, -6, 0, -7,} for e32
vor.vv  v24, v24, v8    v24 = {4, -4, 3, -5, 2, -6, 1, -7,} for e32

After this patch:
vsetvli a6,zero,e64,m8,ta,ma
vid.v   v8              v8 =  {0, 1, 2, 3, 4}
li      a7,-1
vmul.vx v16,v8,a7       v16 = {-0, -1, -2, -3, -4}
vadd.vi v16,v16,4       v16 = { 4,  3,  2,  1, 0}
vadd.vi v8,v8,-4        v8 =  {-4, -3, -2, -1, 0}
li      a7,32
vsll.vx v8,v8,a7        v8 =  {0, -4, 0, -3, 0, -2,} for e32
vor.vv  v16,v16,v8      v16 = {4, -4, 3, -3, 2, -2,} for e32

It is not easy to add an asm check that is stable enough for this case,
because we need to check that the source of the vadd -4 comes from the vid
output, which is defined 4 instructions earlier. Thus no test is added here;
the fix will be covered by gcc.dg/vect/pr92420.c in the subsequent patches.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Take step2
instead of step1 for second series.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index eade8db4cf1..d1eb7a0a9a5 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1331,7 +1331,7 @@ expand_const_vector (rtx target, rtx src)
  rtx tmp2 = gen_reg_rtx (new_mode);
  base2 = gen_int_mode (rtx_to_poly_int64 (base2), new_smode);
  expand_vec_series (tmp2, base2,
-gen_int_mode (step1, new_smode));
+gen_int_mode (step2, new_smode));
  rtx shifted_tmp2 = expand_simple_binop (
new_mode, ASHIFT, tmp2,
gen_int_mode (builder.inner_bits_size (), Pmode), NULL_RTX,
-- 
2.34.1



[PATCH v1] RISC-V: Bugfix for the const vector in single steps

2023-12-19 Thread pan2 . li
From: Pan Li 

When generating a const vector with a single step, we have code gen
similar to the below.  Take npatterns = 4 as an example.

v1        = {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8... }

v2 (diff) = {3 - 0, 2 - 1, 1 - 2, 0 - 3, 7 - 4, 6 - 5, 5 - 6, 4 - 7...}
          = {3, 1, -1, -3, 3, 1, -1, -3 ...}

v1 = v2 + vid.

But this requires the diff to repeat with a period of npatterns, like
{3, 1, -1, -3} above. It cannot handle a single step vector like below:

{ -4, 4, -4 + 1, 4 + 1, -4 + 2, 4 + 2, -4 + 3, 4 + 3, ... }

This patch adds that restriction to the above code gen and implements
one for the general case.
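
The restriction can be checked on plain arrays. Below is a minimal C sketch,
assuming a hypothetical helper named diff_repeats_p (it is not a function in
riscv-v.cc) that tests whether elt[i] - i repeats with a period of npatterns
over the elements it is given.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true when the difference between ELT and the vid sequence
   { 0, 1, 2, ... } repeats every NPATTERNS elements over the first N
   elements, i.e. when "vid + diff" with an NPATTERNS-sized diff can
   rebuild ELT.  */
bool
diff_repeats_p (const long *elt, size_t n, size_t npatterns)
{
  for (size_t i = npatterns; i < n; i++)
    {
      long diff = elt[i] - (long) i;
      long prev_diff = elt[i - npatterns] - (long) (i - npatterns);
      if (diff != prev_diff)
        return false;
    }
  return true;
}
```

For {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8} with npatterns = 4 the diff is
{3, 1, -1, -3} repeated, so the check holds. For { -4, 4, -3, 5, -2, 6, -1, 7 }
with npatterns = 2 the diff starts with -4, 3, -5 and the check fails at the
third element.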

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Add restriction
for the vid-diff code gen and implement general one.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/bug-7.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc   | 73 +++
 .../gcc.target/riscv/rvv/autovec/bug-7.c  | 61 
 2 files changed, 119 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-7.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 486f5deb296..946588b7b1f 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1257,24 +1257,67 @@ expand_const_vector (rtx target, rtx src)
  else
{
  /* Generate the variable-length vector following this rule:
-{ a, b, a, b, a + step, b + step, a + step*2, b + step*2, ...}
-  E.g. { 3, 2, 1, 0, 7, 6, 5, 4, ... } */
- /* Step 2: Generate diff = TARGET - VID:
-{ 3-0, 2-1, 1-2, 0-3, 7-4, 6-5, 5-6, 4-7, ... }*/
+   { a, b, a + step, b + step, a + step*2, b + step*2, ... }  */
  rvv_builder v (builder.mode (), builder.npatterns (), 1);
- for (unsigned int i = 0; i < v.npatterns (); ++i)
+ poly_int64 ele_0 = rtx_to_poly_int64 (builder.elt (0));
+ poly_int64 ele_n
+   = rtx_to_poly_int64 (builder.elt (v.npatterns ()));
+
+ if (known_eq (ele_0 - 0, ele_n - v.npatterns ()))
+   {
+ /* Case 1: For example as below:
+{3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8... }
+We have 3 - 0 = 3 equals 7 - 4 = 3, the sequence is
+repeated as below after minus vid.
+{3, 1, -1, -3, 3, 1, -1, -3...}
+Then we can simplify the diff code gen to at most
+npatterns().  */
+
+ /* Step 1: Generate diff = TARGET - VID.  */
+ for (unsigned int i = 0; i < v.npatterns (); ++i)
+   {
+poly_int64 diff = rtx_to_poly_int64 (builder.elt (i)) - i;
+v.quick_push (gen_int_mode (diff, v.inner_mode ()));
+   }
+
+ /* Step 2: Generate result = VID + diff.  */
+ rtx vec = v.build ();
+ rtx add_ops[] = {target, vid, vec};
+ emit_vlmax_insn (code_for_pred (PLUS, builder.mode ()),
+  BINARY_OP, add_ops);
+   }
+ else
{
- /* Calculate the diff between the target sequence and
-vid sequence.  The elt (i) can be either const_int or
-const_poly_int. */
- poly_int64 diff = rtx_to_poly_int64 (builder.elt (i)) - i;
- v.quick_push (gen_int_mode (diff, v.inner_mode ()));
+ /* Case 2: For example as below:
+{ -4, 4, -4 + 1, 4 + 1, -4 + 2, 4 + 2, -4 + 3, 4 + 3, ... }
+  */
+
+ /* Step 1: Generate { a, b, a, b, ... }  */
+ for (unsigned int i = 0; i < v.npatterns (); ++i)
+   v.quick_push (builder.elt (i));
+ rtx new_base = v.build ();
+
+ /* Step 2: Generate tmp = VID >> LOG2 (NPATTERNS).  */
+ rtx shift_count
+   = gen_int_mode (exact_log2 (builder.npatterns ()),
+   builder.inner_mode ());
+ rtx tmp = expand_simple_binop (builder.mode (), LSHIFTRT,
+vid, shift_count, NULL_RTX,
+false, OPTAB_DIRECT);
+
+ /* Step 3: Generate tmp2 = tmp * step.  */
+ rtx tmp2 = gen_reg_rtx (builder.mode ());
+ rtx step
+   = simplify_binary_operation (MINUS, builder.inner_mode (),
+builder.elt (v.npatterns()),
+builder.elt (0));
+ expand_vec_series (tmp2, const0_rtx, step, tmp);
+
+ /* Step 4: Generate target = tmp2 + new_base.  */
+ rtx 

[PATCH v2] RISC-V: Bugfix for the const vector in single steps

2023-12-19 Thread pan2 . li
From: Pan Li 

This patch would like to fix the below execution failure.

FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test

There will be one single step const vector like
{ -4, 4, -3, 5, -2, 6, -1, 7, ...}.
For such const vector generation with single step, we generate vid + diff
here. For example as below, given npatterns = 4.

v1        = {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8... }
v2 (diff) = {3 - 0, 2 - 1, 1 - 2, 0 - 3, 7 - 4, 6 - 5, 5 - 6, 4 - 7...}
          = {3, 1, -1, -3, 3, 1, -1, -3 ...}
v1 = v2 + vid.

Unfortunately, that does not work for { -4, 4, -3, 5, -2, 6, -1, 7, ...}
because it has one implicit requirement for the diff: the diff sequence
must repeat every npatterns elements, like the v2 (diff) above.

The diff between { -4, 4, -3, 5, -2, 6, -1, 7, ...} and vid does not repeat
every npatterns elements, and thus we generate wrong code here. We implement
one new code gen for sequences like { -4, 4, -3, 5, -2, 6, -1, 7, ...}.
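
As a scalar illustration of such a general code gen, the sketch below follows
the four steps named in the comments of the v1 diff (base pattern, vid shifted
right by log2 of npatterns, multiply by step, add). The function name
expand_single_step and the array-based interface are assumptions made for
this example only; the real code emits RVV instructions instead.

```c
#include <stddef.h>

/* Fill OUT[0..N-1] with
     base[i % npatterns] + (i >> log2 (npatterns)) * step
   where npatterns = 1 << LOG2_NPATTERNS.  */
void
expand_single_step (long *out, size_t n, const long *base,
                    unsigned log2_npatterns, long step)
{
  size_t npatterns = (size_t) 1 << log2_npatterns;

  for (size_t i = 0; i < n; i++)
    {
      /* Step 1: { a, b, a, b, ... }.  */
      long new_base = base[i & (npatterns - 1)];
      /* Step 2: tmp = vid >> log2 (npatterns).  */
      long tmp = (long) (i >> log2_npatterns);
      /* Step 3: tmp2 = tmp * step.  */
      long tmp2 = tmp * step;
      /* Step 4: target = tmp2 + new_base.  */
      out[i] = tmp2 + new_base;
    }
}
```

With base = { -4, 4 }, log2_npatterns = 1 and step = 1 (that is, elt (2) minus
elt (0)), the first eight elements are { -4, 4, -3, 5, -2, 6, -1, 7 }.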

The below tests are passed for this patch.

* The RV64 regression test with rv64gcv configuration.
* The run test gcc.dg/vect/pr92420.c for below configurations.

riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Add restriction
for the vid-diff code gen and implement general one.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/bug-7.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc   | 84 +++
 .../gcc.target/riscv/rvv/autovec/bug-7.c  | 61 ++
 2 files changed, 130 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-7.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 486f5deb296..5a5899e85ae 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1257,24 +1257,78 @@ expand_const_vector (rtx target, rtx src)
  else
{
  /* Generate the variable-length vector following this rule:
-{ a, b, a, b, a + step, b + step, a + step*2, b + step*2, ...}
-  E.g. { 3, 2, 1, 0, 7, 6, 5, 4, ... } */
- /* Step 2: Generate diff = TARGET - VID:
-{ 3-0, 2-1, 1-2, 0-3, 7-4, 6-5, 5-6, 4-7, ... }*/
+   { a, b, a + st

[PATCH v3] RISC-V: Bugfix for the const vector in single steps

2023-12-20 Thread pan2 . li
From: Pan Li 

This patch fixes the below execution failure when building with
"-march=rv64gcv_zvl512b -mabi=lp64d -mcmodel=medlow
--param=riscv-autovec-lmul=m8 -ftree-vectorize -fno-vect-cost-model -O3"

FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test

There will be one single step const vector like
{ -4, 4, -3, 5, -2, 6, -1, 7, ...}.
For such const vector generation with single step, we generate vid + diff
here. For example as below, given npatterns = 4.

v1        = {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8... }
v2 (diff) = {3 - 0, 2 - 1, 1 - 2, 0 - 3, 7 - 4, 6 - 5, 5 - 6, 4 - 7...}
          = {3, 1, -1, -3, 3, 1, -1, -3 ...}
v1 = v2 + vid.

Unfortunately, that does not work for { -4, 4, -3, 5, -2, 6, -1, 7, ...}
because it has one implicit requirement for the diff: the diff sequence
must repeat every npatterns elements, like the v2 (diff) above.

The diff between { -4, 4, -3, 5, -2, 6, -1, 7, ...} and vid does not repeat
every npatterns elements, and thus we generate wrong code here. We implement
one new code gen for sequences like { -4, 4, -3, 5, -2, 6, -1, 7, ...}.

The below tests are passed for this patch.

* The RV64 regression test with rv64gcv configuration.
* The run test gcc.dg/vect/pr92420.c for below configurations.

riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax

gcc/ChangeLog:

* config/riscv/riscv-v.cc (rvv_builder::npatterns_vid_diff_repeated_p):
New function to predicate the diff to vid is repeated or not.
(expand_const_vector): Add restriction
for the vid-diff code gen and implement general one.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/bug-7.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc   | 111 +++---
 .../gcc.target/riscv/rvv/autovec/bug-7.c  |  61 ++
 2 files changed, 156 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-7.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 486f5deb296..3b9be255799 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -433,6 +433,7 @@ public:
   bool single_step_npatterns_p () const;
   bool npatterns_all_equal_p () const;
   bool interleaved_stepped_npatterns_p () const;
+  bool npatterns_vid_diff_repeated_p 

[PATCH v1] RISC-V: XFail the signbit-5 run test for RVV

2023-12-20 Thread pan2 . li
From: Pan Li 

This patch XFAILs the signbit-5 run test case for RVV, because the case
has one limitation, stated at the beginning of the test file: "This test
does not work when the truth type does not match vector type."  For RVV,
the vector truth type is not an integer type.

The riscv-sim target board like below will pick up `-march=rv64gcv`
when building the run test elf. Thus, RVV cannot bypass this test case
the way aarch64_sve does with the additional option `-march=armv8-a`.

  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow

For RVV, we leverage dg-xfail-run-if for this case, like `amdgcn`.

The signbit-5.c test passed with the below configurations.

* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zv

[PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-23 Thread pan2 . li
From: Pan Li 

This patch XFAILs the test case pr30957-1.c for RVV when the elf is built
with some configurations (listed at the end of this log).
The loop will be vectorized during vect_transform_loop with a variable factor.
It won't benefit from unrolling/peeling, so loop->unroll is marked as 1.
As a result, unroll_loops does nothing for it, because loop->unroll is 1.

After this patch, the loops vectorized with a variable factor for RVV
will be treated as XFAIL by the tree dump.

That is, the below configurations will be treated as XFAIL, and we still
need further investigation for the failures of other configurations.

* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax

gcc/testsuite/ChangeLog:

* gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with

[PATCH v2] RISC-V: XFail the signbit-5 run test for RVV

2023-12-23 Thread pan2 . li
From: Pan Li 

This patch XFAILs the signbit-5 run test case for RVV, because the case
has one limitation, stated at the beginning of the test file: "This test
does not work when the truth type does not match vector type."  For RVV,
the vector truth type is not an integer type.

The riscv-sim target board like below will pick up `-march=rv64gcv`
when building the run test elf. Thus, RVV cannot bypass this test case
the way aarch64_sve does with the additional option `-march=armv8-a`.

  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow

For RVV, we leverage dg-xfail-run-if for this case, like `amdgcn`.

The signbit-5.c test passed with the below configurations, but we need
further investigation for the failures of other configurations.

* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-auto

[PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-26 Thread pan2 . li
From: Pan Li 

This patch XFAILs the test case pr30957-1.c for RVV when the ELF is
built with certain configurations (listed at the end of this log).
In those configurations the loop is vectorized during vect_transform_loop
with a variable factor. Such a loop does not benefit from unrolling or
peeling, so loop->unroll is set to 1, and unroll_loops then does nothing
for it.

aarch64_sve may have a similar issue, but it initializes the constant
`0.0 / -5.0` in the test file to `+0.0` before passing it to the function
foo, so the execution test passes.

aarch64:
movi v0.2s, #0x0
stp x29, x30, [sp, #-16]!
mov w0, #0xa
mov x29, sp
bl  400280  <== s0 is +0.0

Unfortunately, riscv initializes the constant `0.0 / -5.0` to `-0.0`
and then passes it to the function foo, so the execution test fails.

riscv:
flw fa0,388(gp) # 1299c <__SDATA_BEGIN__+0x4>
addi sp,sp,-16
li  a0,10
sd  ra,8(sp)
jal 101fc   <== fa0 is -0.0
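
The signed-zero difference above can be reproduced with a small C sketch.
It is not part of the patch, and the helper names zero_over_negative and
is_negative_zero are hypothetical. Under IEEE 754, dividing zero by a
negative value yields -0.0, which compares equal to +0.0 but has its sign
bit set.

```c
#include <assert.h>
#include <math.h>

/* Divide zero by a negative divisor, as `0.0 / -5.0` does in the test.
   The volatile operand keeps the division from being folded away.  */
static float
zero_over_negative (float divisor)
{
  volatile float zero = 0.0f;
  assert (divisor < 0.0f);
  return zero / divisor;
}

/* Return 1 when value is negative zero, else 0.  -0.0 == 0.0 is true,
   so the sign bit has to be checked separately.  */
static int
is_negative_zero (float value)
{
  return value == 0.0f && signbit (value) != 0;
}
```

Calling is_negative_zero (zero_over_negative (-5.0f)) is expected to report
a negative zero when signed zeros are honored, which matches the `-0.0`
seen in fa0 above.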

After this patch, the test is treated as XFAIL for RVV, based on the
tree dump, when the loop is vectorized with a variable factor, that is,
when riscv_v and variable_vect_length both apply.

The below configurations are validated as XFAIL for RV64.

* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax

gcc/testsuite/ChangeLog:

* gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with
variable length.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.dg/pr30957-1.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
index 564410913ab..7a7242ec16d 100644
--- a/gcc/testsuite/gcc.dg/pr30957-1.c
+++ b/gcc/testsuite/gcc.dg/pr30957-1.c
@@ -3,7 +3,7 @@
where each addition is a library call.  */
 /* { dg-require-effective-target hard_float } */
 /* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */
-/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math 
-fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */
+/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math 
-fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll 
-fdump-tree-vect-details" } */
 
 extern void abort (void);
 extern void exit (int);
@@ -34,3 +34,4 @@ main ()
 }
 
 /* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail 
mmix-*-* } } } */
+/* { dg-f

[PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option

2024-01-02 Thread pan2 . li
From: Pan Li 

According to the semantics of the no-signed-zeros option, a backend
like RISC-V should treat minus zero -0.0f as plus zero 0.0f.

Consider the example below with the option -fno-signed-zeros.

void
test (float *a)
{
  *a = -0.0;
}

We currently generate the code below, which doesn't treat minus zero
as plus zero.

test:
  lui  a5,%hi(.LC0)
  flw  fa5,%lo(.LC0)(a5)
  fsw  fa5,0(a0)
  ret

.LC0:
  .word -2147483648 // aka -0.0 (0x80000000 in hex)

This patch fixes the bug and treats minus zero -0.0 as plus zero,
aka +0.0. After this patch we have the asm code below for the above
sample code.

test:
  sw zero,0(a0)
  ret
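
The reason a plain `sw zero` is enough can be sketched in C. This is a
hypothetical model and not the GCC implementation: float_const_is_zero
and its honor_signed_zeros flag are names invented here to mirror the
effect of -fsigned-zeros versus -fno-signed-zeros. Only +0.0 is all-zero
bits, so -0.0 can be stored from the zero register only when its sign may
be ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model, not GCC code: decide whether a float constant may
   be materialized from the integer zero register.  */
static bool
float_const_is_zero (float value, bool honor_signed_zeros)
{
  uint32_t bits;
  assert (sizeof bits == sizeof value);
  memcpy (&bits, &value, sizeof bits);

  if (honor_signed_zeros)
    return bits == 0;                /* Only +0.0 is all-zero bits.  */

  return (bits & 0x7fffffffu) == 0;  /* Ignore the sign: +0.0 or -0.0.  */
}
```

With honor_signed_zeros false, both 0.0f and -0.0f qualify, which
corresponds to the `sw zero,0(a0)` sequence above.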

This patch also fixes the run failure of the test case pr30957-1.c. The
tests below pass with this patch.

* The riscv regression tests.
* The pr30957-1.c run tests.

gcc/ChangeLog:

* config/riscv/constraints.md: Leverage func 
riscv_float_const_zero_rtx_p
for predicating the rtx is const zero float or not.
* config/riscv/predicates.md: Ditto.
* config/riscv/riscv.cc (riscv_const_insns): Ditto.
(riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is
const zero float or not.
(riscv_const_zero_rtx_p): New func impl for predicating the rtx
is const zero (both int and fp) or not.
* config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p):
New func decl.
(riscv_const_zero_rtx_p): Ditto.
* config/riscv/riscv.md: Making sure the operand[1] of movfp is
CONST0_RTX when the operand[1] is const zero float.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/no-signed-zeros-0.c: New test.
* gcc.target/riscv/no-signed-zeros-1.c: New test.
* gcc.target/riscv/no-signed-zeros-2.c: New test.
* gcc.target/riscv/no-signed-zeros-3.c: New test.
* gcc.target/riscv/no-signed-zeros-4.c: New test.
* gcc.target/riscv/no-signed-zeros-5.c: New test.
* gcc.target/riscv/no-signed-zeros-run-0.c: New test.
* gcc.target/riscv/no-signed-zeros-run-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/constraints.md   |  2 +-
 gcc/config/riscv/predicates.md|  2 +-
 gcc/config/riscv/riscv-protos.h   |  2 +
 gcc/config/riscv/riscv.cc | 35 -
 gcc/config/riscv/riscv.md | 49 ---
 .../gcc.target/riscv/no-signed-zeros-0.c  | 26 ++
 .../gcc.target/riscv/no-signed-zeros-1.c  | 28 +++
 .../gcc.target/riscv/no-signed-zeros-2.c  | 26 ++
 .../gcc.target/riscv/no-signed-zeros-3.c  | 28 +++
 .../gcc.target/riscv/no-signed-zeros-4.c  | 26 ++
 .../gcc.target/riscv/no-signed-zeros-5.c  | 28 +++
 .../gcc.target/riscv/no-signed-zeros-run-0.c  | 36 ++
 .../gcc.target/riscv/no-signed-zeros-run-1.c  | 36 ++
 13 files changed, 314 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index de4359af00d..db1d5e1385f 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -108,7 +108,7 @@ (define_constraint "DnS"
 (define_constraint "G"
   "@internal"
   (and (match_code "const_double")
-   (match_test "op == CONST0_RTX (mode)")))
+   (match_test "riscv_float_const_zero_rtx_p (op)")))
 
 (define_memory_constraint "A"
   "An address that is held in a general-purpose register."
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index b87a6900841..b428d842101 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -78,7 +78,7 @@ (define_predicate "sleu_operand"
 
 (define_predicate "const_0_operand"
   (and (match_code "const_int,const_wide_int,const_double,const_vector")
-   (match_test "op == CONST0_RTX (GET_MODE (op))")))
+   (match_test "riscv_const_zero_rtx_p (op)")))
 
 (define_predicate "const_1_operand"
   (and (match_code "const_int,const_wide_int,const_vector")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 31049ef7523..fcf30e084a3 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, 
const char *);
 extern bool
 riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int

[PATCH v2] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-11-30 Thread pan2 . li
From: Pan Li 

If we want to extract a 64-bit value but ELEN < 64, we use an RVV
vector mode with EEW = 32 to extract the highpart and lowpart.
However, this approach doesn't handle DFmode in the movdf pattern
under ZVE32F, which results in an ICE.

This patch reuses the approach with some additional handling. Because
the lowpart bits are meaningless for an FP mode, we need one int reg
as a bridge here. For example:

rtx tmp = gen_reg_rtx (DImode)
reg:DI = reg:DF (fmv.x.d) // Move DF reg to DI
...
perform the extract for high and low parts
...
reg:DF = reg:DI (fmv.d.x) // Move DI reg back to DF after all done
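
As a scalar illustration of the bridge idea, the sketch below rebuilds a
double from two 32-bit halves through a 64-bit integer, using the same
shift-by-32 and OR steps as the patch. It is not GCC code: the function
name double_from_halves is hypothetical, and it assumes element 0 holds
the low part, as in the patch's loop.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rebuild a 64-bit double from its two 32-bit halves.  lo and hi stand
   for the two EEW = 32 elements extracted from the vector; bridge plays
   the role of the DImode integer register.  */
static double
double_from_halves (uint32_t lo, uint32_t hi)
{
  uint64_t bridge = lo;            /* Low part lands in bits 0..31.  */
  double result;

  bridge |= (uint64_t) hi << 32;   /* Shift the high part and OR it in.  */

  assert (sizeof result == sizeof bridge);
  memcpy (&result, &bridge, sizeof result);  /* Integer to FP bit move.  */
  return result;
}
```

For instance, the halves 0x00000000 (low) and 0x3FF00000 (high) form the
bit pattern of 1.0.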

PR target/112743

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Take the
exist (U *mode) and handle DFmode like DImode when EEW is
32bits like ZVE32F.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 67 +--
 .../gcc.target/riscv/rvv/base/pr112743-2.c| 52 ++
 2 files changed, 99 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..996347ee3fd 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2605,41 +2605,68 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
   unsigned int nunits = vmode_size > mode_size ? vmode_size / mode_size : 
1;
   scalar_mode smode = as_a<scalar_mode> (mode);
   unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
-  unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  unsigned int num = (smode == DImode || smode == DFmode)
+   && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  bool need_int_reg_p = false;
 
   if (num == 2)
{
  /* If we want to extract 64bit value but ELEN < 64,
 we use RVV vector mode with EEW = 32 to extract
 the highpart and lowpart.  */
+ need_int_reg_p = smode == DFmode;
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
 
-  for (unsigned int i = 0; i < num; i++)
+  opt_machine_mode opt_mode = riscv_vector::get_vector_mode (smode, 
nunits);
+
+  if (opt_mode.exists (&vmode))
{
- rtx result;
- if (num == 1)
-   result = dest;
- else if (i == 0)
-   result = gen_lowpart (smode, dest);
- else
-   result = gen_reg_rtx (smode);
- riscv_vector::emit_vec_extract (result, v, index + i);
+ rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+ rtx int_reg = dest;
 
- if (i == 1)
+ if (need_int_reg_p)
{
- rtx tmp
-   = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
-   gen_int_mode (32, Pmode), NULL_RTX, 0,
-   OPTAB_DIRECT);
- rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-  OPTAB_DIRECT);
- emit_move_insn (dest, tmp2);
+ int_reg = gen_reg_rtx (DImode);
+ emit_insn (
+   gen_movdi (int_reg, gen_lowpart (GET_MODE (int_reg), dest)));
+   }
+
+ for (unsigned int i = 0; i < num; i++)
+   {
+ rtx result;
+ if (num == 1)
+   result = int_reg;
+ else if (i == 0)
+   result = gen_lowpart (smode, int_reg);
+ else
+   result = gen_reg_rtx (smode);
+
+ riscv_vector::emit_vec_extract (result, v, index + i);
+
+ if (i == 1)
+   {
+ rtx tmp = expand_binop (Pmode, ashl_optab,
+ gen_lowpart (Pmode, result),
+ gen_int_mode (32, Pmode), NULL_RTX, 0,
+ OPTAB_DIRECT);
+ rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, int_reg,
+  NULL_RTX, 0,
+  OPTAB_DIRECT);
+ emit_move_insn (int_reg, tmp2);
+   }
}
+
+ if (need_int_reg_p)
+   emit_insn (
+ gen_movdf (dest, gen_lowpart (GET_MODE (dest), int_reg)));
+ else
+   emit_move_insn (dest, int_reg);
}
+  else
+   gcc_unreachable ();
+
   return true;
 }
   /* Expand
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test 

[PATCH v3] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-12-01 Thread pan2 . li
From: Pan Li 

If we want to extract a 64-bit value but ELEN < 64, we use an RVV
vector mode with EEW = 32 to extract the highpart and lowpart.
However, this approach doesn't handle DFmode in the movdf pattern
under ZVE32F, which results in an ICE.

This patch reuses the approach with some additional handling. Because
the lowpart bits are meaningless for an FP mode, we need one int reg
as a bridge here. For example:

rtx tmp = gen_reg_rtx (DImode)
reg:DI = reg:DF (fmv.x.d) // Move DF reg to DI
...
perform the extract for high and low parts
...
reg:DF = reg:DI (fmv.d.x) // Move DI reg back to DF after all done

PR target/112743

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Take the
exist (U *mode) and handle DFmode like DImode when EEW is
32bits for ZVE32F.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 63 +--
 .../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
 2 files changed, 95 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..2fbaaf01078 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2605,41 +2605,64 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
   unsigned int nunits = vmode_size > mode_size ? vmode_size / mode_size : 
1;
   scalar_mode smode = as_a<scalar_mode> (mode);
   unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
-  unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  unsigned int num = (smode == DImode || smode == DFmode)
+   && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  bool need_int_reg_p = false;
 
   if (num == 2)
{
  /* If we want to extract 64bit value but ELEN < 64,
 we use RVV vector mode with EEW = 32 to extract
 the highpart and lowpart.  */
+ need_int_reg_p = smode == DFmode;
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
 
-  for (unsigned int i = 0; i < num; i++)
+  if (riscv_vector::get_vector_mode (smode, nunits).exists (&vmode))
{
- rtx result;
- if (num == 1)
-   result = dest;
- else if (i == 0)
-   result = gen_lowpart (smode, dest);
- else
-   result = gen_reg_rtx (smode);
- riscv_vector::emit_vec_extract (result, v, index + i);
+ rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+ rtx int_reg = dest;
 
- if (i == 1)
+ if (need_int_reg_p)
{
- rtx tmp
-   = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
-   gen_int_mode (32, Pmode), NULL_RTX, 0,
-   OPTAB_DIRECT);
- rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-  OPTAB_DIRECT);
- emit_move_insn (dest, tmp2);
+ int_reg = gen_reg_rtx (DImode);
+ emit_move_insn (int_reg, gen_lowpart (GET_MODE (int_reg), dest));
}
+
+ for (unsigned int i = 0; i < num; i++)
+   {
+ rtx result;
+ if (num == 1)
+   result = int_reg;
+ else if (i == 0)
+   result = gen_lowpart (smode, int_reg);
+ else
+   result = gen_reg_rtx (smode);
+
+ riscv_vector::emit_vec_extract (result, v, index + i);
+
+ if (i == 1)
+   {
+ rtx tmp = expand_binop (Pmode, ashl_optab,
+ gen_lowpart (Pmode, result),
+ gen_int_mode (32, Pmode), NULL_RTX, 0,
+ OPTAB_DIRECT);
+ rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, int_reg,
+  NULL_RTX, 0,
+  OPTAB_DIRECT);
+ emit_move_insn (int_reg, tmp2);
+   }
+   }
+
+ if (need_int_reg_p)
+   emit_move_insn (dest, gen_lowpart (GET_MODE (dest), int_reg));
+ else
+   emit_move_insn (dest, int_reg);
}
+  else
+   gcc_unreachable ();
+
   return true;
 }
   /* Expand
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64

[PATCH v4] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-12-01 Thread pan2 . li
From: Pan Li 

If we want to extract a 64-bit value but ELEN < 64, we use an RVV
vector mode with EEW = 32 to extract the highpart and lowpart.
However, this approach doesn't handle DFmode in the movdf pattern
under ZVE32F, which results in an ICE.

This patch reuses the approach with some additional handling. Because
the lowpart bits are meaningless for an FP mode, we need one int reg
as a bridge here. For example:

rtx tmp = gen_reg_rtx (DImode)
reg:DI = reg:DF (fmv.x.d) // Move DF reg to DI
...
perform the extract for high and low parts
...
reg:DF = reg:DI (fmv.d.x) // Move DI reg back to DF after all done

PR target/112743

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Take the
exist (U *mode) and handle DFmode like DImode when EEW is
32bits for ZVE32F.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 63 +--
 .../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
 2 files changed, 95 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..84512dcdc68 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2605,41 +2605,64 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
   unsigned int nunits = vmode_size > mode_size ? vmode_size / mode_size : 
1;
   scalar_mode smode = as_a<scalar_mode> (mode);
   unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
-  unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  unsigned int num = known_eq (GET_MODE_SIZE (smode), 8)
+   && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  bool need_int_reg_p = false;
 
   if (num == 2)
{
  /* If we want to extract 64bit value but ELEN < 64,
 we use RVV vector mode with EEW = 32 to extract
 the highpart and lowpart.  */
+ need_int_reg_p = smode == DFmode;
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
 
-  for (unsigned int i = 0; i < num; i++)
+  if (riscv_vector::get_vector_mode (smode, nunits).exists (&vmode))
{
- rtx result;
- if (num == 1)
-   result = dest;
- else if (i == 0)
-   result = gen_lowpart (smode, dest);
- else
-   result = gen_reg_rtx (smode);
- riscv_vector::emit_vec_extract (result, v, index + i);
+ rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+ rtx int_reg = dest;
 
- if (i == 1)
+ if (need_int_reg_p)
{
- rtx tmp
-   = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
-   gen_int_mode (32, Pmode), NULL_RTX, 0,
-   OPTAB_DIRECT);
- rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-  OPTAB_DIRECT);
- emit_move_insn (dest, tmp2);
+ int_reg = gen_reg_rtx (DImode);
+ emit_move_insn (int_reg, gen_lowpart (GET_MODE (int_reg), dest));
}
+
+ for (unsigned int i = 0; i < num; i++)
+   {
+ rtx result;
+ if (num == 1)
+   result = int_reg;
+ else if (i == 0)
+   result = gen_lowpart (smode, int_reg);
+ else
+   result = gen_reg_rtx (smode);
+
+ riscv_vector::emit_vec_extract (result, v, index + i);
+
+ if (i == 1)
+   {
+ rtx tmp = expand_binop (Pmode, ashl_optab,
+ gen_lowpart (Pmode, result),
+ gen_int_mode (32, Pmode), NULL_RTX, 0,
+ OPTAB_DIRECT);
+ rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, int_reg,
+  NULL_RTX, 0,
+  OPTAB_DIRECT);
+ emit_move_insn (int_reg, tmp2);
+   }
+   }
+
+ if (need_int_reg_p)
+   emit_move_insn (dest, gen_lowpart (GET_MODE (dest), int_reg));
+ else
+   emit_move_insn (dest, int_reg);
}
+  else
+   gcc_unreachable ();
+
   return true;
 }
   /* Expand
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64g

[PATCH v1] RISC-V: Add test case for bug PR112813

2023-12-04 Thread pan2 . li
From: Pan Li 

Bugzilla 112813 has been fixed recently; add the test case below
for the bug.

PR target/112813

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr112813-1.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/vsetvl/pr112813-1.c  | 32 +++
 1 file changed, 32 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112813-1.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112813-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112813-1.c
new file mode 100644
index 000..5aab9c2bf09
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112813-1.c
@@ -0,0 +1,32 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv_zvl256b -mabi=ilp32d -O3" } */
+
+int a, c, d, f, j;
+int b[7];
+long e;
+char *g;
+int *h;
+long long *i;
+
+void k() {
+  int l[][1] = {{}, {1}, {1}};
+  int *m = &d, *n = &l[0][0];
+
+  for (; e;)
+{
+  f = 3;
+
+  for (; f >= 0; f--)
+   {
+ *m &= b[f] >= 0;
+ j = a >= 2 ? 0 : 1 >> a;
+ *i |= j;
+}
+
+   for (; c;)
+ *g = 0;
+ }
+
+  h = n;
+}
-- 
2.34.1



[PATCH v1] RISC-V: Fix ICE for incorrect mode attr in V_F2DI_CONVERT_BRIDGE

2023-12-08 Thread pan2 . li
From: Pan Li 

The mode attr V_F2DI_CONVERT_BRIDGE converts the floating-point mode
to the widened floating-point mode by design. But we take
(RVVM1HF "RVVM2SI") by mistake.

This patch fixes it by replacing
(RVVM1HF "RVVM2SI") with (RVVM1HF "RVVM2SF") as designed.

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Replace RVVM2SI to RVVM2SF
for mode attr V_F2DI_CONVERT_BRIDGE.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c: New 
test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/vector-iterators.md   | 2 +-
 .../riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c

diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 56080ed1f5f..5f5f7b5b986 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -3267,7 +3267,7 @@ (define_mode_attr v_f2di_convert [
 ])
 
 (define_mode_attr V_F2DI_CONVERT_BRIDGE [
-  (RVVM2HF "RVVM4SF") (RVVM1HF "RVVM2SI") (RVVMF2HF "RVVM1SF")
+  (RVVM2HF "RVVM4SF") (RVVM1HF "RVVM2SF") (RVVMF2HF "RVVM1SF")
   (RVVMF4HF "RVVMF2SF")
 
   (RVVM4SF "VOID") (RVVM2SF "VOID") (RVVM1SF "VOID")
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c
new file mode 100644
index 000..5fb61c7b44c
--- /dev/null
+++ 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-ice-1.c
@@ -0,0 +1,7 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "--param=riscv-autovec-lmul=m4 -march=rv64gcv_zvfh_zfh 
-mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math 
-fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "test-math.h"
+
+TEST_UNARY_CALL_CVT (_Float16, long, __builtin_lroundf16)
-- 
2.34.1



[PATCH v1] RISC-V: Disable RVV VCOMPRESS avl propagation

2023-12-12 Thread pan2 . li
From: Pan Li 

This patch disables the avl propagation for vcompress for the following
reasons.

According to the ISA, the first vl elements of vector register
group vs2 should be extracted and packed for vcompress.  The
highest element of the vs2 vector may be selected by the mask, and it
may be dropped if avl propagation shrinks the avl.

For example, given an original vl = 4, we have:

  v0 = 0b1000
  v1 = {0x1, 0x2, 0x3, 0x4}
  v2 = {0x5, 0x6, 0x7, 0x8}

Then:
  vcompress v1, v2, v0 (avl = 4), v1 = {0x8, 0x2, 0x3, 0x4}. <== Correct.
  vcompress v1, v2, v0 (avl = 2), v1 will be unchanged.  <== Wrong.

Finally, we cannot propagate the avl of vcompress because doing so may
change the semantics of the result.
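
The example above can be checked with a simplified scalar model of
vcompress. This is only a sketch under stated assumptions: the name
vcompress_model, the fixed VLMAX of 4, and the tail-undisturbed behavior
are choices made here for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VLMAX 4

/* Simplified model of "vcompress vd, vs2, vm": among the first vl
   elements of vs2, those whose mask bit is set are packed into vd
   starting at element 0.  The rest of vd is left untouched.  */
static void
vcompress_model (uint32_t vd[VLMAX], const uint32_t vs2[VLMAX],
                 uint8_t mask, size_t vl)
{
  size_t out = 0;

  assert (vl <= VLMAX);

  for (size_t i = 0; i < vl; i++)
    if ((mask >> i) & 1u)
      vd[out++] = vs2[i];
}
```

With mask 0b1000, a vl of 4 reaches element 3 of vs2 and packs it into
element 0 of vd, while a vl of 2 never looks at element 3 and leaves vd
unchanged.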

This patch also fixes the failure of gcc.c-torture/execute/990128-1.c for
the following configurations.

riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax

gcc/ChangeLog:

* config/riscv/riscv-avlprop.cc (avl_can_be_propagated_p):
Disable the avl propagation for the vcompress.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-avlprop.cc | 35 --
 .../rvv/autovec/binop/vcompress-avlprop-1.c   | 36 +++
 2 files changed, 61 insertions(+), 10 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c

diff --git a/gcc/config/riscv/riscv-avlprop.cc 
b/gcc/config/riscv/riscv-avlprop.cc
index 02f006742f1..a6159816cf7 100644
--- a/gcc/config/riscv/riscv-avlprop.cc
+++ b/gcc/config/riscv/riscv-avlprop.cc
@@ -113,19 +113,34 @@ avl_can_be_propagated_p (rtx_insn *rinsn)
  touching the element with i > AVL.  So, we don't do AVL propagation
  on these following situations:
 
-   - The index of "vrgather dest, source, index" may pick up the
-element which has index >= AVL, so we can't strip the elements
-that has index >= AVL of source register.
-   - The last element of vslide1down is AVL + 1 according to RVV ISA:
-vstart <= i < vl-1vd[i] = vs2[i+1] if v0.mask[i] enabled
-   - The last multiple elements of vslidedown can be the element
-has index >= AVL according to RVV ISA:
-0 <= i+OFFSET < VLMAX   src[i] = vs2[i+OFFSET]
-vstart <= i < vl vd[i] = s

[PATCH v1] RISC-V: Support FP ceil to i/l/ll diff size autovec

2023-11-06 Thread pan2 . li
From: Pan Li 

This patch supports auto-vectorization of the FP APIs below when the
source and destination types differ in size.

+---------+-----------+----------+
| API     | RV64      | RV32     |
+---------+-----------+----------+
| iceil   | DF => SI  | DF => SI |
| iceilf  | -         | -        |
| lceil   | -         | DF => SI |
| lceilf  | SF => DI  | -        |
| llceil  | -         | -        |
| llceilf | SF => DI  | SF => DI |
+---------+-----------+----------+

Given below code:
void
test_lceilf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lceilf (in[i]);
}

Before this patch:
.L3:
  flw  fa0,0(s0)
  addi s0,s0,4
  addi s1,s1,8
  call ceilf
  fcvt.l.s a5,fa0,rtz
  sd   a5,-8(s1)
  bne  s2,s0,.L3
  ld   ra,24(sp)
  ld   s0,16(sp)
  ld   s1,8(sp)
  ld   s2,0(sp)
  addi sp,sp,32
  jr   ra

After this patch:
  fsrmi    3  // RUP mode
.L3:
  vsetvli  a5,a2,e32,mf2,ta,ma
  vle32.v  v2,0(a1)
  slli a3,a5,2
  slli a4,a5,3
  vfwcvt.x.f.v v1,v2
  sub  a2,a2,a5
  vse64.v  v1,0(a0)
  add  a1,a1,a3
  add  a0,a0,a4
  bne  a2,zero,.L3

Unfortunately, the HF mode is not included because it requires
additional middle-end support in internal-fn.def.
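
As a rough illustration of what each vector lane is expected to compute,
here is a scalar reference model of lceilf. It is only a sketch: the name
ref_lceilf is hypothetical and not part of the patch, and it assumes the
input is finite and the result fits in a long. It emulates round-up (RUP)
by truncating first and then adjusting, whereas the vector code sets the
rounding mode once and converts directly.

```c
/* Hypothetical scalar reference for one lane of the vectorized lceilf.
   Assumes a finite input whose ceiling fits in a long.  */
static long
ref_lceilf (float x)
{
  long t = (long) x;   /* C conversion truncates toward zero (like rtz).  */
  if ((float) t < x)   /* Truncation went below x, so step up once.  */
    t += 1;
  return t;
}
```

For example, ref_lceilf (2.25f) yields 3 and ref_lceilf (-2.25f) yields -2,
which is what a round-up conversion should produce for those inputs.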

gcc/ChangeLog:

* config/riscv/autovec.md: Remove the size check of lceil.
* config/riscv/riscv-v.cc (expand_vec_lceil): Leverage
emit_vec_rounding_to_integer for ceil.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-iceil-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-iceil-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-rv32-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceilf-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceilf-rv64-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llceilf-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llceilf-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-iceil-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lceil-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lceilf-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llceilf-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   |  6 +-
 gcc/config/riscv/riscv-v.cc   |  8 +-
 .../riscv/rvv/autovec/unop/math-iceil-1.c | 18 
 .../riscv/rvv/autovec/unop/math-iceil-run-1.c | 83 ++
 .../rvv/autovec/unop/math-lceil-rv32-0.c  | 18 
 .../rvv/autovec/unop/math-lceil-rv32-run-0.c  | 83 ++
 .../rvv/autovec/unop/math-lceilf-rv64-0.c | 18 
 .../rvv/autovec/unop/math-lceilf-rv64-run-0.c | 84 +++
 .../riscv/rvv/autovec/unop/math-llceilf-0.c   | 19 +
 .../rvv/autovec/unop/math-llceilf-run-0.c | 84 +++
 .../riscv/rvv/autovec/vls/math-iceil-1.c  | 27 ++
 .../riscv/rvv/autovec/vls/math-lceil-rv32-0.c | 27 ++
 .../rvv/autovec/vls/math-lceilf-rv64-0.c  | 27 ++
 .../riscv/rvv/autovec/vls/math-llceilf-0.c| 27 ++
 14 files changed, 520 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-rv32-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceil-rv32-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceilf-rv64-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lceilf-rv64-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceilf-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceilf-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-iceil-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lceil-rv32-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lceilf-rv64-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llceilf-0.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 5b5105f5b46..b59bb880a45 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2466,8 +2466,7 @@ (define_expand "lround2"
 (define_expand "lceil2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
-&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE 
(mode))"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
 riscv_vect

[PATCH v1] RISC-V: Support FP floor to i/l/ll diff size autovec

2023-11-07 Thread pan2 . li
From: Pan Li 

This patch adds auto-vectorization support for the FP APIs below when
the source and destination element sizes differ.

+----------+-----------+----------+
| API      | RV64      | RV32     |
+----------+-----------+----------+
| ifloor   | DF => SI  | DF => SI |
| ifloorf  | -         | -        |
| lfloor   | -         | DF => SI |
| lfloorf  | SF => DI  | -        |
| llfloor  | -         | -        |
| llfloorf | SF => DI  | SF => DI |
+----------+-----------+----------+

Given below code:
void
test_lfloorf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lfloorf (in[i]);
}

Before this patch:
.L3:
  flw  fa0,0(s0)
  addi s0,s0,4
  addi s1,s1,8
  call floorf
  fcvt.l.s a5,fa0,rtz
  sd   a5,-8(s1)
  bne  s2,s0,.L3

After this patch:
  fsrmi    2  // RDN mode
.L3:
  vsetvli  a5,a2,e32,mf2,ta,ma
  vle32.v  v2,0(a1)
  slli a3,a5,2
  slli a4,a5,3
  vfwcvt.x.f.v v1,v2
  sub  a2,a2,a5
  vse64.v  v1,0(a0)
  add  a1,a1,a3
  add  a0,a0,a4
  bne  a2,zero,.L3

Unfortunately, the HF mode is not included because it requires
additional middle-end support in internal-fn.def.
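
The loop shape in the "After this patch" listing can be modelled in plain C
as below. This is a sketch under stated assumptions, not the generated code:
MODEL_VLMAX, ref_lfloorf and model_lfloorf_loop are hypothetical names, the
lane count is fixed at 4 for illustration (real hardware derives it from
VLEN), and the byte offsets in the comments assume RV64, where float is 4
bytes and long is 8 bytes.

```c
/* Hypothetical lane count per pass; real hardware derives it from VLEN.  */
#define MODEL_VLMAX 4

/* Scalar floor-to-long for one lane; assumes the result fits in a long.  */
static long
ref_lfloorf (float x)
{
  long t = (long) x;   /* Truncates toward zero.  */
  if ((float) t > x)   /* Truncation went above x (negative input).  */
    t -= 1;
  return t;
}

/* Models the vsetvli-driven loop: each pass handles vl <= MODEL_VLMAX
   elements, then advances both pointers and shrinks the remaining count.  */
static void
model_lfloorf_loop (long *out, const float *in, unsigned count)
{
  while (count != 0)
    {
      /* vsetvli: pick how many elements this pass handles.  */
      unsigned vl = count < MODEL_VLMAX ? count : MODEL_VLMAX;

      /* vle32.v + vfwcvt.x.f.v + vse64.v, one lane at a time here.  */
      for (unsigned i = 0; i < vl; i++)
        out[i] = ref_lfloorf (in[i]);

      in += vl;      /* add a1,a1,a3: vl * 4 bytes on RV64.  */
      out += vl;     /* add a0,a0,a4: vl * 8 bytes on RV64.  */
      count -= vl;   /* sub a2,a2,a5.  */
    }
}
```

The point of the model is that the tail needs no separate scalar loop: the
last pass simply runs with a smaller vl.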

gcc/ChangeLog:

* config/riscv/autovec.md: Remove the size check of lfloor.
* config/riscv/riscv-v.cc (expand_vec_lfloor): Leverage
emit_vec_rounding_to_integer for floor.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-ifloor-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-rv32-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloorf-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloorf-rv64-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llfloorf-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llfloorf-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-ifloor-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lfloor-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lfloorf-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llfloorf-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   |  6 +-
 gcc/config/riscv/riscv-v.cc   |  8 +-
 .../riscv/rvv/autovec/unop/math-ifloor-1.c| 18 
 .../rvv/autovec/unop/math-ifloor-run-1.c  | 83 ++
 .../rvv/autovec/unop/math-lfloor-rv32-0.c | 18 
 .../rvv/autovec/unop/math-lfloor-rv32-run-0.c | 83 ++
 .../rvv/autovec/unop/math-lfloorf-rv64-0.c| 18 
 .../autovec/unop/math-lfloorf-rv64-run-0.c| 84 +++
 .../riscv/rvv/autovec/unop/math-llfloorf-0.c  | 19 +
 .../rvv/autovec/unop/math-llfloorf-run-0.c| 84 +++
 .../riscv/rvv/autovec/vls/math-ifloor-1.c | 27 ++
 .../rvv/autovec/vls/math-lfloor-rv32-0.c  | 27 ++
 .../rvv/autovec/vls/math-lfloorf-rv64-0.c | 27 ++
 .../riscv/rvv/autovec/vls/math-llfloorf-0.c   | 27 ++
 14 files changed, 520 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-rv32-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloor-rv32-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloorf-rv64-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-lfloorf-rv64-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloorf-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloorf-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-ifloor-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lfloor-rv32-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-lfloorf-rv64-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llfloorf-0.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index b59bb880a45..973dc4ac235 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2486,8 +2486,7 @@ (define_expand "lceil2"
 (define_expand "lfloor2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
-&& known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE 
(mode))"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
 riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, 
mode);
 DONE;
@@ -2

[PATCH v2] DSE: Allow vector type for get_stored_val when read < store

2023-11-08 Thread pan2 . li
From: Pan Li 

Update in v2:
* Move vector type support to get_stored_val.

Original log:

This patch allows vector modes in get_stored_val in DSE.
Reusing the stored value for a vector read is valid
if and only if the read bitsize is less than the
stored bitsize.
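
The condition can be sketched in plain C as follows. This is a simplified
model, not the DSE code: struct mode_model and can_use_lowpart are
hypothetical names, sizes are plain constants (GCC compares poly-int sizes
with known_lt), and the tieable flag stands in for the
targetm.modes_tieable_p hook.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a machine mode.  */
struct mode_model
{
  bool is_vector;    /* Plays the role of VECTOR_MODE_P.  */
  unsigned bitsize;  /* Constant size; GCC uses poly-int sizes here.  */
};

/* True when a read can be served as the low part of the stored value.
   `tieable` stands in for targetm.modes_tieable_p.  */
static bool
can_use_lowpart (struct mode_model read, struct mode_model store,
                 bool tieable)
{
  return read.is_vector && store.is_vector
         && read.bitsize < store.bitsize
         && tieable;
}
```

With this model, a 128-bit vector read from a 256-bit vector store
qualifies, while an equal-sized read or a scalar read does not and falls
through to the existing paths.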

Given below example code with
--param=riscv-autovec-preference=fixed-vlmax.

vuint8m1_t test () {
  uint8_t arr[32] = {
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
  };

  return __riscv_vle8_v_u8m1(arr, 32);
}

Before this patch:
test:
  lui       a5,%hi(.LANCHOR0)
  addi      sp,sp,-32
  addi      a5,a5,%lo(.LANCHOR0)
  li        a3,32
  vl2re64.v v2,0(a5)
  vsetvli   zero,a3,e8,m1,ta,ma
  vs2r.v    v2,0(sp) <== Unnecessary store to stack
  vle8.v    v1,0(sp) <== Ditto
  vs1r.v    v1,0(a0)
  addi      sp,sp,32
  jr        ra

After this patch:
test:
  lui       a5,%hi(.LANCHOR0)
  addi      a5,a5,%lo(.LANCHOR0)
  li        a4,32
  addi      sp,sp,-32
  vsetvli   zero,a4,e8,m1,ta,ma
  vle8.v    v1,0(a5)
  vs1r.v    v1,0(a0)
  addi      sp,sp,32
  jr        ra

The tests below pass with this patch:

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The RISC-V regression test.

PR target/111720

gcc/ChangeLog:

* dse.cc (get_stored_val): Allow vector mode if the read
bitsize is less than stored bitsize.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
* gcc.target/riscv/rvv/base/pr111720-9.c: New test.

Signed-off-by: Pan Li 
---
 gcc/dse.cc|  4 
 .../gcc.target/riscv/rvv/base/pr111720-0.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-1.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 
 .../gcc.target/riscv/rvv/base/pr111720-2.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-3.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-4.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-5.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-6.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-7.c| 21 +++
 .../gcc.target/riscv/rvv/base/pr111720-8.c| 18 
 .../gcc.target/riscv/rvv/base/pr111720-9.c| 15 +
 12 files changed, 202 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c

diff --git a/gcc/dse.cc b/gcc/dse.cc
index 1a85dae1f8c..21004becd4a 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1940,6 +1940,10 @@ get_stored_val (store_info *store_info, machine_mode 
read_mode,
   || GET_MODE_CLASS (read_mode) != GET_MODE_CLASS (store_mode)))
 read_reg = extract_low_bits (read_mode, store_mode,
 copy_rtx (store_info->const_rhs));
+  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
+&& known_lt (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
+&& targetm.modes_tieable_p (read_mode, store_mode))
+read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
   else
 read_reg = extract_low_bits (read_mode, store_mode,
 copy_rtx (store_info->rhs));
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
new file mode 100644
index 000..a61e94a6d98
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize 
--param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv

[PATCH v1] RISC-V: Refine frm emit after bb end in succ edges

2023-11-08 Thread pan2 . li
From: Pan Li 

This patch refines how the frm insn is emitted when a basic
block has abnormal successor edges. Conceptually, the insn only
needs to be emitted once at the end of the block, instead of
once per iteration of the loop over the succ edges.

This patch fixes this defect and performs
insert_insn_end_basic_block only once, when at least one succ
edge is abnormal.
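
The effect can be sketched with a small counting model in C. It is only an
illustration: struct emit_counts, count_before and count_after are
hypothetical names, and the array of flags stands in for the EDGE_ABNORMAL
bit of each successor edge.

```c
#include <stdbool.h>

struct emit_counts
{
  unsigned on_edge;    /* Insertions placed on individual normal edges.  */
  unsigned at_bb_end;  /* Insertions placed at the end of the block.  */
};

/* Before the patch: one block-end insertion per abnormal edge.  */
static struct emit_counts
count_before (const bool *abnormal, unsigned n)
{
  struct emit_counts c = { 0, 0 };
  for (unsigned i = 0; i < n; i++)
    {
      if (abnormal[i])
        c.at_bb_end++;
      else
        c.on_edge++;
    }
  return c;
}

/* After the patch: normal edges are handled inside the loop, and a single
   block-end insertion happens afterwards if any edge was abnormal.  */
static struct emit_counts
count_after (const bool *abnormal, unsigned n)
{
  struct emit_counts c = { 0, 0 };
  bool abnormal_edge_p = false;
  for (unsigned i = 0; i < n; i++)
    {
      if (abnormal[i])
        abnormal_edge_p = true;
      else
        c.on_edge++;
    }
  if (abnormal_edge_p)
    c.at_bb_end = 1;
  return c;
}
```

For a block with two abnormal edges and one normal edge, the first function
reports two block-end insertions and the second reports one, with one edge
insertion in both cases.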

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_frm_emit_after_bb_end): Only
emit once when at least one succ edge is abnormal.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 08ff05dcc3f..e25692b86fc 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -9348,20 +9348,33 @@ static void
 riscv_frm_emit_after_bb_end (rtx_insn *cur_insn)
 {
   edge eg;
+  bool abnormal_edge_p = false;
   edge_iterator eg_iterator;
   basic_block bb = BLOCK_FOR_INSN (cur_insn);
 
   FOR_EACH_EDGE (eg, eg_iterator, bb->succs)
+{
+  if (eg->flags & EDGE_ABNORMAL)
+   abnormal_edge_p = true;
+  else
+   {
+ start_sequence ();
+ emit_insn (gen_frrmsi (DYNAMIC_FRM_RTL (cfun)));
+ rtx_insn *backup_insn = get_insns ();
+ end_sequence ();
+
+ insert_insn_on_edge (backup_insn, eg);
+   }
+}
+
+  if (abnormal_edge_p)
 {
   start_sequence ();
   emit_insn (gen_frrmsi (DYNAMIC_FRM_RTL (cfun)));
   rtx_insn *backup_insn = get_insns ();
   end_sequence ();
 
-  if (eg->flags & EDGE_ABNORMAL)
-   insert_insn_end_basic_block (backup_insn, bb);
-  else
-   insert_insn_on_edge (backup_insn, eg);
+  insert_insn_end_basic_block (backup_insn, bb);
 }
 
   commit_edge_insertions ();
-- 
2.34.1



[PATCH v1] Internal-fn: Add FLOATN support for l/ll round and rint [PR/112432]

2023-11-09 Thread pan2 . li
From: Pan Li 

Functions defined with DEF_EXT_LIB_FLOATN_NX_BUILTINS should use
DEF_INTERNAL_FLT_FLOATN_FN instead of DEF_INTERNAL_FLT_FN to get
FLOATN support. Based on the glibc API and the gcc builtins, the
table below shows whether FLOATN is supported.

+---------+-------+-------------------------------------+
|         | glibc | gcc: DEF_EXT_LIB_FLOATN_NX_BUILTINS |
+---------+-------+-------------------------------------+
| iceil   | N     | N                                   |
| ifloor  | N     | N                                   |
| irint   | N     | N                                   |
| iround  | N     | N                                   |
| lceil   | N     | N                                   |
| lfloor  | N     | N                                   |
| lrint   | Y     | Y                                   |
| lround  | Y     | Y                                   |
| llceil  | N     | N                                   |
| llfloor | N     | N                                   |
| llrint  | Y     | Y                                   |
| llround | Y     | Y                                   |
+---------+-------+-------------------------------------+

This patch adds FLOATN support for:
1. lrint
2. lround
3. llrint
4. llround

The tests below pass with this patch:
1. x86 bootstrap and regression test.
2. aarch64 regression test.
3. riscv regression tests.

PR target/112432

gcc/ChangeLog:

* internal-fn.def (LRINT): Add FLOATN support.
(LROUND): Ditto.
(LLRINT): Ditto.
(LLROUND): Ditto.

Signed-off-by: Pan Li 
---
 gcc/internal-fn.def | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7f0e3759615..10f88e37bc9 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -365,12 +365,12 @@ DEF_INTERNAL_FLT_FN (IRINT, ECF_CONST, lrint, 
unary_convert)
 DEF_INTERNAL_FLT_FN (IROUND, ECF_CONST, lround, unary_convert)
 DEF_INTERNAL_FLT_FN (LCEIL, ECF_CONST, lceil, unary_convert)
 DEF_INTERNAL_FLT_FN (LFLOOR, ECF_CONST, lfloor, unary_convert)
-DEF_INTERNAL_FLT_FN (LRINT, ECF_CONST, lrint, unary_convert)
-DEF_INTERNAL_FLT_FN (LROUND, ECF_CONST, lround, unary_convert)
+DEF_INTERNAL_FLT_FLOATN_FN (LRINT, ECF_CONST, lrint, unary_convert)
+DEF_INTERNAL_FLT_FLOATN_FN (LROUND, ECF_CONST, lround, unary_convert)
 DEF_INTERNAL_FLT_FN (LLCEIL, ECF_CONST, lceil, unary_convert)
 DEF_INTERNAL_FLT_FN (LLFLOOR, ECF_CONST, lfloor, unary_convert)
-DEF_INTERNAL_FLT_FN (LLRINT, ECF_CONST, lrint, unary_convert)
-DEF_INTERNAL_FLT_FN (LLROUND, ECF_CONST, lround, unary_convert)
+DEF_INTERNAL_FLT_FLOATN_FN (LLRINT, ECF_CONST, lrint, unary_convert)
+DEF_INTERNAL_FLT_FLOATN_FN (LLROUND, ECF_CONST, lround, unary_convert)
 
 /* FP rounding.  */
 DEF_INTERNAL_FLT_FLOATN_FN (CEIL, ECF_CONST, ceil, unary)
-- 
2.34.1



[PATCH v1] RISC-V: Add HFmode for l/ll round and rint autovec

2023-11-10 Thread pan2 . li
From: Pan Li 

The internal-fn already supports FLOATN. This patch
re-enables the vector HFmode in the autovec mode iterators
for the standard names below.

1. lrint
2. llround

For now the vector HFmodes are disabled in the patterns to limit
the impact, and the upcoming FP16 rint/round autovec patches will
enable them one by one.

gcc/ChangeLog:

* config/riscv/autovec.md: Disable vector HFmode for
rint, round, ceil and floor.
* config/riscv/vector-iterators.md: Add vector HFmode
for rint, round, ceil and floor mode iterator.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md  | 26 +++-
 gcc/config/riscv/vector-iterators.md | 59 +++-
 2 files changed, 73 insertions(+), 12 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 33722ea1139..a199caabf87 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2443,12 +2443,11 @@ (define_expand "roundeven2"
   }
 )
 
-;; Add mode_size equal check as we opened the modes for different sizes.
-;; The check will be removed soon after related codegen implemented
 (define_expand "lrint2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& GET_MODE_INNER (mode) != HFmode"
   {
 riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
 DONE;
@@ -2458,7 +2457,8 @@ (define_expand "lrint2"
 (define_expand "lrint2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& GET_MODE_INNER (mode) != HFmode"
   {
 riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
 DONE;
@@ -2468,7 +2468,8 @@ (define_expand "lrint2"
 (define_expand "lround2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& GET_MODE_INNER (mode) != HFmode"
   {
 riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
 DONE;
@@ -2478,7 +2479,8 @@ (define_expand "lround2"
 (define_expand "lround2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& GET_MODE_INNER (mode) != HFmode"
   {
 riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
 DONE;
@@ -2488,7 +2490,8 @@ (define_expand "lround2"
 (define_expand "lceil2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& GET_MODE_INNER (mode) != HFmode"
   {
 riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
 DONE;
@@ -2498,7 +2501,8 @@ (define_expand "lceil2"
 (define_expand "lceil2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& GET_MODE_INNER (mode) != HFmode"
   {
 riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
 DONE;
@@ -2508,7 +2512,8 @@ (define_expand "lceil2"
 (define_expand "lfloor2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& GET_MODE_INNER (mode) != HFmode"
   {
 riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, 
mode);
 DONE;
@@ -2518,7 +2523,8 @@ (define_expand "lfloor2"
 (define_expand "lfloor2"
   [(match_operand:   0 "register_operand")
(match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
-  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math
+&& GET_MODE_INNER (mode) != HFmode"
   {
 riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, 
mode);
 DONE;
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index e80eaedc4b3..f2d9f60b631 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -3221,15 +3221,20 @@ (define_mode_attr vnnconvert [
 ;; V_F2SI_CONVERT: (HF, SF, DF) => SI
 ;; V_F2DI_CONVERT: (HF, SF, DF) => DI
 ;;
-;; HF requires addit
