[PATCH v2] RISC-V: Support {U}INT64 to FP16 auto-vectorization

2023-09-28 Thread pan2 . li
From: Pan Li Update in v2: * Add math trap check. * Adjust some test cases. Original logs: This patch would like to support the auto-vectorization from the INT64 to FP16. We take below steps for the conversion. * INT64 to FP32. * FP32 to FP16. Given sample code as below: void test_func (int6

[PATCH v1] RISC-V: Update comments for FP rounding related autovec

2023-10-05 Thread pan2 . li
From: Pan Li Some comments are out of date; this patch would like to fix them. gcc/ChangeLog: * config/riscv/autovec.md: Update comments. Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/gcc/config/riscv/autove

[PATCH v1] RISC-V: Bugfix for legitimize address PR/111634

2023-10-06 Thread pan2 . li
From: Pan Li Given we have RTL as below. (plus:DI (mult:DI (reg:DI 138 [ g.4_6 ]) (const_int 8 [0x8])) (lo_sum:DI (reg:DI 167) (symbol_ref:DI ("f") [flags 0x86] ) )) When handling (plus (plus (mult (a) (mem_shadd_constant)) (fp)) (C)) case, the fp

[PATCH v1] RISC-V: Add more run test for FP rounding autovec

2023-10-06 Thread pan2 . li
From: Pan Li For _Float16 types, add run test for: * ceil * floor * nearbyint * rint * round * roundeven * trunc For float and double, add run test for: * roundeven The zfa extension is required for these run test cases, the simulation target_board may look like below for rv64. target_board="r

[PATCH v1] RISC-V: Refine bswap16 auto vectorization code gen

2023-10-09 Thread pan2 . li
From: Pan Li This patch would like to refine the code gen for the bswap16. We will have VEC_PERM_EXPR after rtl expand when invoking __builtin_bswap. It will generate about 9 instructions in loop as below, no matter it is bswap16, bswap32 or bswap64. .L2: 1 vle16.v v4,0(a0) 2 vmv.v.x v2,a7 3

[PATCH v2] RISC-V: Refine bswap16 auto vectorization code gen

2023-10-09 Thread pan2 . li
From: Pan Li Update in v2 * Remove emit helper functions. * Take expand_binop instead. Original log: This patch would like to refine the code gen for the bswap16. We will have VEC_PERM_EXPR after rtl expand when invoking __builtin_bswap. It will generate about 9 instructions in loop as below,
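For readers unfamiliar with the operation: a 16-bit byte swap can be written with plain binary operations (two shifts and an OR) instead of a general permute. The sketch below only illustrates that idea in scalar C. Whether it matches the instruction sequence the patch finally emits is an assumption, and the function names are hypothetical.

```c
#include <stdint.h>

/* Byte swap of a 16-bit value expressed as two shifts and an OR.
   Illustrative only; not taken from the patch. */
static inline uint16_t bswap16_shift_or(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* A simple element-wise loop of the kind an auto-vectorizer can handle.
   The name test_bswap16 is hypothetical. */
void test_bswap16(uint16_t *out, const uint16_t *in, unsigned count)
{
    for (unsigned i = 0; i < count; i++)
        out[i] = bswap16_shift_or(in[i]);
}
```

Each element needs only binary operations here, which is the reason a shift-based form can be cheaper than a generic vector permute.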

[PATCH v1] RISC-V: Support FP lrint/lrintf auto vectorization

2023-10-11 Thread pan2 . li
From: Pan Li This patch would like to support the FP lrint/lrintf auto vectorization. * long lrint (double) for rv64 * long lrintf (float) for rv32 Due to the limitation that only data types of the same size are allowed in the vectorizer, the standard name lrintmn2 only acts on DF => DI for rv64,

[PATCH v1] RISC-V: Support FP irintf auto vectorization

2023-10-11 Thread pan2 . li
From: Pan Li This patch would like to support the FP irintf auto vectorization. * int irintf (float) Due to the limitation that only data types of the same size are allowed in the vectorizer, the standard name lrintmn2 only acts on SF => SI. Given we have code like: void test_irintf (int *out, f

[PATCH v1] RISC-V: Support FP llrint auto vectorization

2023-10-11 Thread pan2 . li
From: Pan Li This patch would like to support the FP llrint auto vectorization. * long long llrint (double) This will be the CVT from DF => DI from the standard name's perspective, which has been covered in previous PATCH(es). Thus, this patch only adds some test cases. gcc/testsuite/ChangeLog:

[PATCH v1] RISC-V: Support FP lround/lroundf auto vectorization

2023-10-12 Thread pan2 . li
From: Pan Li This patch would like to support the FP lround/lroundf auto vectorization. * long lround (double) for rv64 * long lroundf (float) for rv32 Due to the limitation that only data types of the same size are allowed in the vectorizer, the standard name lroundmn2 only acts on DF => DI for r

[PATCH v1] RISC-V: Support FP lceil/lceilf auto vectorization

2023-10-12 Thread pan2 . li
From: Pan Li This patch would like to support the FP lceil/lceilf auto vectorization. * long lceil (double) for rv64 * long lceilf (float) for rv32 Due to the limitation that only data types of the same size are allowed in the vectorizer, the standard name lceilmn2 only acts on DF => DI for rv64,

[PATCH v1] RISC-V: Support FP lfloor/lfloorf auto vectorization

2023-10-12 Thread pan2 . li
From: Pan Li This patch would like to support the FP lfloor/lfloorf auto vectorization. * long lfloor (double) for rv64 * long lfloorf (float) for rv32 Due to the limitation that only data types of the same size are allowed in the vectorizer, the standard name lfloormn2 only acts on DF => DI for r

[PATCH v1] RISC-V: Leverage stdint-gcc.h for RVV test cases

2023-10-12 Thread pan2 . li
From: Pan Li Leverage stdint-gcc.h for the int64_t types instead of typedef. Otherwise we may conflict with stdint-gcc.h somewhere else. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: Include stdint-gcc.h for int types. * gcc.target/riscv/

[PATCH v1] RISC-V: Add test for FP iroundf auto vectorization

2023-10-12 Thread pan2 . li
From: Pan Li The below FP API are supported already by sharing the same standard name, as well as the machine mode. int iroundf (float); This patch would like to add the test cases for ensuring the correctness. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/math-iround-0

[PATCH v1] RISC-V: Add test for FP llround auto vectorization

2023-10-12 Thread pan2 . li
From: Pan Li The below FP API are supported already by sharing the same standard name, as well as the machine mode. long long llround (double); This patch would like to add the test cases for ensuring the correctness. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/math-l

[PATCH v1] RISC-V: Add test for FP llceil auto vectorization

2023-10-13 Thread pan2 . li
From: Pan Li The below FP API are supported already by sharing the same standard name, as well as the machine mode. long long llceil (double); This patch would like to add the test cases for ensuring the correctness. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/math-ll

[PATCH v1] RISC-V: Add test for FP iceil auto vectorization

2023-10-13 Thread pan2 . li
From: Pan Li The below FP API are supported already by sharing the same standard name, as well as the machine mode. int iceil (float); This patch would like to add the test cases for ensuring the correctness. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c:

[PATCH v1] RISC-V: Add test for FP ifloor auto vectorization

2023-10-13 Thread pan2 . li
From: Pan Li The below FP API are supported already by sharing the same standard name, as well as the machine mode. int ifloor (float); This patch would like to add the test cases for ensuring the correctness. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.

[PATCH v1] RISC-V: Add test for FP llfloor auto vectorization

2023-10-13 Thread pan2 . li
From: Pan Li The below FP API are supported already by sharing the same standard name, as well as the machine mode. long long llfloor (double); This patch would like to add the test cases for ensuring the correctness. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/math-l

[PATCH v1] RISC-V: Refine run test cases of math autovec

2023-10-13 Thread pan2 . li
From: Pan Li For the run test cases of math autovec, we need a reference value to check if the return value is expected or not. The previous patch leverage hardcode for the reference value but we can leverage the scalar math function instead. For example ceil after autovec. ASSERT (CEIL (Vector

[PATCH v1] RISC-V: Remove the type size restriction of vectorizer

2023-10-17 Thread pan2 . li
From: Pan Li The vectorizable_call has one restriction on the size of the data type. Aka DF to DI is allowed but SF to DI isn't. You may see the below message when trying to vectorize a function call like lrintf. void test_lrintf (long *out, float *in, unsigned count) { for (unsigned i = 0; i < count; i++)

[PATCH v1] RISC-V: Bugfix for merging undefined tmp register in math

2023-10-22 Thread pan2 . li
From: Pan Li For math function autovec, there will be one step like rtx tmp = gen_reg_rtx (vec_int_mode); emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode); The MU will leave the tmp (aka dest register) register unmasked elements unchanged and it is undefined here. This pat

[PATCH v1] RISC-V: Remove unnecessary asm check for rounding autovec

2023-10-22 Thread pan2 . li
From: Pan Li The vsetvl asm check is unnecessary for the rounding function autovec. These rounding test cases should focus on the rounding insn sequence. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/bswap16-0.c: Remove the vsetvl check. * gcc.target/riscv

[PATCH v1] RISC-V: Remove unnecessary asm check for binop constraint

2023-10-22 Thread pan2 . li
From: Pan Li The vsetvl asm check is unnecessary for the binop constraint. We should focus on the constraint and leave the vsetvl test to the vsetvl pass. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/binop_vv_constraint-1.c: Remove the vsetvl asm check from func body.

[PATCH v1] RISC-V: Bugfix for merging undef tmp register for trunc

2023-10-23 Thread pan2 . li
From: Pan Li For trunc function autovec, there will be one step like below take MU for the merge operand. rtx tmp = gen_reg_rtx (vec_int_mode); emit_vec_cvt_x_f_rtz (tmp, op_1, mask, vec_fp_mode); The MU will leave the tmp (aka dest register) register unmasked elements unchanged and it is undef

[PATCH v1] RISC-V: Remove unnecessary asm check for vec cvt

2023-10-23 Thread pan2 . li
From: Pan Li The vsetvl asm check is unnecessary for the vector convert. We should focus on the constraint and leave the vsetvl test to the vsetvl pass. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/cvt-0.c: Remove the vsetvl asm check from func body. * gcc

[PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-25 Thread pan2 . li
From: Pan Li Update in v2: * Fix one ICE of type assertion. * Adjust some test cases for aarch64 sve and riscv vector. Original log: The vectorizable_call has one restriction on the size of the data type. Aka DF to DI is allowed but SF to DI isn't. You may see the below message when trying to vectorize fu

[PATCH v1] RISC-V: Fix one range-loop-construct warning of avlprop

2023-10-28 Thread pan2 . li
From: Pan Li This patch would like to fix one warning of avlprop as below. ../../gcc/config/riscv/riscv-avlprop.cc: In member function 'virtual unsigned int pass_avlprop::execute(function*)': ../../gcc/config/riscv/riscv-avlprop.cc:346:23: error: loop variable 'candidate' creates a copy from typ

[PATCH v3] VECT: Refine the type size restriction of call vectorizer

2023-10-30 Thread pan2 . li
From: Pan Li Update in v3: * Add func to predicate type size is legal or not for vectorizer call. Update in v2: * Fix one ICE of type assertion. * Adjust some test cases for aarch64 sve and riscv vector. Original log: The vectorizable_call has one restriction on the size of the data type. Aka DF

[PATCH v4] VECT: Refine the type size restriction of call vectorizer

2023-10-31 Thread pan2 . li
From: Pan Li Update in v4: * Append the check to vectorizable_internal_function. Update in v3: * Add func to predicate type size is legal or not for vectorizer call. Update in v2: * Fix one ICE of type assertion. * Adjust some test cases for aarch64 sve and riscv vector. Original log: The

[PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720]

2023-11-01 Thread pan2 . li
From: Pan Li The extract_low_bits only try the scalar mode if the bitsize of the mode and src_mode is not equal. When vector mode is given from get_stored_val in DSE, it will always fail and return NULL_RTX. This patch would like to allow the vector mode in the extract_low_bits if and only if th

[PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

2023-11-02 Thread pan2 . li
From: Pan Li The previous rounding API starting with i/l/ll only works on the same mode types. For example as below, and we arrange the iterator similar to fcvt. * SF => SI * DF => DI After we refined this limitation in the middle-end, these API can also be vectorized with different type sizes, aka: *

[PATCH v2] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

2023-11-02 Thread pan2 . li
From: Pan Li Update in v2: * Add mode size equal check to disable different mode size when expand, because the underlying codegen is not implemented yet. Original log: The previous rounding API start with i/l/ll only works on the same mode types. For example as below, and we arrange the iter

[PATCH v1] RISC-V: Remove HF modes of FP to INT rounding autovec

2023-11-03 Thread pan2 . li
From: Pan Li The [i|l|ll][rint|round|ceil|floor] internal functions are defined as DEF_INTERNAL_FLT_FN instead of DEF_INTERNAL_FLT_FLOATN_FN. Then the *f16 (N=16 of FLOATN) format of these functions are not available when try to get the ifn from the given cfn in the vectorizable_call. Aka: BUILT

[PATCH v1] RISC-V: Support FP rint to i/l/ll diff size autovec

2023-11-05 Thread pan2 . li
From: Pan Li This patch would like to support the FP below API auto vectorization with different type size +-+---+--+ | API | RV64 | RV32 | +-+---+--+ | irint | DF => SI | DF => SI | | irintf | - | -| | lrint | -

[PATCH v1] RISC-V: Adjust FP rint round tests for RV32

2023-11-06 Thread pan2 . li
From: Pan Li The FP rint test cases for RV32 need some additional adjust for types and data. This patch would like to fix this which is missed in FP rint support PATCH for RV32 only by mistake. Please note the math-llrintf-run-0.c will trigger one ICE in the vsetvl pass in RV32 only. ./riscv32-

[PATCH v1] RISC-V: Support FP round to i/l/ll diff size autovec

2023-11-06 Thread pan2 . li
From: Pan Li This patch would like to support the FP below API auto vectorization with different type size +--+---+--+ | API | RV64 | RV32 | +--+---+--+ | iround | DF => SI | DF => SI | | iroundf | - | -| | lround

[PATCH v1] RISC-V: Cleanup unused code in riscv_v_adjust_bytesize [NFC]

2024-03-05 Thread pan2 . li
From: Pan Li Cleanup mode_size related code which is not used anymore. Below tests are passed for this patch. * The RVV full regression test. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_adjust_bytesize): Cleanup unused mode_size related code. Signed-off-by: Pan Li ---

[PATCH v1] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-05 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc attribute for RVV. This attribute is used to define fixed-length variants of an existing sizeless RVV type. This attribute is valid if and only if the mrvv-vector-bits=zvl, the only one args should be the integer constant and its val

[PATCH v2] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-05 Thread pan2 . li
From: Pan Li Update in v2: * Cleanup some unused code. * Fix some typo of commit log. Original log: This patch would like to introduce one new gcc attribute for RVV. This attribute is used to define fixed-length variants of an existing sizeless RVV type. This attribute is valid if and only i

[PATCH v1] VECT: Bugfix ICE for vectorizable_store when both len and mask

2024-03-07 Thread pan2 . li
From: Pan Li This patch would like to fix one ICE in vectorizable_store for both the loop_masks and loop_lens. The ICE looks like below with "-march=rv64gcv -O3". during GIMPLE pass: vect test.c: In function ‘d’: test.c:6:6: internal compiler error: in vectorizable_store, at tree-vect-stmts.cc:

[PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and store are enabled

2024-03-09 Thread pan2 . li
From: Pan Li This patch would like to fix one ICE in vectorizable_store when both the loop_masks and loop_lens are enabled. The ICE looks like below when build with "-march=rv64gcv -O3". during GIMPLE pass: vect test.c: In function ‘d’: test.c:6:6: internal compiler error: in vectorizable_store

[PATCH v3] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-11 Thread pan2 . li
From: Pan Li Update in v3: * Add pre-defined __riscv_v_fixed_vlen when zvl. Update in v2: * Cleanup some unused code. * Fix some typo of commit log. Original log: This patch would like to introduce one new gcc attribute for RVV. This attribute is used to define fixed-length variants of one exi

[PATCH v1] RISC-V: Fix some code style issue(s) in riscv-c.cc [NFC]

2024-03-12 Thread pan2 . li
From: Pan Li Notice some code style issue(s) when adding __riscv_v_fixed_vlen, including: * Meaningless empty line. * Line greater than 80 chars. * Indent with 3 space(s). * Argument misalignment. gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_ext_version_value): Fix code style greate

[PATCH v1] RISC-V: Bugfix ICE for __attribute__((target("arch=+v"))

2024-03-17 Thread pan2 . li
From: Pan Li This patch would like to fix one ICE for __attribute__((target("arch=+v")) and likewise extension(s). Given we have sample code as below: void __attribute__((target("arch=+v"))) test_2 (int *a, int *b, int *out, unsigned count) { unsigned i; for (i = 0; i < count; i++) out[i]

[PATCH v1] RISC-V: Bugfix function target attribute pollution

2024-03-19 Thread pan2 . li
From: Pan Li This patch depends on below ICE fix. https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647915.html The function target attribute should be on a per-function basis. For example, we have 3 function as below: void test_1 () {} void __attribute__((target("arch=+v"))) test_2 () {}

[PATCH v2] RISC-V: Bugfix ICE for __attribute__((target("arch=+v"))

2024-03-21 Thread pan2 . li
From: Pan Li This patch would like to fix one ICE for __attribute__((target("arch=+v")) and likewise extension(s). Given we have sample code as below: void __attribute__((target("arch=+v"))) test_2 (int *a, int *b, int *out, unsigned count) { unsigned i; for (i = 0; i < count; i++) out[i]

[PATCH v4] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-21 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc attribute for RVV. This attribute is used to define fixed-length variants of an existing sizeless RVV type. This attribute is valid if and only if the mrvv-vector-bits=zvl, the only one args should be the integer constant and its val

[PATCH v1] RISC-V: Allow RVV intrinsic when function target("arch=+v")

2024-03-25 Thread pan2 . li
From: Pan Li This patch would like to allow the RVV intrinsic when function is attributed as target("arch=+v") and build with rv64gc. For example: vint32m1_t __attribute__((target("arch=+v"))) test_1 (vint32m1_t a, vint32m1_t b, size_t vl) { return __riscv_vadd_vv_i32m1 (a, b, vl); } build w

[PATCH v1] RISC-V: Allow RVV intrinsic for more function target

2024-03-26 Thread pan2 . li
From: Pan Li Previously, we allowed the target(("arch=+v")) for a function with rv64gc build. This patch would like to support more arch options as below: * zve32x * zve32f * zve64x * zve64f * zve64d * zvfhmin * zvfh For example, we have sample code as below. vfloat32m1_t __attribute__((target

[PATCH] RISC-V: Fix misspelled term builtin in error message

2024-03-30 Thread pan2 . li
From: Pan Li This patch would like to fix below misspelled term in error message. ../../gcc/config/riscv/riscv-vector-builtins.cc:4592:16: error: misspelled term 'builtin function' in format; use 'built-in function' instead [-Werror=format-diag] 4592 | "builtin function %qE requi

[PATCH] RISC-V: Fix one unused variable in riscv_subset_list::parse

2024-03-30 Thread pan2 . li
From: Pan Li This patch would like to fix one unused variable as below: ../../gcc/common/config/riscv/riscv-common.cc: In static member function 'static riscv_subset_list* riscv_subset_list::parse(const char*, location_t)': ../../gcc/common/config/riscv/riscv-common.cc:1501:19: error: unused var

[PATCH v1] Internal-fn: Introduce new internal function SAT_ADD

2024-04-06 Thread pan2 . li
From: Pan Li This patch would like to add the middle-end presentation for the saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will have: * SAT_AD
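The branchless pattern quoted above can be tried out directly in C. The following is a minimal sketch for uint8_t, assuming TYPE is an unsigned integer type. The function name sat_add_u8 is hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Unsigned saturating add following the quoted pattern:
   SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
   with TYPE = uint8_t. The name sat_add_u8 is illustrative. */
uint8_t sat_add_u8(uint8_t x, uint8_t y)
{
    /* The sum wraps modulo 256 once it is narrowed back to uint8_t. */
    uint8_t sum = (uint8_t)(x + y);
    /* If the narrowed sum is smaller than x, the add overflowed.
       Negating the 0/1 comparison result gives a mask of 0x00 or 0xFF. */
    uint8_t mask = (uint8_t)(-(uint8_t)(sum < x));
    /* OR-ing with 0xFF forces the result to the maximum value. */
    return (uint8_t)(sum | mask);
}
```

For example, sat_add_u8 (200, 100) wraps to 44, the comparison 44 < 200 yields 1, the mask becomes 0xFF, and the result saturates to 255.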

[PATCH v2] Internal-fn: Introduce new internal function SAT_ADD

2024-04-07 Thread pan2 . li
From: Pan Li Update in v2: * Fix one failure for x86 bootstrap. Original log: This patch would like to add the middle-end presentation for the saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADD (x, y) => (x + y) | (-(TYPE)((T

[PATCH v1] RISC-V: Refine the error msg for RVV intrinsic required ext

2024-04-08 Thread pan2 . li
From: Pan Li The RVV intrinsic API has sorts of required extension from both the march or target attribute. It will have error message similar to below: built-in function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension However, it is not accurate as we have many additional sub extenst

[PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch

2024-04-10 Thread pan2 . li
From: Pan Li This patch would like to fix an ICE in mode sw for below example code. during RTL pass: mode_sw test.c: In function ‘vbool16_t j(vuint64m4_t)’: test.c:15:1: internal compiler error: in create_pre_exit, at mode-switching.cc:451 15 | } | ^ 0x3978f12 create_pre_exit __R

[PATCH v1] RISC-V: Remove -Wno-psabi for test build option [NFC]

2024-04-10 Thread pan2 . li
From: Pan Li Just notice there are some test case still have -Wno-psabi option, which is deprecated now. Remove them all for riscv test cases. The below test are passed for this patch. * The riscv rvv regression test. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/pr109244.C: Re

[PATCH v1] RISC-V: Bugfix for RVV overloaded intrinsic ICE when empty args

2024-02-06 Thread pan2 . li
From: Pan Li There is one corner case similar to the below example: void test (void) { __riscv_vfredosum_tu (); } It will meet ICE because of the implementation details of overloaded function in gcc. According to the rvv intrinsic doc, we have no such overloaded function with empty args. Unfortun

[PATCH v1] RISC-V: Bugfix for RVV overloaded intrinsic ICE in function checker

2024-02-07 Thread pan2 . li
From: Pan Li There is another corner case similar to the below example: void test (void) { __riscv_vaadd (); } We report an error for an overloaded function with empty args. For example: test.c: In function 'foo': test.c:8:3: error: no matching function call to '__riscv_vaadd' with empty args

[PATCH v1] RISC-V: Fix misspelled term args in error_at message

2024-02-10 Thread pan2 . li
From: Pan Li When build with "-Werror=format-diag", there will be one misspelled term args as below. This patch would like fix it by taking the term arguments instead. ../../gcc/config/riscv/riscv-vector-builtins.cc: In function 'tree_node* riscv_vector::resolve_overloaded_builtin(location_t, un

[PATCH v1] Internal-fn: Add new internal function SAT_ADDU

2024-02-17 Thread pan2 . li
From: Pan Li This patch would like to add the middle-end presentation for the unsigned saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADDU (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will have

[PATCH v1] RISC-V: Upgrade RVV intrinsic version to 0.12

2024-02-20 Thread pan2 . li
From: Pan Li Upgrade the version of RVV intrinsic from 0.11 to 0.12. PR target/114017 gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Upgrade the version to 0.12. gcc/testsuite/ChangeLog: * gcc.target/riscv/predef-__riscv_v_intrinsic.c: Upda

[PATCH v1] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-23 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc option for RVV. To appoint the bits size of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are: * 64 * 128 * 256 * 512 * 1024 * 2048 * 4096 * 8192 * 16384 * 32768 * 65536 * scalable * zvl 1. The scalable will be the d

[PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-02-24 Thread pan2 . li
From: Pan Li Hi Richard & Tamar, Try the DEF_INTERNAL_INT_EXT_FN as your suggestion. By mapping us_plus$a3 to the RTL representation (us_plus:m x y) in optabs.def. And then expand_US_PLUS in internal-fn.cc. Not very sure if my understanding is correct for DEF_INTERNAL_INT_EXT_FN. I am not sur

[PATCH v1] RTL: Bugfix ICE after allow vector type in DSE

2024-02-25 Thread pan2 . li
From: Pan Li Previously, we allowed vector type for get_stored_val when the read is less than or equal to the store. Unfortunately, we missed adjusting the validate_subreg part accordingly. For vector type, we don't need to restrict the mode size is greater than the vector register size. Thus, for exa

[PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-02-26 Thread pan2 . li
From: Pan Li Previously, we allowed vector type for get_stored_val when the read is less than or equal to the store. Unfortunately, we missed adjusting the validate_subreg part accordingly. When the vector type's size is less than vector register, it will be considered as invalid in the validate_subreg

[PATCH v2] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-27 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc option for RVV. To appoint the bits size of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are: * zvl The zvl will pick up the zvl*b from the march option. For example, the mrvv-vector-bits will be 1024 when march=rv6

[PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc option for RVV. To appoint the bits size of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are: * scalable * zvl The scalable will pick up the zvl*b in the march as the minimal vlen. For example, the minimal vlen will

[PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion

2024-01-10 Thread pan2 . li
From: Pan Li The insert_var_expansion_initialization depends on the HONOR_SIGNED_ZEROS to initialize the unrolling variables to +0.0f when -0.0f and no-signed-option. Unfortunately, we should always keep the -0.0f here because: * The -0.0f is always the correct initial value. * We need to suppo
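To see why -0.0f is the safe initial value for an extra accumulator in a floating-point sum, recall that under the default rounding mode x + (-0.0f) equals x for every x, while (-0.0f) + (+0.0f) gives +0.0f. The sketch below mimics a reduction split into two accumulators. It is a simplified scalar model, not the code from loop-unroll, and the names sum_expanded and is_negative_zero are hypothetical.

```c
#include <math.h>

/* Reduction split into two accumulators, as variable expansion would do
   when unrolling by two. extra_init is the initial value of the added
   accumulator. Illustrative model only. */
float sum_expanded(float start, const float *in, unsigned count,
                   float extra_init)
{
    float acc0 = start;
    float acc1 = extra_init;
    unsigned i = 0;
    for (; i + 1 < count; i += 2) {
        acc0 += in[i];
        acc1 += in[i + 1];
    }
    if (i < count)          /* leftover element for odd counts */
        acc0 += in[i];
    return acc0 + acc1;     /* combine the partial sums */
}

/* True only for negative zero. */
int is_negative_zero(float v)
{
    return v == 0.0f && signbit(v) != 0;
}
```

With start = -0.0f and inputs that are all -0.0f, the plain reduction yields -0.0f. The split version reproduces that only when extra_init is -0.0f; with +0.0f the sign of the result flips to positive.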

[PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion

2024-01-11 Thread pan2 . li
From: Pan Li The insert_var_expansion_initialization depends on the HONOR_SIGNED_ZEROS to initialize the unrolling variables to +0.0f when -0.0f and no-signed-option. Unfortunately, we should always keep the -0.0f here because: * The -0.0f is always the correct initial value. * We need to suppo

[PATCH v1] RISC-V: Update the comments of riscv_v_ext_mode_p [NFC]

2024-01-11 Thread pan2 . li
From: Pan Li gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_ext_mode_p): Update the comments of predicate func riscv_v_ext_mode_p. Signed-off-by: Pan Li --- gcc/config/riscv/riscv.cc | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/gcc/config/riscv/risc

[PATCH v1] RISC-V: Fix asm checks regression due to recent middle-end change

2024-01-17 Thread pan2 . li
From: Pan Li The recent middle-end change result in some asm check failures. This patch would like to fix the asm check by adjust the times. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/shift-1.c: Fix asm check count. * gcc.target/riscv/rvv/autovec/vls/shi

[PATCH v1] RISC-V: Bugfix for vls integer mode calling convention

2024-01-23 Thread pan2 . li
From: Pan Li According to the issue as below. https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/416 When the mode size of vls integer mode is less than 2 * XLEN, we will take the gpr/fpr for both the args and the return values. Instead of the reference. For example the below code: typ

[PATCH v2] RISC-V: Bugfix for vls mode aggregated in GPR calling convention

2024-01-30 Thread pan2 . li
From: Pan Li According to the issue as below. https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/416 When the mode size of vls integer mode is less than 2 * XLEN, we will take the gpr for both the args and the return values. Instead of the reference. For example the below code: typedef

[PATCH v1] RISC-V: Cleanup the comments for the psabi

2024-01-30 Thread pan2 . li
From: Pan Li This patch would like to cleanup some comments which are out of date or incorrect. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_get_arg_info): Cleanup comments. (riscv_pass_by_reference): Ditto. (riscv_fntype_abi): Ditto. Signed-off-by: Pan Li --- gcc/c

[PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread pan2 . li
From: Pan Li Refine the test cases for: * Name convention. * Add run case. PR target/112929 PR target/112988 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr112929.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr112929-1.c: ...here. * gcc.target

[PATCH v1] RISC-V: Fix POLY INT handle bug

2023-12-17 Thread pan2 . li
From: Pan Li This patch fixes the following FAIL: Running target riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8 FAIL: gcc.dg/vect/fast-math-vect-complex-3.c execution test The root cause is we generate incorrect codegen for (const_poly_int:DI [549755813888, 54

[PATCH v1] RISC-V: Bugfix for the RVV const vector

2023-12-17 Thread pan2 . li
From: Pan Li This patch would like to fix one bug of const vector for interleave. Assume we need to generate interleave const vector like below. V = {{4, -4, 3, -3, 2, -2, 1, -1,} Before this patch: vsetvl a3, zero, e64, m8, ta, ma vid.v v8v8 = {0, 1, 2, 3, 4} li a6

[PATCH v2] RISC-V: Bugfix for the RVV const vector

2023-12-17 Thread pan2 . li
From: Pan Li This patch would like to fix one bug of const vector for interleave. Assume we need to generate interleave const vector like below. V = {{4, -4, 3, -3, 2, -2, 1, -1,} Before this patch: vsetvl a3, zero, e64, m8, ta, ma vid.v v8v8 = {0, 1, 2, 3, 4} li a6

[PATCH v1] RISC-V: Bugfix for the const vector in single steps

2023-12-19 Thread pan2 . li
From: Pan Li For generating the const vector with single step, we have code gen similar as below. We have npatterns = 4. v1 = {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8... } v2 (diff) = {3 - 0, 2 - 1, 1 - 2, 0 - 3, 7 - 4, 6 - 5, 5 - 6, 4 - 7...} = {3, 1, -1, -3, 3, 1, -1, -3 ...} v1 = vd +
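The arithmetic in this log can be checked with a few lines of scalar C: take the index vector {0, 1, 2, ...} and add a difference pattern that repeats every npatterns = 4 elements. The sketch assumes the fourth difference is 0 - 3 = -3, and the function name build_single_step is hypothetical. It models the arithmetic only, not the RVV instruction sequence.

```c
#include <stddef.h>

/* Build {3, 2, 1, 0, 7, 6, 5, 4, ...} as index + repeating difference.
   diff[k] = v1[k] - k for the first npatterns = 4 elements. */
void build_single_step(int *out, size_t n)
{
    static const int diff[4] = {3, 1, -1, -3};
    for (size_t i = 0; i < n; i++)
        out[i] = (int)i + diff[i % 4];
}
```

For n = 8 this produces 3, 2, 1, 0, 7, 6, 5, 4, which matches the first eight elements of v1 above.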

[PATCH v2] RISC-V: Bugfix for the const vector in single steps

2023-12-19 Thread pan2 . li
From: Pan Li This patch would like to fix the below execution failure. FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test There will be one single step const vector like { -4, 4, -3, 5, -2, 6, -1, 7, ...}. For such const vector generation with single step, we will generate vid +

[PATCH v3] RISC-V: Bugfix for the const vector in single steps

2023-12-20 Thread pan2 . li
From: Pan Li This patch would like to fix the below execution failure when build with "-march=rv64gcv_zvl512b -mabi=lp64d -mcmodel=medlow --param=riscv-autovec-lmul=m8 -ftree-vectorize -fno-vect-cost-model -O3" FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test There will be one

[PATCH v1] RISC-V: XFail the signbit-5 run test for RVV

2023-12-20 Thread pan2 . li
From: Pan Li This patch would like to XFail the signbit-5 run test case for the RVV. Given the case has one limitation like "This test does not work when the truth type does not match vector type." in the beginning of the test file. Aka, the RVV vector truth type is not integer type. The targe

[PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-23 Thread pan2 . li
From: Pan Li This patch would like to XFAIL the test case pr30957-1.c for the RVV when building the ELF with some configurations (listed at the end of the log). It will be vectorized during vect_transform_loop with a variable factor. It won't benefit from unrolling/peeling and mark the loop->unroll as

[PATCH v2] RISC-V: XFail the signbit-5 run test for RVV

2023-12-23 Thread pan2 . li
From: Pan Li This patch would like to XFail the signbit-5 run test case for the RVV. The case has one limitation, stated at the beginning of the test file: "This test does not work when the truth type does not match vector type." That is, the RVV vector truth type is not an integer type. The targe

[PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-26 Thread pan2 . li
From: Pan Li This patch would like to XFAIL the test case pr30957-1.c for the RVV when building the ELF with some configurations (listed at the end of the log). It will be vectorized during vect_transform_loop with a variable factor. It won't benefit from unrolling/peeling and mark the loop->unroll as

[PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option

2024-01-02 Thread pan2 . li
From: Pan Li According to the semantics of the no-signed-zeros option, a backend like RISC-V should treat the minus zero -0.0f as plus zero 0.0f. Consider the example below with option -fno-signed-zeros. void test (float *a) { *a = -0.0; } We will generate code as below, which doesn't treat the min
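
As background for the snippet above, the two zeros differ only in the sign bit. The sketch below is not from the patch; it assumes float is IEEE 754 binary32 and is compiled without -fno-signed-zeros, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float (assumes a 32-bit IEEE 754 float).  */
uint32_t
float_bits (float f)
{
  uint32_t bits;
  assert (sizeof (float) == sizeof (uint32_t));
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* The two zeros compare equal as values even though their bits differ.  */
int
zeros_compare_equal (void)
{
  return -0.0f == 0.0f;
}
```

Because the values compare equal, an option that tells the compiler to ignore the sign of zero lets it store the all-zero bit pattern for either constant.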

[PATCH v2] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-11-30 Thread pan2 . li
From: Pan Li If we want to extract a 64-bit value but ELEN < 64, we use an RVV vector mode with EEW = 32 to extract the highpart and lowpart. However, this approach doesn't honor DFmode in the movdf pattern under ZVE32f, and of course results in an ICE with zve32f. This patch would like to reuse the approach

[PATCH v3] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-12-01 Thread pan2 . li
From: Pan Li If we want to extract a 64-bit value but ELEN < 64, we use an RVV vector mode with EEW = 32 to extract the highpart and lowpart. However, this approach doesn't honor DFmode in the movdf pattern under ZVE32f, and of course results in an ICE with zve32f. This patch would like to reuse the approach

[PATCH v4] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-12-01 Thread pan2 . li
From: Pan Li If we want to extract a 64-bit value but ELEN < 64, we use an RVV vector mode with EEW = 32 to extract the highpart and lowpart. However, this approach doesn't honor DFmode in the movdf pattern under ZVE32f, and of course results in an ICE with zve32f. This patch would like to reuse the approach
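
The highpart/lowpart idea in the snippet above can be illustrated with a scalar sketch in plain C. It is not the backend code: it only shows a 64-bit double viewed as two 32-bit halves, assuming an IEEE 754 binary64 double. `split_double` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model: view a double as 64 raw bits and pull out
   the two 32-bit halves, as a 32-bit-element extract would see them.  */
void
split_double (double d, uint32_t *lowpart, uint32_t *highpart)
{
  uint64_t bits;
  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&bits, &d, sizeof bits);
  *lowpart = (uint32_t) bits;          /* bits 0..31 */
  *highpart = (uint32_t) (bits >> 32); /* bits 32..63 */
}
```

For example, 1.0 has the bit pattern 0x3FF0000000000000, so its lowpart is 0 and its highpart is 0x3FF00000.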

[PATCH v1] RISC-V: Add test case for bug PR112813

2023-12-04 Thread pan2 . li
From: Pan Li The bugzilla 112813 has been fixed recently, add below test case for the bug. PR target/112813 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr112813-1.c: New test. Signed-off-by: Pan Li --- .../gcc.target/riscv/rvv/vsetvl/pr112813-1.c | 32 +++

[PATCH v1] RISC-V: Fix ICE for incorrect mode attr in V_F2DI_CONVERT_BRIDGE

2023-12-08 Thread pan2 . li
From: Pan Li The mode attr V_F2DI_CONVERT_BRIDGE converts the floating-point mode to the widened floating-point mode by design. But we take (RVVM1HF "RVVM2SI") by mistake. This patch would like to fix it by replacing (RVVM1HF "RVVM2SI") with (RVVM1HF "RVVM2SF") as designed. gcc/ChangeLog: *

[PATCH v1] RISC-V: Disable RVV VCOMPRESS avl propagation

2023-12-12 Thread pan2 . li
From: Pan Li This patch would like to disable the avl propagation for the following reasons. According to the ISA, the first vl elements of vector register group vs2 should be extracted and packed for vcompress. And the highest element of vs2 vector may be touched by the mask, which may be elimina
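
A scalar model may help show why vl matters for vcompress. The sketch below is an illustration in plain C, not code from the patch; `vcompress_model` is a hypothetical name, the mask is modeled as one byte per element, and tail elements of vd are simply left untouched in this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of vcompress: among the first vl elements of vs2, copy
   those whose mask entry is nonzero into consecutive slots of vd.
   Returns the number of elements packed.  */
size_t
vcompress_model (int64_t *vd, const int64_t *vs2, const uint8_t *mask,
                 size_t vl)
{
  size_t packed = 0;
  assert (vd != NULL && vs2 != NULL && mask != NULL);
  for (size_t i = 0; i < vl; i++)
    if (mask[i])
      vd[packed++] = vs2[i];
  return packed;
}
```

In this model, calling the function with a smaller vl drops any selected element at the top of vs2, which is the kind of change a shortened AVL would introduce.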

[PATCH v1] RISC-V: Support FP ceil to i/l/ll diff size autovec

2023-11-06 Thread pan2 . li
From: Pan Li This patch would like to support auto-vectorization of the FP APIs below with different type sizes. +-+---+--+ | API | RV64 | RV32 | +-+---+--+ | iceil | DF => SI | DF => SI | | iceilf | - | -| | lceil | -

[PATCH v1] RISC-V: Support FP floor to i/l/ll diff size autovec

2023-11-07 Thread pan2 . li
From: Pan Li This patch would like to support auto-vectorization of the FP APIs below with different type sizes. +--+---+--+ | API | RV64 | RV32 | +--+---+--+ | ifloor | DF => SI | DF => SI | | ifloorf | - | -| | lfloor

[PATCH v2] DSE: Allow vector type for get_stored_val when read < store

2023-11-08 Thread pan2 . li
From: Pan Li Update in v2: * Move vector type support to get_stored_val. Original log: This patch would like to allow the vector mode in the get_stored_val in the DSE. It is valid for the read rtx if and only if the read bitsize is less than the stored bitsize. Given below example code with --

[PATCH v1] RISC-V: Refine frm emit after bb end in succ edges

2023-11-08 Thread pan2 . li
From: Pan Li This patch would like to refine the frm insn emit when we meet an abnormal edge in the loop. Conceptually, we only need to emit once when abnormal instead of in every iteration of the loop. This patch would like to fix this defect and only perform insert_insn_end_basic_block when at least o

[PATCH v1] Internal-fn: Add FLOATN support for l/ll round and rint [PR/112432]

2023-11-09 Thread pan2 . li
From: Pan Li The defined DEF_EXT_LIB_FLOATN_NX_BUILTINS functions should also have DEF_INTERNAL_FLT_FLOATN_FN instead of DEF_INTERNAL_FLT_FN for the FLOATN support. According to the glibc API and gcc builtin, the table below shows whether FLOATN is supported. +-+---+

[PATCH v1] RISC-V: Add HFmode for l/ll round and rint autovec

2023-11-10 Thread pan2 . li
From: Pan Li The internal-fn already supports FLOATN. This patch would like to re-enable the vector HFmode for autovec for the standard name mode iterators below. 1. lrint 2. llround For now the vector HFmodes are disabled to limit the impact, and the underlying FP16 rint/round autovec w
