Re: [PATCH] RISC-V: Refactor and cleanup vsetvl pass

2023-10-16 Thread Lehua Ding

OK, I'll split it as Juzhe suggested. Thanks.

On 2023/10/17 6:24, 钟居哲 wrote:

Yeah. The refactoring and renaming make the patch diff very messy, and it is
not easy to read.

So I suggest splitting this patch into the following sub-patches (each
sub-patch does not need to be compilable on its own):


1. Refactor and clean up data structure layout
(The difference should be removing the original
avl_info/vl_vtype/vector_insn_info and adding the new vector_insn_info).


2. Refactor compatible/fusion/available rule: The difference should be 
mostly on riscv-vector.def


3. Refactor and simplify local analysis (phase 1 and phase 2) into a
single phase, since two rounds of local analysis (backward and forward)
are redundant.


4. Introduce new LCM helper function (compute_reaching_defintion)

5. Split earliest fusion into 2 phases, which makes the code easier to
maintain and read.


6. Remove all post optimizations

7. Adapt and robustify testcases.


juzhe.zh...@rivai.ai

*From:* Kito Cheng <mailto:kito.ch...@gmail.com>
*Date:* 2023-10-16 23:38
*To:* Lehua Ding <mailto:lehua.d...@rivai.ai>
*CC:* GCC Patches <mailto:gcc-patches@gcc.gnu.org>; 钟居哲
<mailto:juzhe.zh...@rivai.ai>; Robin Dapp
<mailto:rdapp@gmail.com>; Palmer Dabbelt
<mailto:pal...@rivosinc.com>; Jeff Law <mailto:jeffreya...@gmail.com>
*Subject:* Re: [PATCH] RISC-V: Refactor and cleanup vsetvl pass
It's impossible to review; please split it into multiple small patches if
possible...

On Mon, 2023-10-16 07:54, Lehua Ding <mailto:lehua.d...@rivai.ai> wrote:


This patch refactors and cleans up the vsetvl pass in order to make the code
easier to modify and understand. This patch does several things:

1. Introduce a virtual CFG for vsetvl infos; Phases 1, 2 and 3 only maintain
    and modify this virtual CFG. Phase 4 performs insertion, modification and
    deletion of vsetvl insns based on the virtual CFG. The basic block in the
    virtual CFG is called vsetvl_block_info and the vsetvl information inside
    is called vsetvl_info.
2. Combine Phases 1 and 2 into a single Phase 1 and unify the demand system;
    this phase only fuses local vsetvl infos in the forward direction.
3. Refactor Phase 3: unify the logic that decides whether to uplift a vsetvl
    info to a pred basic block. The info is uplifted when one of the vsetvl
    definitions reaching it is compatible with it.
4. Place all modifications to the RTL in Phase 4 and Phase 5.
    Phase 4 is responsible for inserting, modifying and deleting vsetvl
    instructions based on the fully optimized vsetvl infos. Phase 5 removes
    the avl operand from the RVV instructions and removes the unused dest
    operand register from the vsetvl insns.

These modifications required some testcases to be updated. The reasons for
updating are summarized below:

1. more optimized
    vlmax_back_prop-25.c/vlmax_back_prop-26.c/vlmax_conflict-3.c/vsetvl-13.c/
    vsetvl-23.c/
    avl_single-21.c/avl_single-23.c/avl_single-67.c/avl_single-68.c/
    avl_single-71.c/avl_single-89.c/avl_single-93.c/avl_single-95.c/
    avl_single-96.c
2. less unnecessary fusion
    avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/pr109773-1.c
3. local fuse direction (backward -> forward)
    scalar_move-1.c/
4. add some bugfix testcases.
    pr111037-3.c/pr111037-4.c
    avl_single-89.c

         PR target/111037
         PR target/111234
         PR target/111725

gcc/ChangeLog:

         * config/riscv/riscv-vsetvl.cc
         (bitmap_union_of_preds_with_entry): New helper function.
         (debug): Removed.
         (compute_reaching_defintion): Compute reaching definition data.
         (enum vsetvl_type): Unchanged.
         (vlmax_avl_p): Unchanged.
         (enum emit_type): Unchanged.
         (vlmul_to_str): Unchanged.
         (vlmax_avl_insn_p): Removed.
         (policy_to_str): Unchanged.
         (loop_basic_block_p): Removed.
         (valid_sew_p): Removed.
         (vsetvl_insn_p): Unchanged.
         (vsetvl_vtype_change_only_p): Removed.
         (after_or_same_p): Removed.
         (before_p): Removed.
         (anticipatable_occurrence_p): Removed.
         (available_occurrence_p): Removed.
         (insn_should_be_added_p): Unchanged.
         (get_all_sets): Unchanged.
         (get_s

Re: [PATCH 2/2] RISC-V: Add assert of the number of vmerge in autovec cond testcases

2023-10-16 Thread Lehua Ding

Hi Jeff,

Can you replace riscv_vector with riscv_v?  That way this will still 
work after Joern commits his change to standardize on the riscv_v target 
selector.


OK with that change, no need to wait for a review on V2, just go ahead 
and blast it in.


No problem, I'll tweak it later and submit it. Thanks.

--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai



[PATCH] RISC-V: Fix failed testcase when use -cmodel=medany

2023-10-17 Thread Lehua Ding
This little patch fixes a testcase that fails when -mcmodel=medany is used.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/cpymem-1.c: Split check.

---
 gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c
index 9bb4904e8e9..549d6648104 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c
@@ -50,7 +50,7 @@ void f2 (__INT32_TYPE__* a, __INT32_TYPE__* b, int l)
Use extern here so that we get a known alignment, lest
DATA_ALIGNMENT force us to make the scan pattern accomodate
code for different alignments depending on word size.
-** f3:
+** f3: { target { any-opts "-mcmodel=medlow" } }
 **lui\s+[ta][0-7],%hi\(a_a\)
 **lui\s+[ta][0-7],%hi\(a_b\)
 **addi\s+a4,[ta][0-7],%lo\(a_b\)
@@ -61,6 +61,16 @@ void f2 (__INT32_TYPE__* a, __INT32_TYPE__* b, int l)
 **ret
 */

+/*
+** f3: { target { any-opts "-mcmodel=medany" } }
+**lla\s+[ta][0-7],a_b
+**vsetivli\s+zero,16,e32,m4,ta,ma
+**vle32.v\s+v\d+,0\([ta][0-7]\)
+**lla\s+[ta][0-7],a_a
+**vse32\.v\s+v\d+,0\([ta][0-7]\)
+**ret
+*/
+
 extern struct { __INT32_TYPE__ a[16]; } a_a, a_b;

 void f3 ()
--
2.36.3



Re: [PATCH] RISC-V: Fix failed testcase when use -cmodel=medany

2023-10-17 Thread Lehua Ding

Committed, thanks Juzhe.

On 2023/10/17 17:58, juzhe.zh...@rivai.ai wrote:

OK


juzhe.zh...@rivai.ai

*From:* Lehua Ding <mailto:lehua.d...@rivai.ai>
*Date:* 2023-10-17 17:57
*To:* gcc-patches <mailto:gcc-patches@gcc.gnu.org>
*CC:* juzhe.zhong <mailto:juzhe.zh...@rivai.ai>; kito.cheng
<mailto:kito.ch...@gmail.com>; rdapp.gcc
<mailto:rdapp@gmail.com>; palmer <mailto:pal...@rivosinc.com>;
jeffreyalaw <mailto:jeffreya...@gmail.com>; lehua.ding
<mailto:lehua.d...@rivai.ai>
*Subject:* [PATCH] RISC-V: Fix failed testcase when use -cmodel=medany



--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai



[PATCH V2 00/14] Refactor and cleanup vsetvl pass

2023-10-17 Thread Lehua Ding
This patch refactors and cleans up the vsetvl pass in order to make the code
easier to modify and understand. This patch does several things:

1. Introduce a virtual CFG for vsetvl infos; Phases 1, 2 and 3 only maintain
   and modify this virtual CFG. Phase 4 performs insertion, modification and
   deletion of vsetvl insns based on the virtual CFG. The basic block in the
   virtual CFG is called vsetvl_block_info and the vsetvl information inside
   is called vsetvl_info.
2. Combine Phases 1 and 2 into a single Phase 1 and unify the demand system;
   this phase only fuses local vsetvl infos in the forward direction.
3. Refactor Phase 3: unify the logic that decides whether to uplift a vsetvl
   info to a pred basic block. The info is uplifted when one of the vsetvl
   definitions reaching it is compatible with it.
4. Place all modifications to the RTL in Phase 4 and Phase 5.
   Phase 4 is responsible for inserting, modifying and deleting vsetvl
   instructions based on the fully optimized vsetvl infos. Phase 5 removes
   the avl operand from the RVV instructions and removes the unused dest
   operand register from the vsetvl insns.
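
For readers less familiar with the analysis behind item 3, here is a minimal
sketch of the textbook reaching-definitions data flow. It is not the code of
compute_reaching_defintion from this series: toy_block, def_set and
toy_compute_reaching_definition are made-up names, and std::bitset is only an
assumed stand-in for whatever bitmap type the pass uses.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Assumed upper bound on the number of tracked definitions (illustrative).
constexpr std::size_t MAX_DEFS = 32;
using def_set = std::bitset<MAX_DEFS>;

// Hypothetical per-block summary for the textbook analysis.
struct toy_block
{
  std::vector<std::size_t> preds; // indices of predecessor blocks
  def_set gen;                    // definitions created in this block
  def_set kill;                   // definitions overwritten by this block
  def_set in;                     // definitions reaching the block entry
  def_set out;                    // definitions reaching the block exit
};

// Iterate  in[b]  = union of out[p] over all predecessors p
//          out[b] = gen[b] | (in[b] & ~kill[b])
// until no set changes any more.
void
toy_compute_reaching_definition (std::vector<toy_block> &blocks)
{
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (toy_block &bb : blocks)
        {
          def_set in;
          for (std::size_t p : bb.preds)
            in |= blocks[p].out;
          def_set out = bb.gen | (in & ~bb.kill);
          if (in != bb.in || out != bb.out)
            {
              bb.in = in;
              bb.out = out;
              changed = true;
            }
        }
    }
}
```

As I read item 3, Phase 3 would then inspect the definitions reaching a block
and look for one that is compatible with the block's own vsetvl info before
uplifting; that compatibility check is deliberately left out of the sketch.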

These modifications required some testcases to be updated. The reasons for
updating are summarized below:

1. more optimized
   vlmax_back_prop-25.c/vlmax_back_prop-26.c/vlmax_conflict-3.c/
   vlmax_conflict-12.c/vsetvl-13.c/vsetvl-23.c/
   avl_single-23.c/avl_single-89.c/avl_single-95.c/pr109773-1.c
2. less unnecessary fusion
   avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/vsetvl-18.c
3. local fuse direction (backward -> forward)
   scalar_move-1.c/
4. add some bugfix testcases.
   pr111037-3.c/pr111037-4.c
   avl_single-89.c

PR target/111037
PR target/111234
PR target/111725


Lehua Ding (14):
  RISC-V: P1: Refactor avl_info/vl_vtype_info/vector_insn_info
  RISC-V: P2: Refactor and cleanup demand system
  RISC-V: P3: Refactor class vector_infos_manager to pre_vsetvl
  RISC-V: P4: move method from class pass_vsetvl to pre_vsetvl
  RISC-V: P5: combine phase 1 and 2 into a single phase
  RISC-V: P6: Add compute reaching definition data flow
  RISC-V: P7: Move earliest fuse and lcm code to pre_vsetvl class
  RISC-V: P8: Unified insert and delete of vsetvl insn into Phase 4
  RISC-V: P9: Cleanup post optimize phase
  RISC-V: P10: Cleanup helper functions
  RISC-V: P11:  Refactor vector_block_info to vsetvl_block_info class
  RISC-V: P12: Delete riscv-vsetvl.h
  RISC-V: P13:  Reorganize functions used to modify RTL
  RISC-V: P14: Adjust and add testcases

 gcc/config/riscv/riscv-vsetvl.cc  | 6530 +++--
 gcc/config/riscv/riscv-vsetvl.def |  634 +-
 gcc/config/riscv/riscv-vsetvl.h   |  488 --
 gcc/config/riscv/t-riscv  |2 +-
 .../gcc.target/riscv/rvv/base/pr111037-2.c|8 -
 .../gcc.target/riscv/rvv/base/scalar_move-1.c |2 +-
 .../riscv/rvv/vsetvl/avl_single-104.c |   35 +
 .../riscv/rvv/vsetvl/avl_single-105.c |   23 +
 .../riscv/rvv/vsetvl/avl_single-23.c  |7 +-
 .../riscv/rvv/vsetvl/avl_single-46.c  |3 +-
 .../riscv/rvv/vsetvl/avl_single-89.c  |8 +-
 .../riscv/rvv/vsetvl/avl_single-95.c  |2 +-
 .../riscv/rvv/vsetvl/imm_bb_prop-1.c  |7 +-
 .../gcc.target/riscv/rvv/vsetvl/pr109743-2.c  |2 +-
 .../gcc.target/riscv/rvv/vsetvl/pr109773-1.c  |2 +-
 .../gcc.target/riscv/rvv/vsetvl/pr111037-3.c  |   16 +
 .../pr111037-1.c => vsetvl/pr111037-4.c}  |5 +-
 .../riscv/rvv/vsetvl/vlmax_back_prop-25.c |   10 +-
 .../riscv/rvv/vsetvl/vlmax_back_prop-26.c |   10 +-
 .../riscv/rvv/vsetvl/vlmax_conflict-12.c  |1 -
 .../riscv/rvv/vsetvl/vlmax_conflict-3.c   |2 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-13.c   |4 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-18.c   |4 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-23.c   |2 +-
 24 files changed, 3084 insertions(+), 4723 deletions(-)
 delete mode 100644 gcc/config/riscv/riscv-vsetvl.h
 delete mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111037-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-104.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-105.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-3.c
 rename gcc/testsuite/gcc.target/riscv/rvv/{base/pr111037-1.c => vsetvl/pr111037-4.c} (74%)

--
2.36.3



[PATCH V2 01/14] RISC-V: P1: Refactor avl_info/vl_vtype_info/vector_insn_info

2023-10-17 Thread Lehua Ding
This sub-patch combines avl_info/vl_vtype_info/vector_insn_info into
a single class vsetvl_info.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (avl_info::avl_info): Removed.
(avl_info::single_source_equal_p): Ditto.
(avl_info::multiple_source_equal_p): Ditto.
(avl_info::operator=): Ditto.
(avl_info::operator==): Ditto.
(avl_info::operator!=): Ditto.
(avl_info::has_non_zero_avl): Ditto.
(vl_vtype_info::vl_vtype_info): Ditto.
(vl_vtype_info::operator==): Ditto.
(vl_vtype_info::operator!=): Ditto.
(vl_vtype_info::same_avl_p): Ditto.
(vl_vtype_info::same_vtype_p): Ditto.
(enum demand_flags): New enum.
(vl_vtype_info::same_vlmax_p): Removed.
(vector_insn_info::operator>=): Ditto.
(enum class): New demand types.
(vector_insn_info::operator==): Ditto.
(vector_insn_info::parse_insn): Ditto.
(class vsetvl_info): New class.
(vector_insn_info::compatible_p): Removed.
(vector_insn_info::skip_avl_compatible_p): Ditto.
(vector_insn_info::compatible_avl_p): Ditto.
(vector_insn_info::compatible_vtype_p): Ditto.
(vector_insn_info::available_p): Ditto.
(vector_insn_info::fuse_avl): Ditto.
(vector_insn_info::fuse_sew_lmul): Ditto.
(vector_insn_info::fuse_tail_policy): Ditto.
(vector_insn_info::fuse_mask_policy): Ditto.
(vector_insn_info::local_merge): Ditto.
(vector_insn_info::global_merge): Ditto.
(vector_insn_info::get_avl_or_vl_reg): Ditto.
(vector_insn_info::update_fault_first_load_avl):  Ditto.
(vlmul_to_str): Ditto.
(policy_to_str): Ditto.
(vector_insn_info::dump): Ditto.
* config/riscv/riscv-vsetvl.h (class avl_info): Ditto.
(struct vl_vtype_info): Ditto.
(class vector_insn_info): Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 1315 --
 gcc/config/riscv/riscv-vsetvl.h  |  261 --
 2 files changed, 515 insertions(+), 1061 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 4b06d93e7f9..79ba8466556 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1581,827 +1581,542 @@ vsetvl_dominated_by_p (const basic_block cfg_bb,
   return true;
 }

-avl_info::avl_info (const avl_info &other)
-{
-  m_value = other.get_value ();
-  m_source = other.get_source ();
-}
-
-avl_info::avl_info (rtx value_in, set_info *source_in)
-  : m_value (value_in), m_source (source_in)
-{}
-
-bool
-avl_info::single_source_equal_p (const avl_info &other) const
-{
-  set_info *set1 = m_source;
-  set_info *set2 = other.get_source ();
-  insn_info *insn1 = extract_single_source (set1);
-  insn_info *insn2 = extract_single_source (set2);
-  if (!insn1 || !insn2)
-return false;
-  return source_equal_p (insn1, insn2);
-}
-
-bool
-avl_info::multiple_source_equal_p (const avl_info &other) const
-{
-  /* When the def info is same in RTL_SSA namespace, it's safe
- to consider they are avl compatible.  */
-  if (m_source == other.get_source ())
-return true;
-
-  /* We only consider handle PHI node.  */
-  if (!m_source->insn ()->is_phi () || !other.get_source ()->insn ()->is_phi ())
-    return false;
-
-  phi_info *phi1 = as_a<phi_info *> (m_source);
-  phi_info *phi2 = as_a<phi_info *> (other.get_source ());
-
-  if (phi1->is_degenerate () && phi2->is_degenerate ())
-{
-  /* Degenerate PHI means the PHI node only have one input.  */
-
-  /* If both PHI nodes have the same single input in use list.
-We consider they are AVL compatible.  */
-  if (phi1->input_value (0) == phi2->input_value (0))
-   return true;
-}
-  /* TODO: We can support more optimization cases in the future.  */
-  return false;
-}
-
-avl_info &
-avl_info::operator= (const avl_info &other)
-{
-  m_value = other.get_value ();
-  m_source = other.get_source ();
-  return *this;
-}
-
-bool
-avl_info::operator== (const avl_info &other) const
-{
-  if (!m_value)
-return !other.get_value ();
-  if (!other.get_value ())
-return false;
-
-  if (GET_CODE (m_value) != GET_CODE (other.get_value ()))
-return false;
-
-  /* Handle CONST_INT AVL.  */
-  if (CONST_INT_P (m_value))
-return INTVAL (m_value) == INTVAL (other.get_value ());
-
-  /* Handle VLMAX AVL.  */
-  if (vlmax_avl_p (m_value))
-return vlmax_avl_p (other.get_value ());
-  if (vlmax_avl_p (other.get_value ()))
-return false;
-
-  /* If any source is undef value, we think they are not equal.  */
-  if (!m_source || !other.get_source ())
-return false;
-
-  /* If both sources are single source (defined by a single real RTL)
- and their definitions are same.  */
-  if (single_source_equal_p (other))
-return true;
-
-  return multiple_source_equal_p (other);
-}
-
-bool
-avl_info::operator!= (const avl_info &other) const
-{
-  return !(*this == other);
-}
-
-bool
-avl_info

[PATCH V2 03/14] RISC-V: P3: Refactor vector_infos_manager

2023-10-17 Thread Lehua Ding
This sub-patch refactors vector_infos_manager into a pre_vsetvl class,
which is responsible for the entire lazy vsetvl job. There is no need
for a separate vsetvl infos manager, because the vsetvl infos are
modified by the optimization code itself.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc 
(vector_infos_manager::vector_infos_manager): Removed.
(class pre_vsetvl): New class.
(vector_infos_manager::create_expr): Removed.
(vector_infos_manager::get_expr_id): Removed.
(vector_infos_manager::all_same_ratio_p): Removed.
(vector_infos_manager::all_avail_in_compatible_p): Removed.
(vector_infos_manager::all_same_avl_p): Removed.
(vector_infos_manager::expr_set_num): Removed.
(vector_infos_manager::release): Removed.
(vector_infos_manager::create_bitmap_vectors): Removed.
(vector_infos_manager::free_bitmap_vectors): Removed.
(vector_infos_manager::dump): Removed.
* config/riscv/riscv-vsetvl.h (class vector_infos_manager): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc | 632 +--
 gcc/config/riscv/riscv-vsetvl.h  |  75 
 2 files changed, 257 insertions(+), 450 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index be40b6fdf4c..c219ad178bb 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2390,402 +2390,284 @@ public:
   }
 };

-vector_infos_manager::vector_infos_manager ()
+class pre_vsetvl
 {
-  vector_edge_list = nullptr;
-  vector_kill = nullptr;
-  vector_del = nullptr;
-  vector_insert = nullptr;
-  vector_antic = nullptr;
-  vector_transp = nullptr;
-  vector_comp = nullptr;
-  vector_avin = nullptr;
-  vector_avout = nullptr;
-  vector_antin = nullptr;
-  vector_antout = nullptr;
-  vector_earliest = nullptr;
-  vector_insn_infos.safe_grow_cleared (get_max_uid ());
-  vector_block_infos.safe_grow_cleared (last_basic_block_for_fn (cfun));
-  if (!optimize)
-{
-  basic_block cfg_bb;
-  rtx_insn *rinsn;
-  FOR_ALL_BB_FN (cfg_bb, cfun)
-   {
- vector_block_infos[cfg_bb->index].local_dem = vector_insn_info ();
- vector_block_infos[cfg_bb->index].reaching_out = vector_insn_info ();
- FOR_BB_INSNS (cfg_bb, rinsn)
-   vector_insn_infos[INSN_UID (rinsn)].parse_insn (rinsn);
-   }
-}
-  else
-{
-  for (const bb_info *bb : crtl->ssa->bbs ())
-   {
- vector_block_infos[bb->index ()].local_dem = vector_insn_info ();
- vector_block_infos[bb->index ()].reaching_out = vector_insn_info ();
- for (insn_info *insn : bb->real_insns ())
-   vector_insn_infos[insn->uid ()].parse_insn (insn);
- vector_block_infos[bb->index ()].probability = profile_probability ();
-   }
-}
-}
-
-void
-vector_infos_manager::create_expr (vector_insn_info &info)
-{
-  for (size_t i = 0; i < vector_exprs.length (); i++)
-if (*vector_exprs[i] == info)
-  return;
-  vector_exprs.safe_push (&info);
-}
-
-size_t
-vector_infos_manager::get_expr_id (const vector_insn_info &info) const
-{
-  for (size_t i = 0; i < vector_exprs.length (); i++)
-if (*vector_exprs[i] == info)
-  return i;
-  gcc_unreachable ();
-}
-
-auto_vec<size_t>
-vector_infos_manager::get_all_available_exprs (
-  const vector_insn_info &info) const
-{
-  auto_vec<size_t> available_list;
-  for (size_t i = 0; i < vector_exprs.length (); i++)
-    if (info.available_p (*vector_exprs[i]))
-      available_list.safe_push (i);
-  return available_list;
-}
-
-bool
-vector_infos_manager::all_same_ratio_p (sbitmap bitdata) const
-{
-  if (bitmap_empty_p (bitdata))
-return false;
-
-  int ratio = -1;
-  unsigned int bb_index;
-  sbitmap_iterator sbi;
-
-  EXECUTE_IF_SET_IN_BITMAP (bitdata, 0, bb_index, sbi)
-{
-  if (ratio == -1)
-   ratio = vector_exprs[bb_index]->get_ratio ();
-  else if (vector_exprs[bb_index]->get_ratio () != ratio)
-   return false;
-}
-  return true;
-}
-
-/* Return TRUE if the incoming vector configuration state
-   to CFG_BB is compatible with the vector configuration
-   state in CFG_BB, FALSE otherwise.  */
-bool
-vector_infos_manager::all_avail_in_compatible_p (const basic_block cfg_bb) const
-{
-  const auto &info = vector_block_infos[cfg_bb->index].local_dem;
-  sbitmap avin = vector_avin[cfg_bb->index];
-  unsigned int bb_index;
-  sbitmap_iterator sbi;
-  EXECUTE_IF_SET_IN_BITMAP (avin, 0, bb_index, sbi)
-{
-  const auto &avin_info
-   = static_cast (*vector_exprs[bb_index]);
-  if (!info.compatible_p (avin_info))
-   return false;
-}
-  return true;
-}
-
-bool
-vector_infos_manager::all_same_avl_p (const basic_block cfg_bb,
- sbitmap bitdata) const
-{
-  if (bitmap_empty_p (bitdata))
-return false;
-
-  const auto &block_info = vector_block_infos[cfg_bb->index];
-  if (!block_info.local_dem.demand_p (DEMAND_AVL))
-return true;
-
-  avl

[PATCH V2 05/14] RISC-V: P5: combine phase 1 and 2

2023-10-17 Thread Lehua Ding
This sub-patch combines phases 1 and 2 so that they use the new demand system,
and delays the insertion of vsetvl insns until phase 4.
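
To illustrate the idea of a single forward pass over a block, here is a
minimal sketch under simplifying assumptions. It is not
pre_vsetvl::fuse_local_vsetvl_info: toy_info, field_ok and toy_fuse_local are
hypothetical names, only SEW and ratio are modelled, and the handling of
unknown infos, the delete list and read-vl insns is left out. The sketch also
folds the "available" and "compatible" cases into one branch.

```cpp
#include <optional>
#include <vector>

// Hypothetical simplified info: an unset field means "not demanded".
struct toy_info
{
  std::optional<unsigned> sew;
  std::optional<unsigned> ratio;
};

// True if demands A and B on the same field can be satisfied together.
static bool
field_ok (const std::optional<unsigned> &a, const std::optional<unsigned> &b)
{
  return !a || !b || *a == *b;
}

// Walk the infos of one basic block in program order and fuse each one into
// the previous info whenever their demands do not conflict.  The result
// holds one entry per vsetvl the block would still need.
std::vector<toy_info>
toy_fuse_local (const std::vector<toy_info> &infos)
{
  std::vector<toy_info> result;
  for (const toy_info &curr : infos)
    {
      if (!result.empty ()
          && field_ok (result.back ().sew, curr.sew)
          && field_ok (result.back ().ratio, curr.ratio))
        {
          // No conflict: tighten the previous info instead of adding one.
          toy_info &prev = result.back ();
          if (curr.sew)
            prev.sew = curr.sew;
          if (curr.ratio)
            prev.ratio = curr.ratio;
        }
      else
        // Conflict (or first info): this one needs its own vsetvl.
        result.push_back (curr);
    }
  return result;
}
```

For example, the sequence "SEW=32", "ratio=16", "SEW=64" collapses to two
entries: the first two infos fuse, and the third conflicts on SEW and starts a
new one.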

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::fuse_local_vsetvl_info): 
New.
(pass_vsetvl::compute_local_backward_infos): Removed.
(pass_vsetvl::need_vsetvl): Removed.
(pass_vsetvl::transfer_before): Removed.
(pass_vsetvl::transfer_after): Removed.
(pass_vsetvl::emit_local_forward_vsetvls): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc | 269 ++-
 1 file changed, 123 insertions(+), 146 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 3f07fde782f..33bdcec04d8 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2669,6 +2669,129 @@ public:
   }
 };

+void
+pre_vsetvl::fuse_local_vsetvl_info ()
+{
+  reg_def_loc
+= sbitmap_vector_alloc (last_basic_block_for_fn (cfun), GP_REG_LAST + 1);
+  bitmap_vector_clear (reg_def_loc, last_basic_block_for_fn (cfun));
+  bitmap_ones (reg_def_loc[ENTRY_BLOCK_PTR_FOR_FN (cfun)->index]);
+
+  for (bb_info *bb : crtl->ssa->bbs ())
+{
+  auto &block_info = get_block_info (bb);
+  block_info.m_bb = bb;
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "  Try fuse basic block %d\n", bb->index ());
+   }
+  auto_vec<vsetvl_info> infos;
+  for (insn_info *insn : bb->real_nondebug_insns ())
+   {
+ vsetvl_info curr_info = vsetvl_info (insn);
+ if (curr_info.valid_p () || curr_info.unknown_p ())
+   infos.safe_push (curr_info);
+
+ /* Collecting GP registers modified by the current bb.  */
+ if (insn->is_real ())
+   for (def_info *def : insn->defs ())
+ if (def->is_reg () && GP_REG_P (def->regno ()))
+   bitmap_set_bit (reg_def_loc[bb->index ()], def->regno ());
+   }
+
+  vsetvl_info prev_info = vsetvl_info ();
+  prev_info.set_empty ();
+  for (auto &curr_info : infos)
+   {
+ if (prev_info.empty_p ())
+   prev_info = curr_info;
+ else if ((curr_info.unknown_p () && prev_info.valid_p ())
+  || (curr_info.valid_p () && prev_info.unknown_p ()))
+   {
+ block_info.infos.safe_push (prev_info);
+ prev_info = curr_info;
+   }
+ else if (curr_info.valid_p () && prev_info.valid_p ())
+   {
+ if (dem.available_with (prev_info, curr_info))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file,
+  "Ignore curr info since prev info "
+  "available with it:\n");
+ fprintf (dump_file, "  prev_info: ");
+ prev_info.dump (dump_file, "");
+ fprintf (dump_file, "  curr_info: ");
+ curr_info.dump (dump_file, "");
+ fprintf (dump_file, "\n");
+   }
+ if (!curr_info.use_by_non_rvv_insn_p ()
+ && vsetvl_insn_p (curr_info.get_insn ()->rtl ()))
+   delete_list.safe_push (curr_info);
+
+ if (curr_info.get_read_vl_insn ())
+   prev_info.set_read_vl_insn (curr_info.get_read_vl_insn ());
+   }
+ else if (dem.compatible_with (prev_info, curr_info))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "Fuse curr info since prev info "
+ "compatible with it:\n");
+ fprintf (dump_file, "  prev_info: ");
+ prev_info.dump (dump_file, "");
+ fprintf (dump_file, "  curr_info: ");
+ curr_info.dump (dump_file, "");
+   }
+ dem.merge_with (prev_info, curr_info);
+ if (curr_info.get_read_vl_insn ())
+   prev_info.set_read_vl_insn (curr_info.get_read_vl_insn ());
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "  prev_info after fused: ");
+ prev_info.dump (dump_file, "");
+ fprintf (dump_file, "\n");
+   }
+   }
+ else
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file,
+  "Cannot fuse uncompatible infos:\n");
+ fprintf (dump_file, "  prev_info: ");
+ prev_info.dump (dump_file, "   ");
+ fprintf (dump_file, "  curr_info: ");

[PATCH V2 02/14] RISC-V: P2: Refactor and cleanup demand system

2023-10-17 Thread Lehua Ding
This sub-patch refactors the demand system. The demand information is split
into three parts: sew and lmul related (sew_lmul_demand_type), tail and mask
policy related (policy_demand_type), and avl related (avl_demand_type). Three
interfaces are then defined: available_with, compatible_with and merge_with.
available_with determines whether prev_info is available for next_info. If it
is, the RVV insn corresponding to next_info can be used on the path from
prev_info to next_info without inserting a separate vsetvl instruction.
compatible_with determines whether prev_info is compatible with next_info; if
so, merge_with can be used to merge the stricter demand information from
next_info into prev_info so that prev_info becomes available to next_info.
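
To make the relationship between the three interfaces concrete, here is a
minimal sketch, not the demand_system class from this patch. The names
toy_info, toy_available_with, toy_compatible_with and toy_merge_with are
hypothetical, and only two fields (SEW and ratio) are modelled; the real rules
for sew/lmul, policy and avl live in riscv-vsetvl.def.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for a vsetvl info: a value for
// each field plus a flag saying whether the RVV insn demands that field.
struct toy_info
{
  unsigned sew;
  unsigned ratio;
  bool demand_sew;
  bool demand_ratio;
};

// PREV is available for NEXT if every field NEXT demands already has the
// value NEXT wants, so no vsetvl is needed between them.
bool
toy_available_with (const toy_info &prev, const toy_info &next)
{
  if (next.demand_sew && prev.sew != next.sew)
    return false;
  if (next.demand_ratio && prev.ratio != next.ratio)
    return false;
  return true;
}

// PREV is compatible with NEXT if no field is demanded by both with
// different values, i.e. a single setting can satisfy both.
bool
toy_compatible_with (const toy_info &prev, const toy_info &next)
{
  if (prev.demand_sew && next.demand_sew && prev.sew != next.sew)
    return false;
  if (prev.demand_ratio && next.demand_ratio && prev.ratio != next.ratio)
    return false;
  return true;
}

// Merge NEXT's demands into PREV.  Only valid when the two are compatible;
// afterwards PREV is available for NEXT.
void
toy_merge_with (toy_info &prev, const toy_info &next)
{
  assert (toy_compatible_with (prev, next));
  if (next.demand_sew)
    {
      prev.sew = next.sew;
      prev.demand_sew = true;
    }
  if (next.demand_ratio)
    {
      prev.ratio = next.ratio;
      prev.demand_ratio = true;
    }
}
```

In this toy model, a prev_info that demands only SEW=32 and a next_info that
demands only ratio=16 are compatible but not yet available; after the merge,
prev_info demands both and is available for next_info.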

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (incompatible_avl_p): Removed.
(different_sew_p): Removed.
(different_lmul_p): Removed.
(different_ratio_p): Removed.
(different_tail_policy_p): Removed.
(different_mask_policy_p): Removed.
(possible_zero_avl_p): Removed.
(second_ratio_invalid_for_first_sew_p): Removed.
(second_ratio_invalid_for_first_lmul_p): Removed.
(float_insn_valid_sew_p): Removed.
(second_sew_less_than_first_sew_p): Removed.
(first_sew_less_than_second_sew_p): Removed.
(compare_lmul): Removed.
(second_lmul_less_than_first_lmul_p): Removed.
(second_ratio_less_than_first_ratio_p): Removed.
(DEF_INCOMPATIBLE_COND): Removed.
(greatest_sew): Removed.
(first_sew): Removed.
(second_sew): Removed.
(first_vlmul): Removed.
(second_vlmul): Removed.
(first_ratio): Removed.
(second_ratio): Removed.
(vlmul_for_first_sew_second_ratio): Removed.
(vlmul_for_greatest_sew_second_ratio): Removed.
(ratio_for_second_sew_first_vlmul): Removed.
(DEF_SEW_LMUL_FUSE_RULE): Removed.
(always_unavailable): Removed.
(avl_unavailable_p): Removed.
(sew_unavailable_p): Removed.
(lmul_unavailable_p): Removed.
(ge_sew_unavailable_p): Removed.
(ge_sew_lmul_unavailable_p): Removed.
(ge_sew_ratio_unavailable_p): Removed.
(DEF_UNAVAILABLE_COND): Removed.
(same_sew_lmul_demand_p): Removed.
(propagate_avl_across_demands_p): Removed.
(reg_available_p): Removed.
(support_relaxed_compatible_p): Removed.
(class demand_system): New class.
(DEF_SEW_LMUL_RULE): New macro.
(DEF_POLICY_RULE): New macro.
(DEF_AVL_RULE): New macro.
* config/riscv/riscv-vsetvl.def (DEF_INCOMPATIBLE_COND): Removed.
(DEF_SEW_LMUL_RULE): New macro.
(DEF_SEW_LMUL_FUSE_RULE): Removed.
(DEF_POLICY_RULE): New macro.
(DEF_UNAVAILABLE_COND): Removed.
(DEF_AVL_RULE): New macro.
(sew_lmul): New demand type.
(ratio_only): New demand type.
(sew_only): New demand type.
(ge_sew): New demand type.
(ratio_and_ge_sew): New demand type.
(tail_mask_policy): New demand type.
(tail_policy_only): New demand type.
(mask_policy_only): New demand type.
(ignore_policy): New demand type.
(avl): New demand type.
(non_zero_avl): New demand type.
(ignore_avl): New demand type.
* config/riscv/riscv-vsetvl.h (enum demand_type): Removed.
(enum demand_status): Removed.
(enum fusion_type): Removed.
(struct demands_pair): Removed.
(struct demands_cond): Removed.
(struct demands_fuse_rule): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc  | 1062 ++---
 gcc/config/riscv/riscv-vsetvl.def |  634 -
 gcc/config/riscv/riscv-vsetvl.h   |   79 ---
 3 files changed, 814 insertions(+), 961 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 79ba8466556..be40b6fdf4c 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1091,401 +1091,6 @@ calculate_vlmul (unsigned int sew, unsigned int ratio)
   return LMUL_RESERVED;
 }

-static bool
-incompatible_avl_p (const vector_insn_info &info1,
-   const vector_insn_info &info2)
-{
-  return !info1.compatible_avl_p (info2) && !info2.compatible_avl_p (info1);
-}
-
-static bool
-different_sew_p (const vector_insn_info &info1, const vector_insn_info &info2)
-{
-  return info1.get_sew () != info2.get_sew ();
-}
-
-static bool
-different_lmul_p (const vector_insn_info &info1, const vector_insn_info &info2)
-{
-  return info1.get_vlmul () != info2.get_vlmul ();
-}
-
-static bool
-different_ratio_p (const vector_insn_info &info1, const vector_insn_info &info2)
-{
-  return info1.get_ratio () != info2.get_ratio ();
-}
-
-static bool
-different_tail_policy_p (const vect

[PATCH V2 04/14] RISC-V: P4: move method from pass_vsetvl to pre_vsetvl

2023-10-17 Thread Lehua Ding
This sub-patch moves the methods that optimize vsetvl infos out of class
pass_vsetvl and into class pre_vsetvl.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::get_vector_info): Removed.
(pass_vsetvl::get_block_info): Removed.
(pass_vsetvl::update_vector_info): Removed.
(pass_vsetvl::update_block_info): Removed.
(pass_vsetvl::simple_vsetvl): Removed.
(pass_vsetvl::lazy_vsetvl): Removed.
(pass_vsetvl::execute): Removed.
(make_pass_vsetvl): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc | 228 ---
 1 file changed, 87 insertions(+), 141 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index c219ad178bb..3f07fde782f 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2684,54 +2684,8 @@ const pass_data pass_data_vsetvl = {
 class pass_vsetvl : public rtl_opt_pass
 {
 private:
-  vector_infos_manager *m_vector_manager;
-
-  const vector_insn_info &get_vector_info (const rtx_insn *) const;
-  const vector_insn_info &get_vector_info (const insn_info *) const;
-  const vector_block_info &get_block_info (const basic_block) const;
-  const vector_block_info &get_block_info (const bb_info *) const;
-  vector_block_info &get_block_info (const basic_block);
-  vector_block_info &get_block_info (const bb_info *);
-  void update_vector_info (const insn_info *, const vector_insn_info &);
-  void update_block_info (int, profile_probability, const vector_insn_info &);
-
-  void simple_vsetvl (void) const;
-  void lazy_vsetvl (void);
-
-  /* Phase 1.  */
-  void compute_local_backward_infos (const bb_info *);
-
-  /* Phase 2.  */
-  bool need_vsetvl (const vector_insn_info &, const vector_insn_info &) const;
-  void transfer_before (vector_insn_info &, insn_info *) const;
-  void transfer_after (vector_insn_info &, insn_info *) const;
-  void emit_local_forward_vsetvls (const bb_info *);
-
-  /* Phase 3.  */
-  bool earliest_fusion (void);
-  void vsetvl_fusion (void);
-
-  /* Phase 4.  */
-  void prune_expressions (void);
-  void compute_local_properties (void);
-  bool can_refine_vsetvl_p (const basic_block, const vector_insn_info &) const;
-  void refine_vsetvls (void) const;
-  void cleanup_vsetvls (void);
-  bool commit_vsetvls (void);
-  void pre_vsetvl (void);
-
-  /* Phase 5.  */
-  rtx_insn *get_vsetvl_at_end (const bb_info *, vector_insn_info *) const;
-  void local_eliminate_vsetvl_insn (const bb_info *) const;
-  bool global_eliminate_vsetvl_insn (const bb_info *) const;
-  void ssa_post_optimization (void) const;
-
-  /* Phase 6.  */
-  void df_post_optimization (void) const;
-
-  void init (void);
-  void done (void);
-  void compute_probabilities (void);
+  void simple_vsetvl ();
+  void lazy_vsetvl ();

 public:
   pass_vsetvl (gcc::context *ctxt) : rtl_opt_pass (pass_data_vsetvl, ctxt) {}
@@ -2741,69 +2695,11 @@ public:
   virtual unsigned int execute (function *) final override;
 }; // class pass_vsetvl

-const vector_insn_info &
-pass_vsetvl::get_vector_info (const rtx_insn *i) const
-{
-  return m_vector_manager->vector_insn_infos[INSN_UID (i)];
-}
-
-const vector_insn_info &
-pass_vsetvl::get_vector_info (const insn_info *i) const
-{
-  return m_vector_manager->vector_insn_infos[i->uid ()];
-}
-
-const vector_block_info &
-pass_vsetvl::get_block_info (const basic_block bb) const
-{
-  return m_vector_manager->vector_block_infos[bb->index];
-}
-
-const vector_block_info &
-pass_vsetvl::get_block_info (const bb_info *bb) const
-{
-  return m_vector_manager->vector_block_infos[bb->index ()];
-}
-
-vector_block_info &
-pass_vsetvl::get_block_info (const basic_block bb)
-{
-  return m_vector_manager->vector_block_infos[bb->index];
-}
-
-vector_block_info &
-pass_vsetvl::get_block_info (const bb_info *bb)
-{
-  return m_vector_manager->vector_block_infos[bb->index ()];
-}
-
-void
-pass_vsetvl::update_vector_info (const insn_info *i,
-const vector_insn_info &new_info)
-{
-  m_vector_manager->vector_insn_infos[i->uid ()] = new_info;
-}
-
 void
-pass_vsetvl::update_block_info (int index, profile_probability prob,
-   const vector_insn_info &new_info)
-{
-  m_vector_manager->vector_block_infos[index].probability = prob;
-  if (m_vector_manager->vector_block_infos[index].local_dem
-  == m_vector_manager->vector_block_infos[index].reaching_out)
-m_vector_manager->vector_block_infos[index].local_dem = new_info;
-  m_vector_manager->vector_block_infos[index].reaching_out = new_info;
-}
-
-/* Simple m_vsetvl_insert vsetvl for optimize == 0.  */
-void
-pass_vsetvl::simple_vsetvl (void) const
+pass_vsetvl::simple_vsetvl ()
 {
   if (dump_file)
-fprintf (dump_file,
-"\nEntering Simple VSETVL PASS and Handling %d basic blocks for "
-"function:%s\n",
-n_basic_blocks_for_fn (cfun), function_name (cfun));
+fprintf (dump_file, "\nEntering Simple VSETVL PASS\n"

[PATCH V2 07/14] RISC-V: P7: Move earliest fuse and lcm code to pre_vsetvl class

2023-10-17 Thread Lehua Ding
This sub-patch moves the code for phases 2 and 3 from pass_vsetvl to the
pre_vsetvl class.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::earliest_fuse_vsetvl_info): New.
(pre_vsetvl::pre_global_vsetvl_info): New.
(pass_vsetvl::prune_expressions): Removed.
(pass_vsetvl::compute_local_properties): Removed.
(pass_vsetvl::earliest_fusion): Removed.
(pass_vsetvl::vsetvl_fusion): Removed.
(pass_vsetvl::pre_vsetvl): Removed.
(pass_vsetvl::compute_probabilities): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc | 829 +++
 1 file changed, 398 insertions(+), 431 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index b1269e8cf4f..a112895a283 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3260,6 +3260,404 @@ pre_vsetvl::fuse_local_vsetvl_info ()
 }
 }

+bool
+pre_vsetvl::earliest_fuse_vsetvl_info ()
+{
+  compute_avl_def_data ();
+  compute_vsetvl_def_data ();
+  compute_vsetvl_lcm_data ();
+
+  unsigned num_exprs = exprs.length ();
+  struct edge_list *edges = create_edge_list ();
+  unsigned num_edges = NUM_EDGES (edges);
+  sbitmap *antin
+= sbitmap_vector_alloc (last_basic_block_for_fn (cfun), num_exprs);
+  sbitmap *antout
+= sbitmap_vector_alloc (last_basic_block_for_fn (cfun), num_exprs);
+
+  sbitmap *earliest = sbitmap_vector_alloc (num_edges, num_exprs);
+
+  compute_available (avloc, kill, avout, avin);
+  compute_antinout_edge (antloc, transp, antin, antout);
+  compute_earliest (edges, num_exprs, antin, antout, avout, kill, earliest);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+{
+  fprintf (dump_file, "\n  Compute LCM earliest insert data:\n\n");
+  fprintf (dump_file, "Expression List (%u):\n", num_exprs);
+  for (unsigned i = 0; i < num_exprs; i++)
+   {
+ const auto &info = *exprs[i];
+ fprintf (dump_file, "  Expr[%u]: ", i);
+ info.dump (dump_file, "");
+   }
+  fprintf (dump_file, "\nbitmap data:\n");
+  for (const bb_info *bb : crtl->ssa->bbs ())
+   {
+ unsigned int i = bb->index ();
+ fprintf (dump_file, "  BB %u:\n", i);
+ fprintf (dump_file, "avloc: ");
+ dump_bitmap_file (dump_file, avloc[i]);
+ fprintf (dump_file, "kill: ");
+ dump_bitmap_file (dump_file, kill[i]);
+ fprintf (dump_file, "antloc: ");
+ dump_bitmap_file (dump_file, antloc[i]);
+ fprintf (dump_file, "transp: ");
+ dump_bitmap_file (dump_file, transp[i]);
+
+ fprintf (dump_file, "avin: ");
+ dump_bitmap_file (dump_file, avin[i]);
+ fprintf (dump_file, "avout: ");
+ dump_bitmap_file (dump_file, avout[i]);
+ fprintf (dump_file, "antin: ");
+ dump_bitmap_file (dump_file, antin[i]);
+ fprintf (dump_file, "antout: ");
+ dump_bitmap_file (dump_file, antout[i]);
+   }
+  fprintf (dump_file, "\n");
+  fprintf (dump_file, "  earliest:\n");
+  for (unsigned ed = 0; ed < num_edges; ed++)
+   {
+ edge eg = INDEX_EDGE (edges, ed);
+
+ if (bitmap_empty_p (earliest[ed]))
+   continue;
+ fprintf (dump_file, "Edge(bb %u -> bb %u): ", eg->src->index,
+  eg->dest->index);
+ dump_bitmap_file (dump_file, earliest[ed]);
+   }
+  fprintf (dump_file, "\n");
+}
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+{
+  fprintf (dump_file, "Fused global info result:\n");
+}
+
+  bool changed = false;
+  for (unsigned ed = 0; ed < num_edges; ed++)
+{
+  sbitmap e = earliest[ed];
+  if (bitmap_empty_p (e))
+   continue;
+
+  unsigned int expr_index;
+  sbitmap_iterator sbi;
+  EXECUTE_IF_SET_IN_BITMAP (e, 0, expr_index, sbi)
+   {
+ vsetvl_info &curr_info = *exprs[expr_index];
+ if (!curr_info.valid_p ())
+   continue;
+
+ edge eg = INDEX_EDGE (edges, ed);
+ if (eg->probability == profile_probability::never ())
+   continue;
+ if (eg->src == ENTRY_BLOCK_PTR_FOR_FN (cfun)
+ || eg->dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
+   continue;
+
+ vsetvl_block_info &src_block_info = get_block_info (eg->src);
+ vsetvl_block_info &dest_block_info = get_block_info (eg->dest);
+
+ if (src_block_info.probability
+ == profile_probability::uninitialized ())
+   continue;
+
+ if (src_block_info.empty_p ())
+   {
+ vsetvl_info new_curr_info = curr_info;
+ new_curr_info.set_bb (crtl->ssa->bb (eg->dest));
+ bool has_compatible_p = false;
+ unsigned int def_expr_index;
+ sbitmap_iterator sbi2;
+ EXECUTE_IF_SET_IN_BITMAP (
+   vsetvl_d

[PATCH V2 06/14] RISC-V: P6: Add computing reaching definition data flow

2023-10-17 Thread Lehua Ding
This sub-patch adds helper functions for computing reaching definition data
and three functions that compute it for different objects (AVL definitions,
vsetvl definitions and the vsetvl LCM data). These three functions are used by
phases 2 and 3.
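
For readers unfamiliar with the data flow problem, here is a minimal
standalone sketch of a forward worklist solver for reaching definitions. It is
not the code from this patch: the Block struct, solve_reaching_defs and
demo_loop_cfg are hypothetical names, and a 32-bit mask and std::deque stand in
for GCC's sbitmap vectors and the circular queue used by the real helper. It
only shows the equations in[b] = union of out[pred] and
out[b] = gen[b] | (in[b] & ~kill[b]).

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical toy CFG node (not a GCC type): each block lists its
// predecessors and successors by index and carries GEN/KILL bit masks.
struct Block
{
  std::vector<int> preds;
  std::vector<int> succs;
  uint32_t gen = 0;  // definitions created in this block
  uint32_t kill = 0; // definitions overwritten by this block
};

// Forward worklist solver for reaching definitions:
//   in[b]  = union of out[p] over all predecessors p of b
//   out[b] = gen[b] | (in[b] & ~kill[b])
static void
solve_reaching_defs (const std::vector<Block> &cfg, std::vector<uint32_t> &in,
		     std::vector<uint32_t> &out)
{
  in.assign (cfg.size (), 0);
  out.assign (cfg.size (), 0);

  // Start with every block queued; a block is re-queued only when the OUT
  // set of one of its predecessors changes and it is not already queued.
  std::deque<int> worklist;
  std::vector<bool> queued (cfg.size (), true);
  for (std::size_t i = 0; i < cfg.size (); ++i)
    worklist.push_back (static_cast<int> (i));

  while (!worklist.empty ())
    {
      int b = worklist.front ();
      worklist.pop_front ();
      queued[b] = false;

      uint32_t new_in = 0;
      for (int p : cfg[b].preds)
	new_in |= out[p];
      in[b] = new_in;

      uint32_t new_out = cfg[b].gen | (new_in & ~cfg[b].kill);
      if (new_out == out[b])
	continue;

      out[b] = new_out;
      for (int s : cfg[b].succs)
	if (!queued[s])
	  {
	    queued[s] = true;
	    worklist.push_back (s);
	  }
    }
}

// Example CFG: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3 plus a back edge 3 -> 1.
// Block 0 creates definition d0 (bit 0); block 1 creates d1 (bit 1) and
// kills d0.
static std::vector<Block>
demo_loop_cfg ()
{
  std::vector<Block> cfg (4);
  cfg[0].succs = {1, 2};
  cfg[1].preds = {0, 3};
  cfg[1].succs = {3};
  cfg[2].preds = {0};
  cfg[2].succs = {3};
  cfg[3].preds = {1, 2};
  cfg[3].succs = {1};
  cfg[0].gen = 1u << 0;
  cfg[1].gen = 1u << 1;
  cfg[1].kill = 1u << 0;
  return cfg;
}
```

In this example both d0 and d1 reach the entry of block 3, and d1 also reaches
the entry of block 1 through the back edge, while only d1 leaves block 1.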

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (bitmap_union_of_preds_with_entry): New.
(compute_reaching_defintion): New.
(pre_vsetvl::compute_avl_def_data): New.
(pre_vsetvl::compute_vsetvl_def_data): New.
(pre_vsetvl::compute_vsetvl_lcm_data): New.

---
 gcc/config/riscv/riscv-vsetvl.cc | 468 +++
 1 file changed, 468 insertions(+)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 33bdcec04d8..b1269e8cf4f 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -103,6 +103,121 @@ along with GCC; see the file COPYING3.  If not see
 using namespace rtl_ssa;
 using namespace riscv_vector;

+/* Set the bitmap DST to the union of SRC of predecessors of
+   basic block B.
+   It's a bit different from bitmap_union_of_preds in cfganal.cc. This function
+   takes into account the case where pred is the ENTRY basic block. The main
+   reason for this difference is to make it easier to insert some special value
+   into the ENTRY basic block. For example, a vsetvl_info with a status of
+   UNKNOWN.  */
+static void
+bitmap_union_of_preds_with_entry (sbitmap dst, sbitmap *src, basic_block b)
+{
+  unsigned int set_size = dst->size;
+  edge e;
+  unsigned ix;
+
+  for (ix = 0; ix < EDGE_COUNT (b->preds); ix++)
+{
+  e = EDGE_PRED (b, ix);
+  bitmap_copy (dst, src[e->src->index]);
+  break;
+}
+
+  if (ix == EDGE_COUNT (b->preds))
+bitmap_clear (dst);
+  else
+for (ix++; ix < EDGE_COUNT (b->preds); ix++)
+  {
+   unsigned int i;
+   SBITMAP_ELT_TYPE *p, *r;
+
+   e = EDGE_PRED (b, ix);
+   p = src[e->src->index]->elms;
+   r = dst->elms;
+   for (i = 0; i < set_size; i++)
+ *r++ |= *p++;
+  }
+}
+
+/* Compute the reaching definition IN and OUT sets based on the GEN and KILL
+   information of each basic block.
+   This function is modeled on the compute_available implementation in lcm.cc.  */
+static void
+compute_reaching_defintion (sbitmap *gen, sbitmap *kill, sbitmap *in,
+   sbitmap *out)
+{
+  edge e;
+  basic_block *worklist, *qin, *qout, *qend, bb;
+  unsigned int qlen;
+  edge_iterator ei;
+
+  /* Allocate a worklist array/queue.  Entries are only added to the
+ list if they were not already on the list.  So the size is
+ bounded by the number of basic blocks.  */
+  qin = qout = worklist
+= XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS);
+
+  /* Put every block on the worklist; this is necessary because of the
+ optimistic initialization of AVOUT above.  Use reverse postorder
+ to make the forward dataflow problem require less iterations.  */
+  int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS);
+  int n = pre_and_rev_post_order_compute_fn (cfun, NULL, rpo, false);
+  for (int i = 0; i < n; ++i)
+{
+  bb = BASIC_BLOCK_FOR_FN (cfun, rpo[i]);
+  *qin++ = bb;
+  bb->aux = bb;
+}
+  free (rpo);
+
+  qin = worklist;
+  qend = &worklist[n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS];
+  qlen = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
+
+  /* Mark blocks which are successors of the entry block so that we
+ can easily identify them below.  */
+  FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs)
+e->dest->aux = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+
+  /* Iterate until the worklist is empty.  */
+  while (qlen)
+{
+  /* Take the first entry off the worklist.  */
+  bb = *qout++;
+  qlen--;
+
+  if (qout >= qend)
+   qout = worklist;
+
+  /* Do not clear the aux field for blocks which are successors of the
+ENTRY block.  That way we never add them to the worklist again.  */
+  if (bb->aux != ENTRY_BLOCK_PTR_FOR_FN (cfun))
+   bb->aux = NULL;
+
+  bitmap_union_of_preds_with_entry (in[bb->index], out, bb);
+
+  if (bitmap_ior_and_compl (out[bb->index], gen[bb->index], in[bb->index],
+   kill[bb->index]))
+   /* If the out state of this block changed, then we need
+  to add the successors of this block to the worklist
+  if they are not already on the worklist.  */
+   FOR_EACH_EDGE (e, ei, bb->succs)
+ if (!e->dest->aux && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
+   {
+ *qin++ = e->dest;
+ e->dest->aux = e;
+ qlen++;
+
+ if (qin >= qend)
+   qin = worklist;
+   }
+}
+
+  clear_aux_for_edges ();
+  clear_aux_for_blocks ();
+  free (worklist);
+}
+
 static CONSTEXPR const unsigned ALL_SEW[] = {8, 16, 32, 64};
 static CONSTEXPR const vlmul_type ALL_LMUL[]
   = {LMUL_1, LMUL_2, LMUL_4, LMUL_8, LMUL_F8, LMUL_F4, LMUL_F2};
@@ -2669,6 +2784,359 

[PATCH V2 09/14] RISC-V: P9: Cleanup post optimize phase

2023-10-17 Thread Lehua Ding
This sub-patch deletes part of the post-optimization code (which is now
implemented in the main phases) and moves the remaining cleanup code to the
pre_vsetvl class.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::cleaup): New.
(pre_vsetvl::remove_avl_operand): New.
(pre_vsetvl::remove_unused_dest_operand): New.
(pass_vsetvl::get_vsetvl_at_end): Removed.
(local_avl_compatible_p): Removed.
(pass_vsetvl::local_eliminate_vsetvl_insn): Removed.
(get_first_vsetvl_before_rvv_insns): Removed.
(pass_vsetvl::global_eliminate_vsetvl_insn): Removed.
(pass_vsetvl::ssa_post_optimization): Removed.
(has_no_uses): Removed.
(pass_vsetvl::df_post_optimization): Removed.
(pass_vsetvl::init): Removed.
(pass_vsetvl::done): Removed.
(pass_vsetvl::lazy_vsetvl): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc | 675 ---
 1 file changed, 76 insertions(+), 599 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 5d84d290e9e..ac636623b3f 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3791,6 +3791,82 @@ pre_vsetvl::emit_vsetvl ()
 commit_edge_insertions ();
 }

+void
+pre_vsetvl::cleaup ()
+{
+  remove_avl_operand ();
+  remove_unused_dest_operand ();
+}
+
+void
+pre_vsetvl::remove_avl_operand ()
+{
+  for (const bb_info *bb : crtl->ssa->bbs ())
+for (insn_info *insn : bb->real_nondebug_insns ())
+  {
+   rtx_insn *rinsn = insn->rtl ();
+   /* Erase the AVL operand from the instruction.  */
+   if (!has_vl_op (rinsn) || !REG_P (get_vl (rinsn)))
+ continue;
+   rtx avl = get_vl (rinsn);
+   if (count_regno_occurrences (rinsn, REGNO (avl)) == 1)
+ {
+   /* Get the list of uses for the new instruction.  */
+   auto attempt = crtl->ssa->new_change_attempt ();
+   insn_change change (insn);
+   /* Remove the use of the substituted value.  */
+   access_array_builder uses_builder (attempt);
+   uses_builder.reserve (insn->num_uses () - 1);
+   for (use_info *use : insn->uses ())
+ if (use != find_access (insn->uses (), REGNO (avl)))
+   uses_builder.quick_push (use);
+   use_array new_uses = use_array (uses_builder.finish ());
+   change.new_uses = new_uses;
+   change.move_range = insn->ebb ()->insn_range ();
+   rtx pat;
+   if (fault_first_load_p (rinsn))
+ pat = simplify_replace_rtx (PATTERN (rinsn), avl, const0_rtx);
+   else
+ {
+   rtx set = single_set (rinsn);
+   rtx src = simplify_replace_rtx (SET_SRC (set), avl, const0_rtx);
+   pat = gen_rtx_SET (SET_DEST (set), src);
+ }
+   bool ok = change_insn (crtl->ssa, change, insn, pat);
+   gcc_assert (ok);
+ }
+  }
+}
+
+void
+pre_vsetvl::remove_unused_dest_operand ()
+{
+  df_analyze ();
+  hash_set<rtx_insn *> to_delete;
+  basic_block cfg_bb;
+  rtx_insn *rinsn;
+  FOR_ALL_BB_FN (cfg_bb, cfun)
+{
+  FOR_BB_INSNS (cfg_bb, rinsn)
+   {
+ if (NONDEBUG_INSN_P (rinsn) && vsetvl_insn_p (rinsn))
+   {
+ rtx vl = get_vl (rinsn);
+ vsetvl_info info = vsetvl_info (rinsn);
+ if (has_no_uses (cfg_bb, rinsn, REGNO (vl)))
+   {
+ if (!info.has_vlmax_avl ())
+   {
+ rtx new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, info,
+   NULL_RTX);
+ validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat,
+  false);
+   }
+   }
+   }
+   }
+}
+}

 const pass_data pass_data_vsetvl = {
   RTL_PASS, /* type */
@@ -3923,602 +3999,3 @@ make_pass_vsetvl (gcc::context *ctxt)
 {
   return new pass_vsetvl (ctxt);
 }
-
-/* Some instruction can not be accessed in RTL_SSA when we don't re-init
-   the new RTL_SSA framework but it is definetely at the END of the block.
-
-  Here we optimize the VSETVL is hoisted by LCM:
-
-   Before LCM:
- bb 1:
-   vsetvli a5,a2,e32,m1,ta,mu
- bb 2:
-   vsetvli zero,a5,e32,m1,ta,mu
-   ...
-
-   After LCM:
- bb 1:
-   vsetvli a5,a2,e32,m1,ta,mu
-   LCM INSERTED: vsetvli zero,a5,e32,m1,ta,mu --> eliminate
- bb 2:
-   ...
-   */
-rtx_insn *
-pass_vsetvl::get_vsetvl_at_end (const bb_info *bb, vector_insn_info *dem) const
-{
-  rtx_insn *end_vsetvl = BB_END (bb->cfg_bb ());
-  if (end_vsetvl && NONDEBUG_INSN_P (end_vsetvl))
-{
-  if (JUMP_P (end_vsetvl))
-   end_vsetvl = PREV_INSN (end_vsetvl);
-
-  if (NONDEBUG_INSN_P (end_vsetvl)
- && vsetvl_discard_result_insn_p (end_vsetvl))
-   {
- /* Only handle single succ. here, multiple succ. is much
-mo

[PATCH V2 12/14] RISC-V: P12: Delete riscv-vsetvl.h

2023-10-17 Thread Lehua Ding
This sub-patch deletes the unused header file riscv-vsetvl.h,
since we no longer need to export any functions.

gcc/ChangeLog:

* config/riscv/t-riscv: Removed riscv-vsetvl.h
* config/riscv/riscv-vsetvl.h: Removed.

---
 gcc/config/riscv/riscv-vsetvl.h | 59 -
 gcc/config/riscv/t-riscv|  2 +-
 2 files changed, 1 insertion(+), 60 deletions(-)
 delete mode 100644 gcc/config/riscv/riscv-vsetvl.h

diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h
deleted file mode 100644
index 16c84e0684b..000
--- a/gcc/config/riscv/riscv-vsetvl.h
+++ /dev/null
@@ -1,59 +0,0 @@
-/* VSETVL pass header for RISC-V 'V' Extension for GNU compiler.
-   Copyright (C) 2022-2023 Free Software Foundation, Inc.
-   Contributed by Juzhe Zhong (juzhe.zh...@rivai.ai), RiVAI Technologies Ltd.
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or(at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-#ifndef GCC_RISCV_VSETVL_H
-#define GCC_RISCV_VSETVL_H
-
-namespace riscv_vector {
-
-/* Classification of vsetvl instruction.  */
-enum vsetvl_type
-{
-  VSETVL_NORMAL,
-  VSETVL_VTYPE_CHANGE_ONLY,
-  VSETVL_DISCARD_RESULT,
-  NUM_VSETVL_TYPE
-};
-
-enum emit_type
-{
-  /* emit_insn directly.  */
-  EMIT_DIRECT,
-  EMIT_BEFORE,
-  EMIT_AFTER,
-};
-
-enum def_type
-{
-  REAL_SET = 1 << 0,
-  PHI_SET = 1 << 1,
-  BB_HEAD_SET = 1 << 2,
-  BB_END_SET = 1 << 3,
-  /* ??? TODO: In RTL_SSA framework, we have REAL_SET,
- PHI_SET, BB_HEAD_SET, BB_END_SET and
- CLOBBER_DEF def_info types. Currently,
- we conservatively do not optimize clobber
- def since we don't see the case that we
- need to optimize it.  */
-  CLOBBER_DEF = 1 << 4
-};
-
-} // namespace riscv_vector
-#endif
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
index f137e1f17ef..dd17056fe82 100644
--- a/gcc/config/riscv/t-riscv
+++ b/gcc/config/riscv/t-riscv
@@ -64,7 +64,7 @@ riscv-vsetvl.o: $(srcdir)/config/riscv/riscv-vsetvl.cc \
   $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(REGS_H) \
   $(TARGET_H) tree-pass.h df.h rtl-ssa.h cfgcleanup.h insn-config.h \
   insn-attr.h insn-opinit.h tm-constrs.h cfgrtl.h cfganal.h lcm.h \
-  predict.h profile-count.h $(srcdir)/config/riscv/riscv-vsetvl.h \
+  predict.h profile-count.h \
   $(srcdir)/config/riscv/riscv-vsetvl.def
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
$(srcdir)/config/riscv/riscv-vsetvl.cc
--
2.36.3



[PATCH V2 08/14] RISC-V: P8: Unified insert and delete of vsetvl insn into Phase 4

2023-10-17 Thread Lehua Ding
This sub-patch moves the code that modifies RTL from pass_vsetvl
into the pre_vsetvl class.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): New.
(pass_vsetvl::can_refine_vsetvl_p): Removed.
(pass_vsetvl::refine_vsetvls): Removed.
(pass_vsetvl::cleanup_vsetvls): Removed.
(pass_vsetvl::commit_vsetvls): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc | 389 +++
 1 file changed, 134 insertions(+), 255 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index a112895a283..5d84d290e9e 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3658,6 +3658,140 @@ pre_vsetvl::pre_global_vsetvl_info ()
 }
 }

+void
+pre_vsetvl::emit_vsetvl ()
+{
+  bool need_commit = false;
+
+  for (const bb_info *bb : crtl->ssa->bbs ())
+{
+  for (const auto &curr_info : get_block_info (bb).infos)
+   {
+ insn_info *insn = curr_info.get_insn ();
+ if (curr_info.ignore_p ())
+   {
+ if (vsetvl_insn_p (insn->rtl ()))
+   eliminate_insn (insn->rtl ());
+ continue;
+   }
+ else if (curr_info.valid_p ())
+   {
+ if (vsetvl_insn_p (insn->rtl ()))
+   {
+ const vsetvl_info temp = vsetvl_info (insn);
+ if (!(curr_info == temp))
+   {
+ if (dump_file)
+   {
+ fprintf (dump_file, "\n  Change vsetvl info from: ");
+ temp.dump (dump_file, "");
+ fprintf (dump_file, "  to: ");
+ curr_info.dump (dump_file, "");
+   }
+ change_vsetvl_insn (insn, curr_info);
+   }
+   }
+ else
+   {
+ if (dump_file)
+   {
+ fprintf (dump_file,
+  "\n  Insert vsetvl info before insn %d: ",
+  insn->uid ());
+ curr_info.dump (dump_file, "");
+   }
+ insert_vsetvl (EMIT_BEFORE, insn->rtl (), curr_info);
+   }
+   }
+   }
+}
+
+  for (const vsetvl_info &item : delete_list)
+{
+  gcc_assert (vsetvl_insn_p (item.get_insn ()->rtl ()));
+  eliminate_insn (item.get_insn ()->rtl ());
+}
+
+  /* Insert vsetvl as LCM suggest. */
+  for (int ed = 0; ed < NUM_EDGES (edges); ed++)
+{
+  edge eg = INDEX_EDGE (edges, ed);
+  sbitmap i = insert[ed];
+  if (bitmap_count_bits (i) < 1)
+   continue;
+
+  if (bitmap_count_bits (i) > 1)
+   /* For code with infinite loop (e.g. pr61634.c), The data flow is
+  completely wrong.  */
+   continue;
+
+  gcc_assert (bitmap_count_bits (i) == 1);
+  unsigned expr_index = bitmap_first_set_bit (i);
+  const vsetvl_info &info = *exprs[expr_index];
+  gcc_assert (info.valid_p ());
+  if (dump_file)
+   {
+ fprintf (dump_file,
+  "\n  Insert vsetvl info at edge(bb %u -> bb %u): ",
+  eg->src->index, eg->dest->index);
+ info.dump (dump_file, "");
+   }
+  rtl_profile_for_edge (eg);
+  start_sequence ();
+
+  insn_info *insn = info.get_insn ();
+  insert_vsetvl (EMIT_DIRECT, insn->rtl (), info);
+  rtx_insn *rinsn = get_insns ();
+  end_sequence ();
+  default_rtl_profile ();
+
+  /* We should not get an abnormal edge here.  */
+  gcc_assert (!(eg->flags & EDGE_ABNORMAL));
+  need_commit = true;
+  insert_insn_on_edge (rinsn, eg);
+}
+
+  /* Insert vsetvl info that was not deleted after lift up.  */
+  for (const bb_info *bb : crtl->ssa->bbs ())
+{
+  const vsetvl_block_info &block_info = get_block_info (bb);
+  if (!block_info.has_info ())
+   continue;
+
+  const vsetvl_info &footer_info = block_info.get_footer_info ();
+  insn_info *insn = footer_info.get_insn ();
+
+  if (footer_info.ignore_p ())
+   continue;
+
+  edge eg;
+  edge_iterator eg_iterator;
+  FOR_EACH_EDGE (eg, eg_iterator, bb->cfg_bb ()->succs)
+   {
+ gcc_assert (!(eg->flags & EDGE_ABNORMAL));
+ if (dump_file)
+   {
+ fprintf (
+   dump_file,
+   "\n  Insert missed vsetvl info at edge(bb %u -> bb %u): ",
+   eg->src->index, eg->dest->index);
+ footer_info.dump (dump_file, "");
+   }
+ start_sequence ();
+ insert_vsetvl (EMIT_DIRECT, insn->rtl (), footer_info);
+ rtx_insn *rinsn = get_insns ();
+ end_sequence ();
+ default_rtl_profile ();
+ insert_insn_on_edge (rinsn, eg);
+ need_commit = true;
+   }
+}
+
+  if (need_commit)
+commit_edge_inse

[PATCH V2 11/14] RISC-V: P11: Adjust vector_block_info to vsetvl_block_info class

2023-10-17 Thread Lehua Ding
This sub-patch adjusts the vector_block_info code and renames it to
vsetvl_block_info.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (class vsetvl_block_info): New.
* config/riscv/riscv-vsetvl.h (struct vector_block_info): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc | 55 +++-
 gcc/config/riscv/riscv-vsetvl.h  | 14 
 2 files changed, 54 insertions(+), 15 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index b5ed1ea774a..d91b0272d9f 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -85,7 +85,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "predict.h"
 #include "profile-count.h"
 #include "gcse.h"
-#include "riscv-vsetvl.h"

 using namespace rtl_ssa;
 using namespace riscv_vector;
@@ -1218,6 +1217,60 @@ public:
   }
 };

+class vsetvl_block_info
+{
+public:
+  /* The static execute probability of the demand info.  */
+  profile_probability probability;
+
+  auto_vec<vsetvl_info> infos;
+  vsetvl_info m_info;
+  bb_info *m_bb;
+
+  bool full_available;
+
+  vsetvl_block_info () : m_bb (nullptr), full_available (false)
+  {
+infos.safe_grow_cleared (0);
+m_info.set_empty ();
+  }
+  vsetvl_block_info (const vsetvl_block_info &other)
+: probability (other.probability), infos (other.infos.copy ()),
+  m_info (other.m_info), m_bb (other.m_bb)
+  {}
+
+  vsetvl_info &get_header_info ()
+  {
+gcc_assert (!empty_p ());
+return infos.is_empty () ? m_info : infos[0];
+  }
+  vsetvl_info &get_footer_info ()
+  {
+gcc_assert (!empty_p ());
+return infos.is_empty () ? m_info : infos[infos.length () - 1];
+  }
+  const vsetvl_info &get_header_info () const
+  {
+gcc_assert (!empty_p ());
+return infos.is_empty () ? m_info : infos[0];
+  }
+  const vsetvl_info &get_footer_info () const
+  {
+gcc_assert (!empty_p ());
+return infos.is_empty () ? m_info : infos[infos.length () - 1];
+  }
+
+  bool empty_p () const { return infos.is_empty () && !has_info (); }
+  bool has_info () const { return !m_info.empty_p (); }
+  void set_info (const vsetvl_info &info)
+  {
+gcc_assert (infos.is_empty ());
+m_info = info;
+m_info.set_bb (m_bb);
+  }
+  void set_empty_info () { m_info.set_empty (); }
+};
+
 class demand_system
 {
 private:
diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h
index 96e36403af7..16c84e0684b 100644
--- a/gcc/config/riscv/riscv-vsetvl.h
+++ b/gcc/config/riscv/riscv-vsetvl.h
@@ -55,19 +55,5 @@ enum def_type
   CLOBBER_DEF = 1 << 4
 };

-struct vector_block_info
-{
-  /* The local_dem vector insn_info of the block.  */
-  vector_insn_info local_dem;
-
-  /* The reaching_out vector insn_info of the block.  */
-  vector_insn_info reaching_out;
-
-  /* The static execute probability of the demand info.  */
-  profile_probability probability;
-
-  vector_block_info () = default;
-};
-
 } // namespace riscv_vector
 #endif
--
2.36.3



[PATCH V2 10/14] RISC-V: P10: Cleanup helper functions

2023-10-17 Thread Lehua Ding
This sub-patch deletes unused helper functions and reorganizes
the position of the remaining functions.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (debug): Removed.
(enum vsetvl_type): Moved.
(enum emit_type): Moved.
(vlmax_avl_p): Removed.
(vlmul_to_str): Moved.
(vlmax_avl_insn_p): Removed.
(policy_to_str): Moved.
(loop_basic_block_p): Removed.
(valid_sew_p): Removed.
(vsetvl_insn_p): Removed.
(vsetvl_vtype_change_only_p): Removed.
(after_or_same_p): Removed.
(before_p): Removed.
(anticipatable_occurrence_p): Removed.
(available_occurrence_p): Removed.
(insn_should_be_added_p):
(get_all_sets):  Moved.
(get_same_bb_set): Removed.
(gen_vsetvl_pat): Moved.
(calculate_vlmul): Moved.
(emit_vsetvl_insn): Moved.
(get_max_int_sew): New.
(eliminate_insn): Moved.
(get_max_float_sew): New.
(insert_vsetvl): Moved.
(count_regno_occurrences):
(get_vl_vtype_info): Removed.
(enum def_type): Moved.
(validate_change_or_fail): Moved.
(change_insn):  Removed.
(get_all_real_uses):  Removed.
(get_forward_read_vl_insn): Removed.
(get_backward_fault_first_load_insn): Removed.
(change_vsetvl_insn): Removed.
(avl_source_has_vsetvl_p): Removed.
(source_equal_p): Moved.
(calculate_sew): Moved.
(same_equiv_note_p): Moved.
(get_expr_id): New.
(demands_can_be_fused_p):
(get_regno): New.
(earliest_pred_can_be_fused_p):
(vsetvl_dominated_by_p):
(get_bb_index): New.
(has_no_uses): New.

---
 gcc/config/riscv/riscv-vsetvl.cc | 1283 ++
 1 file changed, 428 insertions(+), 855 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index ac636623b3f..b5ed1ea774a 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -18,60 +18,47 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */

-/*  This pass is to Set VL/VTYPE global status for RVV instructions
-that depend on VL and VTYPE registers by Lazy code motion (LCM).
-
-Strategy:
-
--  Backward demanded info fusion within block.
-
--  Lazy code motion (LCM) based demanded info backward propagation.
-
--  RTL_SSA framework for def-use, PHI analysis.
-
--  Lazy code motion (LCM) for global VL/VTYPE optimization.
-
-Assumption:
-
--  Each avl operand is either an immediate (must be in range 0 ~ 31) or reg.
-
-This pass consists of 5 phases:
-
--  Phase 1 - compute VL/VTYPE demanded information within each block
-   by backward data-flow analysis.
-
--  Phase 2 - Emit vsetvl instructions within each basic block according to
-   demand, compute and save ANTLOC && AVLOC of each block.
-
--  Phase 3 - LCM Earliest-edge baseed VSETVL demand fusion.
-
--  Phase 4 - Lazy code motion including: compute local properties,
-   pre_edge_lcm and vsetvl insertion && delete edges for LCM results.
-
--  Phase 5 - Cleanup AVL operand of RVV instruction since it will not be
-   used any more and VL operand of VSETVL instruction if it is not used by
-   any non-debug instructions.
-
--  Phase 6 - DF based post VSETVL optimizations.
-
-Implementation:
-
--  The subroutine of optimize == 0 is simple_vsetvl.
-   This function simplily vsetvl insertion for each RVV
-   instruction. No optimization.
-
--  The subroutine of optimize > 0 is lazy_vsetvl.
-   This function optimize vsetvl insertion process by
-   lazy code motion (LCM) layering on RTL_SSA.
-
--  get_avl (), get_insn (), get_avl_source ():
-
-   1. get_insn () is the current instruction, find_access (get_insn
-   ())->def is the same as get_avl_source () if get_insn () demand VL.
-   2. If get_avl () is non-VLMAX REG, get_avl () == get_avl_source
-   ()->regno ().
-   3. get_avl_source ()->regno () is the REGNO that we backward propagate.
- */
+/* The values of the vl and vtype registers will affect the behavior of RVV
+   insns. That is, when we need to execute an RVV instruction, we need to set
+   the correct vl and vtype values by executing the vsetvl instruction before.
+   Executing the fewest number of vsetvl instructions while keeping the behavior
+   the same is the problem this pass is trying to solve. This vsetvl pass is
+   divided into 5 phases:
+
+ - Phase 1 (fuse local vsetvl infos): traverses each Basic Block, parses
+   each instruction in it that affects vl and vtype state and generates an
+   array of vsetvl_info objects. Then traverse the vsetvl_info array from
+   front to back and perform fusion according to the fusion rules. The fused
+   v

[PATCH V2 14/14] RISC-V: P14: Adjust and add testcases

2023-10-17 Thread Lehua Ding
This sub-patch adjusts some testcases and adds some bugfix
testcases.

PR target/111037
PR target/111234
PR target/111725

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/scalar_move-1.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-23.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-46.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-89.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-95.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/pr109743-2.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/pr109773-1.c: Adjust.
* gcc.target/riscv/rvv/base/pr111037-1.c: Moved to...
* gcc.target/riscv/rvv/vsetvl/pr111037-1.c: ...here.
* gcc.target/riscv/rvv/base/pr111037-2.c: Moved to...
* gcc.target/riscv/rvv/vsetvl/pr111037-2.c: ...here.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-12.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vsetvl-13.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vsetvl-18.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-104.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-105.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr111037-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr111037-4.c: New test.

---
 .../gcc.target/riscv/rvv/base/scalar_move-1.c |  2 +-
 .../riscv/rvv/vsetvl/avl_single-104.c | 35 +++
 .../riscv/rvv/vsetvl/avl_single-105.c | 23 
 .../riscv/rvv/vsetvl/avl_single-23.c  |  7 ++--
 .../riscv/rvv/vsetvl/avl_single-46.c  |  3 +-
 .../riscv/rvv/vsetvl/avl_single-89.c  |  8 ++---
 .../riscv/rvv/vsetvl/avl_single-95.c  |  2 +-
 .../riscv/rvv/vsetvl/imm_bb_prop-1.c  |  7 ++--
 .../gcc.target/riscv/rvv/vsetvl/pr109743-2.c  |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/pr109773-1.c  |  2 +-
 .../riscv/rvv/{base => vsetvl}/pr111037-1.c   |  0
 .../riscv/rvv/{base => vsetvl}/pr111037-2.c   |  0
 .../gcc.target/riscv/rvv/vsetvl/pr111037-3.c  | 16 +
 .../gcc.target/riscv/rvv/vsetvl/pr111037-4.c  | 16 +
 .../riscv/rvv/vsetvl/vlmax_back_prop-25.c | 10 +++---
 .../riscv/rvv/vsetvl/vlmax_back_prop-26.c | 10 +++---
 .../riscv/rvv/vsetvl/vlmax_conflict-12.c  |  1 -
 .../riscv/rvv/vsetvl/vlmax_conflict-3.c   |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-13.c   |  4 +--
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-18.c   |  4 ++-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-23.c   |  2 +-
 21 files changed, 125 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-104.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-105.c
 rename gcc/testsuite/gcc.target/riscv/rvv/{base => vsetvl}/pr111037-1.c (100%)
 rename gcc/testsuite/gcc.target/riscv/rvv/{base => vsetvl}/pr111037-2.c (100%)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-4.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-1.c
index 18349132a88..c833d8989e9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-1.c
@@ -46,8 +46,8 @@ int32_t foo3 (int32_t *base, size_t vl)
 ** vl1re32\.v\tv[0-9]+,0\([a-x0-9]+\)
 ** vsetvli\tzero,[a-x0-9]+,e32,m1,t[au],m[au]
 ** vadd.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
-** vsetvli\tzero,[a-x0-9]+,e32,m2,t[au],m[au]
 ** vmv.x.s\t[a-x0-9]+,\s*v[0-9]+
+** vsetvli\tzero,[a-x0-9]+,e32,m2,t[au],m[au]
 ** vmv.v.x\tv[0-9]+,\s*[a-x0-9]+
 ** vmv.x.s\t[a-x0-9]+,\s*v[0-9]+
 ** ret
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-104.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-104.c
new file mode 100644
index 000..fb3577dcb98
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-104.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2 -fno-tree-vectorize" } */
+
+#include "riscv_vector.h"
+
+void
+foo (int cond, int vl, int *in, int *out, int n)
+{
+  if (cond > 30)
+{
+  vint32m1_t v = __riscv_vle32_v_i32m1 ((int32_t *) in, vl);
+  __riscv_vse32_v_i32m1 ((int32_t *) out, v, vl);
+}
+  else if (cond < 10)
+{
+  vint8mf4_t v = __riscv_vle8_v_i8mf4 ((int8_t *) in, vl);
+  v = __riscv_vle8_v_i8mf4_tu (v, (int8_t *) in + 10, vl);
+  __riscv_vse8_v_i8mf4 ((int8_t *) out, v, vl);
+}
+  else
+{
+  vl = vl * 2;
+}
+
+  for (int i = 0; i < n; i += 1)
+{
+  

[PATCH V2 13/14] RISC-V: P13: Reorganize functions used to modify RTL

2023-10-17 Thread Lehua Ding
This sub-patch reorganizes the functions used to modify RTL.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (has_no_uses): Moved.
(validate_change_or_fail): Moved.
(gen_vsetvl_pat): Removed.
(emit_vsetvl_insn): Removed.
(eliminate_insn): Removed.
(change_insn): Removed.
(change_vsetvl_insn): New.
(pre_vsetvl::emit_vsetvl): New.
(pre_vsetvl::remove_avl_operand): Adjust.
(pre_vsetvl::remove_unused_dest_operand): Adjust.
(pass_vsetvl::simple_vsetvl): Adjust.

---
 gcc/config/riscv/riscv-vsetvl.cc | 443 ---
 1 file changed, 176 insertions(+), 267 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index d91b0272d9f..78816cbee15 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -680,6 +680,30 @@ get_bb_index (unsigned expr_id, unsigned num_bb)
   return expr_id % num_bb;
 }

+/* Return true if the SET result is not used by any instructions.  */
+static bool
+has_no_uses (basic_block cfg_bb, rtx_insn *rinsn, int regno)
+{
+  if (bitmap_bit_p (df_get_live_out (cfg_bb), regno))
+return false;
+
+  rtx_insn *iter;
+  for (iter = NEXT_INSN (rinsn); iter && iter != NEXT_INSN (BB_END (cfg_bb));
+   iter = NEXT_INSN (iter))
+if (df_find_use (iter, regno_reg_rtx[regno]))
+  return false;
+
+  return true;
+}
+
+/* Change insn and Assert the change always happens.  */
+static void
+validate_change_or_fail (rtx object, rtx *loc, rtx new_rtx, bool in_group)
+{
+  bool change_p = validate_change (object, loc, new_rtx, in_group);
+  gcc_assert (change_p);
+}
+
 /* This flags indicates the minimum demand of the vl and vtype values by the
RVV instruction. For example, DEMAND_RATIO_P indicates that this RVV
instruction only needs the SEW/LMUL ratio to remain the same, and does not
@@ -1126,6 +1150,28 @@ public:
   }
   }

+  /* Returns the corresponding vsetvl rtx pat.  */
+  rtx get_vsetvl_pat (bool ignore_vl = false) const
+  {
+rtx avl = get_avl ();
+/* if optimization == 0 and the instruction is vmv.x.s/vfmv.f.s,
+   set the value of avl to (const_int 0) so that VSETVL PASS will
+   insert vsetvl correctly.*/
+if (!get_avl ())
+  avl = GEN_INT (0);
+rtx sew = gen_int_mode (get_sew (), Pmode);
+rtx vlmul = gen_int_mode (get_vlmul (), Pmode);
+rtx ta = gen_int_mode (get_ta (), Pmode);
+rtx ma = gen_int_mode (get_ma (), Pmode);
+
+if (change_vtype_only_p ())
+  return gen_vsetvl_vtype_change_only (sew, vlmul, ta, ma);
+else if (has_reg_vl () && !ignore_vl)
+  return gen_vsetvl (Pmode, get_vl (), avl, sew, vlmul, ta, ma);
+else
+  return gen_vsetvl_discard_result (Pmode, avl, sew, vlmul, ta, ma);
+  }
+
   bool operator== (const vsetvl_info &other) const
   {
 gcc_assert (!uninit_p () && !other.uninit_p ()
@@ -1938,199 +1984,6 @@ public:
   }
 };

-/* Emit vsetvl instruction.  */
-static rtx
-gen_vsetvl_pat (enum vsetvl_type insn_type, const vsetvl_info &info, rtx vl)
-{
-  rtx avl = info.get_avl ();
-  /* if optimization == 0 and the instruction is vmv.x.s/vfmv.f.s,
- set the value of avl to (const_int 0) so that VSETVL PASS will
- insert vsetvl correctly.*/
-  if (!info.get_avl ())
-avl = GEN_INT (0);
-  rtx sew = gen_int_mode (info.get_sew (), Pmode);
-  rtx vlmul = gen_int_mode (info.get_vlmul (), Pmode);
-  rtx ta = gen_int_mode (info.get_ta (), Pmode);
-  rtx ma = gen_int_mode (info.get_ma (), Pmode);
-
-  if (insn_type == VSETVL_NORMAL)
-{
-  gcc_assert (vl != NULL_RTX);
-  return gen_vsetvl (Pmode, vl, avl, sew, vlmul, ta, ma);
-}
-  else if (insn_type == VSETVL_VTYPE_CHANGE_ONLY)
-return gen_vsetvl_vtype_change_only (sew, vlmul, ta, ma);
-  else
-return gen_vsetvl_discard_result (Pmode, avl, sew, vlmul, ta, ma);
-}
-
-static rtx
-gen_vsetvl_pat (rtx_insn *rinsn, const vsetvl_info &info, rtx vl = NULL_RTX)
-{
-  rtx new_pat;
-  vsetvl_info new_info = info;
-  /* For vmv.x.s, use 0 for avl.  */
-  if (!info.get_avl ())
-{
-  new_info.set_avl (const0_rtx);
-  new_info.set_avl_def (nullptr);
-}
-
-  if (vl)
-new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, vl);
-  else
-{
-  if (vsetvl_insn_p (rinsn) && !info.change_vtype_only_p ())
-   new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, get_vl (rinsn));
-  else if (info.change_vtype_only_p ()
-  || INSN_CODE (rinsn) == CODE_FOR_vsetvl_vtype_change_only)
-   new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, new_info, NULL_RTX);
-  else
-   new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, new_info, NULL_RTX);
-}
-  return new_pat;
-}
-
-static void
-emit_vsetvl_insn (enum vsetvl_type insn_type, enum emit_type emit_type,
- const vsetvl_info &info, rtx vl, rtx_insn *rinsn)
-{
-  rtx pat = gen_vsetvl_pat (insn_type, info, vl);
-
-  if (emit_type == EMIT_DIRECT)
-

Re: [PATCH V2 00/14] Refactor and cleanup vsetvl pass

2023-10-17 Thread Lehua Ding
-O1 scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O2 scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -Os scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-65.c   -O2 scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-65.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-65.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O0 scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O1 scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O2 scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O3 -g scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -Os scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-10.c   -O3 -g scan-assembler-times vsetvli 15
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-11.c   -O3 -g scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-12.c   -O3 -g scan-assembler-times vsetvli 9
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-13.c   -O3 -g scan-assembler-times vsetvli 9
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-14.c   -O3 -g scan-assembler-times vsetvli 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-15.c   -O3 -g scan-assembler-times vsetvli 4
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-16.c   -O3 -g scan-assembler-times vsetvli 15
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-18.c   -O3 -g scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-19.c   -O3 -g scan-assembler-times vsetvli 15
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-9.c   -O3 -g scan-assembler-times vsetvli 15


On 10/17/23 04:34, Lehua Ding wrote:
This patch refactors and cleans up the vsetvl pass in order to make the code
easier to modify and understand. This patch does several things:

1. Introduce a virtual CFG for vsetvl infos. Phases 1, 2 and 3 only maintain
    and modify this virtual CFG. Phase 4 performs insertion, modification and
    deletion of vsetvl insns based on the virtual CFG. A basic block in the
    virtual CFG is called vsetvl_block_info and the vsetvl information inside
    it is called vsetvl_info.
2. Combine Phase 1 and 2 into a single Phase 1 and unify the demand system.
    This phase only fuses local vsetvl infos in the forward direction.
3. Refactor Phase 3: replace the logic for deciding whether to lift a vsetvl
    info into a predecessor basic block with a more unified check, namely
    whether some vsetvl info among the reaching vsetvl definitions is
    compatible with it.
4. Place all modifications of the RTL in Phase 4 and Phase 5.
    Phase 4 is responsible for inserting, modifying and deleting vsetvl
    instructions based on the fully optimized vsetvl infos. Phase 5 removes
    the avl operand from the RVV instructions and removes the unused dest
    operand register from the vsetvl insns.
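
Item 3 depends on knowing which vsetvl definitions can reach a given block. The classic reaching-definitions fixed point, sketched below in standalone C++, computes such a set. This is only an illustration of the general data-flow technique, not the code in riscv-vsetvl.cc: ToyBlock and reaching_in are hypothetical names, and each definition is modelled as one bit of a 32-bit mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy block: not the pass's vsetvl_block_info.
struct ToyBlock {
  std::uint32_t gen;               // definitions created in this block
  std::uint32_t kill;              // definitions overwritten by this block
  std::vector<std::size_t> preds;  // indices of predecessor blocks
};

// Iterate IN[b] = union of OUT[p] over predecessors p and
// OUT[b] = gen[b] | (IN[b] & ~kill[b]) until nothing changes.
// Returns IN for every block.
std::vector<std::uint32_t> reaching_in(const std::vector<ToyBlock> &cfg) {
  std::vector<std::uint32_t> in(cfg.size(), 0), out(cfg.size(), 0);
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 0; b < cfg.size(); ++b) {
      std::uint32_t new_in = 0;
      for (std::size_t p : cfg[b].preds)
        new_in |= out[p];
      std::uint32_t new_out = cfg[b].gen | (new_in & ~cfg[b].kill);
      if (new_in != in[b] || new_out != out[b]) {
        in[b] = new_in;
        out[b] = new_out;
        changed = true;
      }
    }
  }
  return in;
}
```

With a diamond-shaped CFG where one arm redefines the state and the other does not, both definitions reach the join block. In that situation a Phase 3 style check would have to look at every reaching definition before deciding that an info is compatible.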

These modifications resulted in some testcases needing to be updated. The reasons
for updating are summarized below:

1. more optimized
    vlmax_back_prop-25.c/vlmax_back_prop-26.c/vlmax_conflict-3.c/
    vlmax_conflict-12.c/vsetvl-13.c/vsetvl-23.c/
    avl_single-23.c/avl_single-89.c/avl_single-95.c/pr109773-1.c
2. less unnecessary fusion
    avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/vsetvl-18.c
3. local fuse direction (backward -> forward)
    scalar_move-1.c/
4. add some bugfix testcases.
    pr111037-3.c/pr111037-4.c
    avl_single-89.c

PR target/111037
    PR target/111234
PR target/111725


Lehua Ding (14):
   RISC-V: P1: Refactor avl_info/vl_vtype_info/vector_insn_info
   RISC-V: P2: Refactor and cleanup demand system
   RISC-V: P3: Refactor class vector_infos_manager to pre_vsetvl
   RISC-V: P4: move method from class pass_vsetvl to pre_vsetvl
   RISC-V: P5: combine phase 1 and 2 into a single phase
   RISC-V: P6: Add compute reaching definition data flow
   RISC-V: P7: Move earliest fuse and lcm code to pre_vsetvl class
   RISC-V: P8: Unified insert and delete of vsetvl insn into Phase 4
   RISC-V: P9: Cleanup post optimize phase
   RISC-V: P10: Cleanup helper functions
   RISC-V: P11:  Refactor v

Re: [PATCH] RISC-V: Optimize consecutive permutation index pattern by vrgather.vi/vx

2023-10-18 Thread Lehua Ding

Committed, thanks Robin.

On 2023/10/18 15:53, Robin Dapp wrote:

LGTM.

Regards
  Robin


--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai


Re: [PATCH V2 00/14] Refactor and cleanup vsetvl pass

2023-10-18 Thread Lehua Ding
s vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-37.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-37.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-37.c   -Os scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-38.c   -O1 scan-assembler-times vsetvli 4
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-38.c   -O2 scan-assembler-times vsetvli 4
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-38.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   scan-assembler-times vsetvli 4
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-38.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 4
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-47.c   -O1 scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-47.c   -O2 scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-47.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-47.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-47.c   -Os scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-48.c   -O1 scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-48.c   -O2 scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-48.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-48.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O1 scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O2 scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -Os scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-65.c   -O2 scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-65.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-65.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O0 scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O1 scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O2 scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O3 -g scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -Os scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-10.c   -O3 -g scan-assembler-times vsetvli 15
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-11.c   -O3 -g scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-12.c   -O3 -g scan-assembler-times vsetvli 9
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-13.c   -O3 -g scan-assembler-times vsetvli 9
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-14.c   -O3 -g scan-assembler-times vsetvli 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-15.c   -O3 -g scan-assembler-times vsetvli 4
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-16.c   -O3 -g scan-assembler-times vsetvli 15
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-18.c   -O3 -g scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-19.c   -O3 -g scan-assembler-times vsetvli 15
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-9.c   -O3 -g scan-assembler-times vsetvli 15


On 10/17/23 04:34, Lehua Ding wrote:
This patch refactors and cleans up the vsetvl pass in order to make the code
easier to modify and understand. This patch does several things:

1. Introduce a virtual CFG for vsetvl infos. Phases 1, 2 and 3 only maintain
    and modify this virtual CFG. Phase 4 performs insertion, modification and
    deletion of vsetvl insns based on the virtual CFG. A basic block in the
    virtual CFG is called vsetvl_block_info and the vsetvl information inside
    it is called vsetvl_info.
2. Combine Phase 1 and 2 into a single Phase 1 and unified the demand 
s

[PATCH V3 00/11] Refactor and cleanup vsetvl pass

2023-10-19 Thread Lehua Ding
This patch refactors and cleans up the vsetvl pass in order to make the code
easier to modify and understand. This patch does several things:

1. Introduce a virtual CFG for vsetvl infos. Phases 1, 2 and 3 only maintain
   and modify this virtual CFG. Phase 4 performs insertion, modification and
   deletion of vsetvl insns based on the virtual CFG. A basic block in the
   virtual CFG is called vsetvl_block_info and the vsetvl information inside
   it is called vsetvl_info.
2. Combine Phase 1 and 2 into a single Phase 1 and unify the demand system.
   This phase only fuses local vsetvl infos in the forward direction.
3. Refactor Phase 3: replace the logic for deciding whether to lift a vsetvl
   info into a predecessor basic block with a more unified check, namely
   whether some vsetvl info among the reaching vsetvl definitions is
   compatible with it.
4. Place all modifications of the RTL in Phase 4 and Phase 5.
   Phase 4 is responsible for inserting, modifying and deleting vsetvl
   instructions based on the fully optimized vsetvl infos. Phase 5 removes
   the avl operand from the RVV instructions and removes the unused dest
   operand register from the vsetvl insns.
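
The forward local fusion of item 2 can be pictured with a small standalone C++ sketch. It is an illustration only, not code from riscv-vsetvl.cc: ToyInfo, compatible, fuse and fuse_local are hypothetical names, and the single "exact" flag is an assumed simplification of the real demand system, which tracks AVL, SEW, LMUL, ratio and policy demands separately.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified stand-in for a vsetvl_info.
struct ToyInfo {
  int avl;     // requested vector length
  int sew;     // element width in bits
  int lmul;    // integer LMUL only, to keep the toy simple
  bool exact;  // true: needs exactly this SEW and LMUL;
               // false: only the SEW/LMUL ratio matters
  int ratio() const { return sew / lmul; }
};

// Two infos can share one vsetvl if the AVL matches and the vtype demands
// do not conflict (assumed toy rule).
bool compatible(const ToyInfo &a, const ToyInfo &b) {
  if (a.avl != b.avl)
    return false;
  if (a.exact && b.exact)
    return a.sew == b.sew && a.lmul == b.lmul;
  return a.ratio() == b.ratio();
}

// Merge two compatible infos, keeping the stricter demand.
ToyInfo fuse(const ToyInfo &a, const ToyInfo &b) {
  return a.exact ? a : b;
}

// Walk one block's infos front to back and fuse each info into the
// previous surviving one whenever the two are compatible.
std::vector<ToyInfo> fuse_local(const std::vector<ToyInfo> &infos) {
  std::vector<ToyInfo> out;
  for (const ToyInfo &next : infos) {
    if (!out.empty() && compatible(out.back(), next))
      out.back() = fuse(out.back(), next);
    else
      out.push_back(next);
  }
  return out;
}
```

In this toy model, a ratio-only demand followed by an exact demand with the same ratio collapses into one info carrying the exact SEW and LMUL, so a later phase would only need to emit a single vsetvl for the pair.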

These modifications resulted in some testcases needing to be updated. The reasons
for updating are summarized below:

1. more optimized
   vlmax_back_prop-25.c/vlmax_back_prop-26.c/vlmax_conflict-3.c/
   vlmax_conflict-12.c/vsetvl-13.c/vsetvl-23.c/
   avl_single-23.c/avl_single-89.c/avl_single-95.c/pr109773-1.c
2. less unnecessary fusion
   avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/vsetvl-18.c
3. local fuse direction (backward -> forward)
   scalar_move-1.c/
4. add some bugfix testcases.
   pr111037-3.c/pr111037-4.c
   avl_single-89.c

PR target/111037
PR target/111234
PR target/111725

Lehua Ding (11):
  RISC-V: P1: Refactor
avl_info/vl_vtype_info/vector_insn_info/vector_block_info
  RISC-V: P2: Refactor and cleanup demand system
  RISC-V: P3: Refactor vector_infos_manager
  RISC-V: P4: move method from pass_vsetvl to pre_vsetvl
  RISC-V: P5: combine phase 1 and 2
  RISC-V: P6: Add computing reaching definition data flow
  RISC-V: P7: Move earliest fuse and lcm code to pre_vsetvl class
  RISC-V: P8: Refactor emit-vsetvl phase and delete post optimization
  RISC-V: P9: Cleanup and reorganize helper functions
  RISC-V: P10: Delete riscv-vsetvl.h and adjust riscv-vsetvl.def
  RISC-V: P11: Adjust and add testcases

 gcc/config/riscv/riscv-vsetvl.cc  | 6502 +++--
 gcc/config/riscv/riscv-vsetvl.def |  641 +-
 gcc/config/riscv/riscv-vsetvl.h   |  488 --
 gcc/config/riscv/t-riscv  |2 +-
 .../gcc.target/riscv/rvv/base/scalar_move-1.c |2 +-
 .../riscv/rvv/vsetvl/avl_single-104.c |   35 +
 .../riscv/rvv/vsetvl/avl_single-105.c |   23 +
 .../riscv/rvv/vsetvl/avl_single-106.c |   34 +
 .../riscv/rvv/vsetvl/avl_single-107.c |   41 +
 .../riscv/rvv/vsetvl/avl_single-108.c |   41 +
 .../riscv/rvv/vsetvl/avl_single-109.c |   45 +
 .../riscv/rvv/vsetvl/avl_single-23.c  |7 +-
 .../riscv/rvv/vsetvl/avl_single-46.c  |3 +-
 .../riscv/rvv/vsetvl/avl_single-84.c  |5 +-
 .../riscv/rvv/vsetvl/avl_single-89.c  |8 +-
 .../riscv/rvv/vsetvl/avl_single-95.c  |2 +-
 .../riscv/rvv/vsetvl/imm_bb_prop-1.c  |7 +-
 .../gcc.target/riscv/rvv/vsetvl/pr109743-2.c  |2 +-
 .../gcc.target/riscv/rvv/vsetvl/pr109773-1.c  |2 +-
 .../riscv/rvv/{base => vsetvl}/pr111037-1.c   |0
 .../riscv/rvv/{base => vsetvl}/pr111037-2.c   |0
 .../gcc.target/riscv/rvv/vsetvl/pr111037-3.c  |   16 +
 .../gcc.target/riscv/rvv/vsetvl/pr111037-4.c  |   16 +
 .../riscv/rvv/vsetvl/vlmax_back_prop-25.c |   10 +-
 .../riscv/rvv/vsetvl/vlmax_back_prop-26.c |   10 +-
 .../riscv/rvv/vsetvl/vlmax_conflict-12.c  |1 -
 .../riscv/rvv/vsetvl/vlmax_conflict-3.c   |2 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-13.c   |4 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-18.c   |4 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-23.c   |2 +-
 30 files changed, 3263 insertions(+), 4692 deletions(-)
 delete mode 100644 gcc/config/riscv/riscv-vsetvl.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-104.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-105.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-106.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-107.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-108.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-109.c
 rename gcc/testsuite/gcc.target/riscv/rvv/{base => vsetvl}/pr111037-1.c (100%)
 rename gcc/testsuite/gcc.target/riscv/rvv/{base => vsetvl}/pr111037-2.c (100%)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-3.c
 create mode 10064

[PATCH V3 02/11] RISC-V: P2: Refactor and cleanup demand system

2023-10-19 Thread Lehua Ding
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (incompatible_avl_p): Removed.
(different_sew_p): Removed.
(different_lmul_p): Removed.
(different_ratio_p): Removed.
(different_tail_policy_p): Removed.
(different_mask_policy_p): Removed.
(possible_zero_avl_p): Removed.
(second_ratio_invalid_for_first_sew_p): Removed.
(second_ratio_invalid_for_first_lmul_p): Removed.
(float_insn_valid_sew_p): Removed.
(second_sew_less_than_first_sew_p): Removed.
(first_sew_less_than_second_sew_p): Removed.
(compare_lmul): Removed.
(second_lmul_less_than_first_lmul_p): Removed.
(second_ratio_less_than_first_ratio_p): Removed.
(DEF_INCOMPATIBLE_COND): Removed.
(greatest_sew): Removed.
(first_sew): Removed.
(second_sew): Removed.
(first_vlmul): Removed.
(second_vlmul): Removed.
(first_ratio): Removed.
(second_ratio): Removed.
(vlmul_for_first_sew_second_ratio): Removed.
(vlmul_for_greatest_sew_second_ratio): Removed.
(ratio_for_second_sew_first_vlmul): Removed.
(DEF_SEW_LMUL_FUSE_RULE): Removed.
(always_unavailable): Removed.
(avl_unavailable_p): Removed.
(sew_unavailable_p): Removed.
(lmul_unavailable_p): Removed.
(ge_sew_unavailable_p): Removed.
(ge_sew_lmul_unavailable_p): Removed.
(ge_sew_ratio_unavailable_p): Removed.
(DEF_UNAVAILABLE_COND): Removed.
(same_sew_lmul_demand_p): Removed.
(propagate_avl_across_demands_p): Removed.
(reg_available_p): Removed.
(support_relaxed_compatible_p): Removed.
(count_regno_occurrences): Removed.
(demands_can_be_fused_p): Removed.
(earliest_pred_can_be_fused_p): Removed.
(vsetvl_dominated_by_p): Removed.
(class demand_system): New.
(DEF_SEW_LMUL_RULE): New.
(DEF_POLICY_RULE): New.
(DEF_AVL_RULE): New.

---
 gcc/config/riscv/riscv-vsetvl.cc | 1158 +-
 1 file changed, 668 insertions(+), 490 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 8908071dc0d..c9f2f653247 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1091,496 +1091,6 @@ calculate_vlmul (unsigned int sew, unsigned int ratio)
   return LMUL_RESERVED;
 }
 
-static bool
-incompatible_avl_p (const vector_insn_info &info1,
-   const vector_insn_info &info2)
-{
-  return !info1.compatible_avl_p (info2) && !info2.compatible_avl_p (info1);
-}
-
-static bool
-different_sew_p (const vector_insn_info &info1, const vector_insn_info &info2)
-{
-  return info1.get_sew () != info2.get_sew ();
-}
-
-static bool
-different_lmul_p (const vector_insn_info &info1, const vector_insn_info &info2)
-{
-  return info1.get_vlmul () != info2.get_vlmul ();
-}
-
-static bool
-different_ratio_p (const vector_insn_info &info1, const vector_insn_info &info2)
-{
-  return info1.get_ratio () != info2.get_ratio ();
-}
-
-static bool
-different_tail_policy_p (const vector_insn_info &info1,
-const vector_insn_info &info2)
-{
-  return info1.get_ta () != info2.get_ta ();
-}
-
-static bool
-different_mask_policy_p (const vector_insn_info &info1,
-const vector_insn_info &info2)
-{
-  return info1.get_ma () != info2.get_ma ();
-}
-
-static bool
-possible_zero_avl_p (const vector_insn_info &info1,
-const vector_insn_info &info2)
-{
-  return !info1.has_non_zero_avl () || !info2.has_non_zero_avl ();
-}
-
-static bool
-second_ratio_invalid_for_first_sew_p (const vector_insn_info &info1,
- const vector_insn_info &info2)
-{
-  return calculate_vlmul (info1.get_sew (), info2.get_ratio ())
-== LMUL_RESERVED;
-}
-
-static bool
-second_ratio_invalid_for_first_lmul_p (const vector_insn_info &info1,
-  const vector_insn_info &info2)
-{
-  return calculate_sew (info1.get_vlmul (), info2.get_ratio ()) == 0;
-}
-
-static bool
-float_insn_valid_sew_p (const vector_insn_info &info, unsigned int sew)
-{
-  if (info.get_insn () && info.get_insn ()->is_real ()
-  && get_attr_type (info.get_insn ()->rtl ()) == TYPE_VFMOVFV)
-{
-  if (sew == 16)
-   return TARGET_VECTOR_ELEN_FP_16;
-  else if (sew == 32)
-   return TARGET_VECTOR_ELEN_FP_32;
-  else if (sew == 64)
-   return TARGET_VECTOR_ELEN_FP_64;
-}
-  return true;
-}
-
-static bool
-second_sew_less_than_first_sew_p (const vector_insn_info &info1,
- const vector_insn_info &info2)
-{
-  return info2.get_sew () < info1.get_sew ()
-|| !float_insn_valid_sew_p (info1, info2.get_sew ());
-}
-
-static bool
-first_sew_less_than_second_sew_p (const vector_insn_info &info1,
- const vector_insn_i

[PATCH V3 04/11] RISC-V: P4: move method from pass_vsetvl to pre_vsetvl

2023-10-19 Thread Lehua Ding
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::get_vector_info): Removed.
(pass_vsetvl::get_block_info): Removed.
(pass_vsetvl::update_vector_info): Removed.
(pass_vsetvl::update_block_info): Removed.
(pass_vsetvl::simple_vsetvl): Removed.
(pass_vsetvl::lazy_vsetvl): Removed.
(pass_vsetvl::execute): Removed.
(make_pass_vsetvl): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc | 207 ++-
 1 file changed, 96 insertions(+), 111 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index c73a84cb6bd..f8b708c248a 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2721,6 +2721,7 @@ public:
   }
 };
 
+
 const pass_data pass_data_vsetvl = {
   RTL_PASS, /* type */
   "vsetvl", /* name */
@@ -2736,54 +2737,8 @@ const pass_data pass_data_vsetvl = {
 class pass_vsetvl : public rtl_opt_pass
 {
 private:
-  vector_infos_manager *m_vector_manager;
-
-  const vector_insn_info &get_vector_info (const rtx_insn *) const;
-  const vector_insn_info &get_vector_info (const insn_info *) const;
-  const vector_block_info &get_block_info (const basic_block) const;
-  const vector_block_info &get_block_info (const bb_info *) const;
-  vector_block_info &get_block_info (const basic_block);
-  vector_block_info &get_block_info (const bb_info *);
-  void update_vector_info (const insn_info *, const vector_insn_info &);
-  void update_block_info (int, profile_probability, const vector_insn_info &);
-
-  void simple_vsetvl (void) const;
-  void lazy_vsetvl (void);
-
-  /* Phase 1.  */
-  void compute_local_backward_infos (const bb_info *);
-
-  /* Phase 2.  */
-  bool need_vsetvl (const vector_insn_info &, const vector_insn_info &) const;
-  void transfer_before (vector_insn_info &, insn_info *) const;
-  void transfer_after (vector_insn_info &, insn_info *) const;
-  void emit_local_forward_vsetvls (const bb_info *);
-
-  /* Phase 3.  */
-  bool earliest_fusion (void);
-  void vsetvl_fusion (void);
-
-  /* Phase 4.  */
-  void prune_expressions (void);
-  void compute_local_properties (void);
-  bool can_refine_vsetvl_p (const basic_block, const vector_insn_info &) const;
-  void refine_vsetvls (void) const;
-  void cleanup_vsetvls (void);
-  bool commit_vsetvls (void);
-  void pre_vsetvl (void);
-
-  /* Phase 5.  */
-  rtx_insn *get_vsetvl_at_end (const bb_info *, vector_insn_info *) const;
-  void local_eliminate_vsetvl_insn (const bb_info *) const;
-  bool global_eliminate_vsetvl_insn (const bb_info *) const;
-  void ssa_post_optimization (void) const;
-
-  /* Phase 6.  */
-  void df_post_optimization (void) const;
-
-  void init (void);
-  void done (void);
-  void compute_probabilities (void);
+  void simple_vsetvl ();
+  void lazy_vsetvl ();
 
 public:
   pass_vsetvl (gcc::context *ctxt) : rtl_opt_pass (pass_data_vsetvl, ctxt) {}
@@ -2793,69 +2748,11 @@ public:
   virtual unsigned int execute (function *) final override;
 }; // class pass_vsetvl
 
-const vector_insn_info &
-pass_vsetvl::get_vector_info (const rtx_insn *i) const
-{
-  return m_vector_manager->vector_insn_infos[INSN_UID (i)];
-}
-
-const vector_insn_info &
-pass_vsetvl::get_vector_info (const insn_info *i) const
-{
-  return m_vector_manager->vector_insn_infos[i->uid ()];
-}
-
-const vector_block_info &
-pass_vsetvl::get_block_info (const basic_block bb) const
-{
-  return m_vector_manager->vector_block_infos[bb->index];
-}
-
-const vector_block_info &
-pass_vsetvl::get_block_info (const bb_info *bb) const
-{
-  return m_vector_manager->vector_block_infos[bb->index ()];
-}
-
-vector_block_info &
-pass_vsetvl::get_block_info (const basic_block bb)
-{
-  return m_vector_manager->vector_block_infos[bb->index];
-}
-
-vector_block_info &
-pass_vsetvl::get_block_info (const bb_info *bb)
-{
-  return m_vector_manager->vector_block_infos[bb->index ()];
-}
-
-void
-pass_vsetvl::update_vector_info (const insn_info *i,
-const vector_insn_info &new_info)
-{
-  m_vector_manager->vector_insn_infos[i->uid ()] = new_info;
-}
-
-void
-pass_vsetvl::update_block_info (int index, profile_probability prob,
-   const vector_insn_info &new_info)
-{
-  m_vector_manager->vector_block_infos[index].probability = prob;
-  if (m_vector_manager->vector_block_infos[index].local_dem
-  == m_vector_manager->vector_block_infos[index].reaching_out)
-m_vector_manager->vector_block_infos[index].local_dem = new_info;
-  m_vector_manager->vector_block_infos[index].reaching_out = new_info;
-}
-
-/* Simple m_vsetvl_insert vsetvl for optimize == 0.  */
 void
-pass_vsetvl::simple_vsetvl (void) const
+pass_vsetvl::simple_vsetvl ()
 {
   if (dump_file)
-fprintf (dump_file,
-"\nEntering Simple VSETVL PASS and Handling %d basic blocks for "
-"function:%s\n",
-n_basic_blocks_for_fn (cfun), function_name (cfun));
+   

[PATCH V3 01/11] RISC-V: P1: Refactor avl_info/vl_vtype_info/vector_insn_info/vector_block_info

2023-10-19 Thread Lehua Ding
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (avl_info::avl_info): Removed.
(avl_info::single_source_equal_p): Removed.
(avl_info::multiple_source_equal_p): Removed.
(avl_info::operator=): Removed.
(avl_info::operator==): Removed.
(avl_info::operator!=): Removed.
(avl_info::has_non_zero_avl): Removed.
(vl_vtype_info::vl_vtype_info): Removed.
(vl_vtype_info::operator==): Removed.
(vl_vtype_info::operator!=): Removed.
(vl_vtype_info::same_avl_p): Removed.
(vl_vtype_info::same_vtype_p): Removed.
(enum demand_flags): New enum.
(vl_vtype_info::same_vlmax_p): Removed.
(vector_insn_info::operator>=): Removed.
(enum class): New demand_type.
(vector_insn_info::operator==): Removed.
(vector_insn_info::parse_insn): Removed.
(class vsetvl_info): New class.
(vector_insn_info::compatible_p): Removed.
(vector_insn_info::skip_avl_compatible_p): Removed.
(vector_insn_info::compatible_avl_p): Removed.
(vector_insn_info::compatible_vtype_p): Removed.
(vector_insn_info::available_p): Removed.
(vector_insn_info::fuse_avl): Removed.
(vector_insn_info::fuse_sew_lmul): Removed.
(vector_insn_info::fuse_tail_policy): Removed.
(vector_insn_info::fuse_mask_policy): Removed.
(vector_insn_info::local_merge): Removed.
(vector_insn_info::global_merge): Removed.
(vector_insn_info::get_avl_or_vl_reg): Removed.
(vector_insn_info::update_fault_first_load_avl): Removed.
(vlmul_to_str): Removed.
(policy_to_str): Removed.
(vector_insn_info::dump): Removed.
(class vsetvl_block_info): New class.

---
 gcc/config/riscv/riscv-vsetvl.cc | 1401 +-
 1 file changed, 602 insertions(+), 799 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 4b06d93e7f9..8908071dc0d 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1581,827 +1581,630 @@ vsetvl_dominated_by_p (const basic_block cfg_bb,
   return true;
 }
 
-avl_info::avl_info (const avl_info &other)
-{
-  m_value = other.get_value ();
-  m_source = other.get_source ();
-}
-
-avl_info::avl_info (rtx value_in, set_info *source_in)
-  : m_value (value_in), m_source (source_in)
-{}
-
-bool
-avl_info::single_source_equal_p (const avl_info &other) const
-{
-  set_info *set1 = m_source;
-  set_info *set2 = other.get_source ();
-  insn_info *insn1 = extract_single_source (set1);
-  insn_info *insn2 = extract_single_source (set2);
-  if (!insn1 || !insn2)
-return false;
-  return source_equal_p (insn1, insn2);
-}
-
-bool
-avl_info::multiple_source_equal_p (const avl_info &other) const
-{
-  /* When the def info is same in RTL_SSA namespace, it's safe
- to consider they are avl compatible.  */
-  if (m_source == other.get_source ())
-return true;
-
-  /* We only consider handle PHI node.  */
-  if (!m_source->insn ()->is_phi () || !other.get_source ()->insn ()->is_phi ())
-return false;
-
-  phi_info *phi1 = as_a<phi_info *> (m_source);
-  phi_info *phi2 = as_a<phi_info *> (other.get_source ());
-
-  if (phi1->is_degenerate () && phi2->is_degenerate ())
-{
-  /* Degenerate PHI means the PHI node only have one input.  */
-
-  /* If both PHI nodes have the same single input in use list.
-We consider they are AVL compatible.  */
-  if (phi1->input_value (0) == phi2->input_value (0))
-   return true;
-}
-  /* TODO: We can support more optimization cases in the future.  */
-  return false;
-}
-
-avl_info &
-avl_info::operator= (const avl_info &other)
-{
-  m_value = other.get_value ();
-  m_source = other.get_source ();
-  return *this;
-}
-
-bool
-avl_info::operator== (const avl_info &other) const
-{
-  if (!m_value)
-return !other.get_value ();
-  if (!other.get_value ())
-return false;
-
-  if (GET_CODE (m_value) != GET_CODE (other.get_value ()))
-return false;
-
-  /* Handle CONST_INT AVL.  */
-  if (CONST_INT_P (m_value))
-return INTVAL (m_value) == INTVAL (other.get_value ());
-
-  /* Handle VLMAX AVL.  */
-  if (vlmax_avl_p (m_value))
-return vlmax_avl_p (other.get_value ());
-  if (vlmax_avl_p (other.get_value ()))
-return false;
-
-  /* If any source is undef value, we think they are not equal.  */
-  if (!m_source || !other.get_source ())
-return false;
-
-  /* If both sources are single source (defined by a single real RTL)
- and their definitions are same.  */
-  if (single_source_equal_p (other))
-return true;
-
-  return multiple_source_equal_p (other);
-}
-
-bool
-avl_info::operator!= (const avl_info &other) const
-{
-  return !(*this == other);
-}
-
-bool
-avl_info::has_non_zero_avl () const
-{
-  if (has_avl_imm ())
-return INTVAL (get_value ()) > 0;
-  if (has_avl_reg ())
-return vlmax_avl_p (get_value ());
-  return false;
-}
-
-/* Ini

[PATCH V3 03/11] RISC-V: P3: Refactor vector_infos_manager

2023-10-19 Thread Lehua Ding
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc 
(vector_infos_manager::vector_infos_manager): Removed.
(vector_infos_manager::create_expr): Removed.
(class pre_vsetvl): New class.
(vector_infos_manager::get_expr_id): Removed.
(vector_infos_manager::all_same_ratio_p): Removed.
(vector_infos_manager::all_avail_in_compatible_p): Removed.
(vector_infos_manager::all_same_avl_p): Removed.
(vector_infos_manager::expr_set_num): Removed.
(vector_infos_manager::release): Removed.
(vector_infos_manager::create_bitmap_vectors): Removed.
(vector_infos_manager::free_bitmap_vectors): Removed.
(vector_infos_manager::dump): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc | 674 ++-
 1 file changed, 307 insertions(+), 367 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index c9f2f653247..c73a84cb6bd 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2384,402 +2384,342 @@ public:
   }
 };
 
-vector_infos_manager::vector_infos_manager ()
-{
-  vector_edge_list = nullptr;
-  vector_kill = nullptr;
-  vector_del = nullptr;
-  vector_insert = nullptr;
-  vector_antic = nullptr;
-  vector_transp = nullptr;
-  vector_comp = nullptr;
-  vector_avin = nullptr;
-  vector_avout = nullptr;
-  vector_antin = nullptr;
-  vector_antout = nullptr;
-  vector_earliest = nullptr;
-  vector_insn_infos.safe_grow_cleared (get_max_uid ());
-  vector_block_infos.safe_grow_cleared (last_basic_block_for_fn (cfun));
-  if (!optimize)
-{
-  basic_block cfg_bb;
-  rtx_insn *rinsn;
-  FOR_ALL_BB_FN (cfg_bb, cfun)
-   {
- vector_block_infos[cfg_bb->index].local_dem = vector_insn_info ();
- vector_block_infos[cfg_bb->index].reaching_out = vector_insn_info ();
- FOR_BB_INSNS (cfg_bb, rinsn)
-   vector_insn_infos[INSN_UID (rinsn)].parse_insn (rinsn);
-   }
-}
-  else
-{
-  for (const bb_info *bb : crtl->ssa->bbs ())
-   {
- vector_block_infos[bb->index ()].local_dem = vector_insn_info ();
- vector_block_infos[bb->index ()].reaching_out = vector_insn_info ();
- for (insn_info *insn : bb->real_insns ())
-   vector_insn_infos[insn->uid ()].parse_insn (insn);
- vector_block_infos[bb->index ()].probability = profile_probability ();
-   }
-}
-}
 
-void
-vector_infos_manager::create_expr (vector_insn_info &info)
+class pre_vsetvl
 {
-  for (size_t i = 0; i < vector_exprs.length (); i++)
-if (*vector_exprs[i] == info)
-  return;
-  vector_exprs.safe_push (&info);
-}
-
-size_t
-vector_infos_manager::get_expr_id (const vector_insn_info &info) const
-{
-  for (size_t i = 0; i < vector_exprs.length (); i++)
-if (*vector_exprs[i] == info)
-  return i;
-  gcc_unreachable ();
-}
-
-auto_vec
-vector_infos_manager::get_all_available_exprs (
-  const vector_insn_info &info) const
-{
-  auto_vec available_list;
-  for (size_t i = 0; i < vector_exprs.length (); i++)
-if (info.available_p (*vector_exprs[i]))
-  available_list.safe_push (i);
-  return available_list;
-}
-
-bool
-vector_infos_manager::all_same_ratio_p (sbitmap bitdata) const
-{
-  if (bitmap_empty_p (bitdata))
-return false;
+private:
+  demand_system m_dem;
+  auto_vec m_vector_block_infos;
 
-  int ratio = -1;
-  unsigned int bb_index;
-  sbitmap_iterator sbi;
+  /* data for avl reaching definition.  */
+  sbitmap m_avl_regs;
+  sbitmap *m_avl_def_in;
+  sbitmap *m_avl_def_out;
+  sbitmap *m_reg_def_loc;
+
+  /* data for vsetvl info reaching definition.  */
+  vsetvl_info m_unknow_info;
+  auto_vec m_vsetvl_def_exprs;
+  sbitmap *m_vsetvl_def_in;
+  sbitmap *m_vsetvl_def_out;
+
+  /* data for lcm */
+  auto_vec m_exprs;
+  sbitmap *m_avloc;
+  sbitmap *m_avin;
+  sbitmap *m_avout;
+  sbitmap *m_kill;
+  sbitmap *m_antloc;
+  sbitmap *m_transp;
+  sbitmap *m_insert;
+  sbitmap *m_del;
+  struct edge_list *m_edges;
+
+  auto_vec m_delete_list;
+
+  vsetvl_block_info &get_block_info (const bb_info *bb)
+  {
+return m_vector_block_infos[bb->index ()];
+  }
+  const vsetvl_block_info &get_block_info (const basic_block bb) const
+  {
+return m_vector_block_infos[bb->index];
+  }
 
-  EXECUTE_IF_SET_IN_BITMAP (bitdata, 0, bb_index, sbi)
-{
-  if (ratio == -1)
-   ratio = vector_exprs[bb_index]->get_ratio ();
-  else if (vector_exprs[bb_index]->get_ratio () != ratio)
-   return false;
-}
-  return true;
-}
+  vsetvl_block_info &get_block_info (const basic_block bb)
+  {
+return m_vector_block_infos[bb->index];
+  }
 
-/* Return TRUE if the incoming vector configuration state
-   to CFG_BB is compatible with the vector configuration
-   state in CFG_BB, FALSE otherwise.  */
-bool
-vector_infos_manager::all_avail_in_compatible_p (const basic_block cfg_bb) const
-{
-  const auto &info = vector_block_infos[cfg_bb->index].local_dem;
-  sbit

[PATCH V3 05/11] RISC-V: P5: Combine phase 1 and 2

2023-10-19 Thread Lehua Ding
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::fuse_local_vsetvl_info): 
New.
(pass_vsetvl::compute_local_backward_infos): Removed.
(pass_vsetvl::need_vsetvl): Removed.
(pass_vsetvl::transfer_before): Removed.
(pass_vsetvl::transfer_after): Removed.
(pass_vsetvl::emit_local_forward_vsetvls): Removed.

---
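Reviewer note, not part of the patch: the single local phase added here walks
the per-insn infos of one block from front to back while keeping a running
prev_info. The following is only a toy C++ model of that loop. Info, its
sew/avl fields and the three helper rules are hypothetical stand-ins for
vsetvl_info and the demand_system rules, and the final push of prev_info after
the loop is an assumption, since the end of the function is not shown here.

```cpp
#include <cassert>
#include <optional>
#include <vector>

enum class Kind { Empty, Valid, Unknown };

// Toy stand-in for vsetvl_info (hypothetical): a field is "demanded"
// only when it holds a value.
struct Info {
  Kind kind = Kind::Empty;
  std::optional<int> sew;
  std::optional<int> avl;
};

// prev already satisfies curr's demand on this field.
static bool field_available(const std::optional<int> &prev,
                            const std::optional<int> &curr) {
  return !curr || (prev && *prev == *curr);
}

// The two demands do not contradict each other on this field.
static bool field_compatible(const std::optional<int> &a,
                             const std::optional<int> &b) {
  return !a || !b || *a == *b;
}

bool available_p(const Info &prev, const Info &curr) {
  return field_available(prev.sew, curr.sew)
         && field_available(prev.avl, curr.avl);
}

bool compatible_p(const Info &prev, const Info &curr) {
  return field_compatible(prev.sew, curr.sew)
         && field_compatible(prev.avl, curr.avl);
}

// Fold curr's demands into prev (only called when compatible_p holds).
void merge(Info &prev, const Info &curr) {
  if (!prev.sew) prev.sew = curr.sew;
  if (!prev.avl) prev.avl = curr.avl;
}

// Front-to-back fusion over the infos of one basic block.
std::vector<Info> fuse_local(const std::vector<Info> &infos) {
  std::vector<Info> out;
  Info prev;  // starts out Empty
  for (const Info &curr : infos) {
    assert(curr.kind != Kind::Empty);  // inputs are Valid or Unknown only
    if (prev.kind == Kind::Empty) {
      prev = curr;
    } else if (prev.kind != curr.kind) {
      // Valid followed by Unknown or the reverse: never fused.
      out.push_back(prev);
      prev = curr;
    } else if (curr.kind == Kind::Valid) {
      if (available_p(prev, curr)) {
        // curr is redundant, keep prev unchanged.
      } else if (compatible_p(prev, curr)) {
        merge(prev, curr);
      } else {
        out.push_back(prev);
        prev = curr;
      }
    }
    // Two Unknown infos in a row: keep the first one.
  }
  if (prev.kind != Kind::Empty)
    out.push_back(prev);  // assumed final flush
  return out;
}
```

In this model a block with demands (sew 32), (avl 4), (sew 32, avl 4) collapses
to a single info, which is the effect the patch is after: fewer vsetvl
candidates reach the global phases.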
 gcc/config/riscv/riscv-vsetvl.cc | 270 ++-
 1 file changed, 124 insertions(+), 146 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index f8b708c248a..dad3d7c941e 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2722,6 +2722,130 @@ public:
 };
 
 
+void
+pre_vsetvl::fuse_local_vsetvl_info ()
+{
+  m_reg_def_loc
+= sbitmap_vector_alloc (last_basic_block_for_fn (cfun), GP_REG_LAST + 1);
+  bitmap_vector_clear (m_reg_def_loc, last_basic_block_for_fn (cfun));
+  bitmap_ones (m_reg_def_loc[ENTRY_BLOCK_PTR_FOR_FN (cfun)->index]);
+
+  for (bb_info *bb : crtl->ssa->bbs ())
+{
+  auto &block_info = get_block_info (bb);
+  block_info.m_bb = bb;
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "  Try fuse basic block %d\n", bb->index ());
+   }
+  auto_vec infos;
+  for (insn_info *insn : bb->real_nondebug_insns ())
+   {
+ vsetvl_info curr_info = vsetvl_info (insn);
+ if (curr_info.valid_p () || curr_info.unknown_p ())
+   infos.safe_push (curr_info);
+
+ /* Collecting GP registers modified by the current bb.  */
+ if (insn->is_real ())
+   for (def_info *def : insn->defs ())
+ if (def->is_reg () && GP_REG_P (def->regno ()))
+   bitmap_set_bit (m_reg_def_loc[bb->index ()], def->regno ());
+   }
+
+  vsetvl_info prev_info = vsetvl_info ();
+  prev_info.set_empty ();
+  for (auto &curr_info : infos)
+   {
+ if (prev_info.empty_p ())
+   prev_info = curr_info;
+ else if ((curr_info.unknown_p () && prev_info.valid_p ())
+  || (curr_info.valid_p () && prev_info.unknown_p ()))
+   {
+ block_info.infos.safe_push (prev_info);
+ prev_info = curr_info;
+   }
+ else if (curr_info.valid_p () && prev_info.valid_p ())
+   {
+ if (m_dem.available_p (prev_info, curr_info))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file,
+  "Ignore curr info since prev info "
+  "available with it:\n");
+ fprintf (dump_file, "  prev_info: ");
+ prev_info.dump (dump_file, "");
+ fprintf (dump_file, "  curr_info: ");
+ curr_info.dump (dump_file, "");
+ fprintf (dump_file, "\n");
+   }
+ if (!curr_info.vl_use_by_non_rvv_insn_p ()
+ && vsetvl_insn_p (curr_info.get_insn ()->rtl ()))
+   m_delete_list.safe_push (curr_info);
+
+ if (curr_info.get_read_vl_insn ())
+   prev_info.set_read_vl_insn (curr_info.get_read_vl_insn ());
+   }
+ else if (m_dem.compatible_p (prev_info, curr_info))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "Fuse curr info since prev info "
+ "compatible with it:\n");
+ fprintf (dump_file, "  prev_info: ");
+ prev_info.dump (dump_file, "");
+ fprintf (dump_file, "  curr_info: ");
+ curr_info.dump (dump_file, "");
+   }
+ m_dem.merge (prev_info, curr_info);
+ if (curr_info.get_read_vl_insn ())
+   prev_info.set_read_vl_insn (curr_info.get_read_vl_insn ());
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "  prev_info after fused: ");
+ prev_info.dump (dump_file, "");
+ fprintf (dump_file, "\n");
+   }
+   }
+ else
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file,
+  "Cannot fuse uncompatible infos:\n");
+ fprintf (dump_file, "  prev_info: ");
+ prev_info.dump (dump_file, "   ");
+ fprintf (dump_file, "  curr_info: ");
+ curr_info.dump (dump_file, "   ");
+   }
+ block_inf

[PATCH V3 08/11] RISC-V: P8: Refactor emit-vsetvl phase and delete post optimization

2023-10-19 Thread Lehua Ding
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): New.
(pre_vsetvl::cleaup): New.
(pre_vsetvl::remove_avl_operand): New.
(pre_vsetvl::remove_unused_dest_operand): New.
(pass_vsetvl::get_vsetvl_at_end): Removed.
(local_avl_compatible_p): Removed.
(pass_vsetvl::local_eliminate_vsetvl_insn): Removed.
(get_first_vsetvl_before_rvv_insns): Removed.
(pass_vsetvl::global_eliminate_vsetvl_insn): Removed.
(pass_vsetvl::ssa_post_optimization): Removed.
(has_no_uses): Removed.
(pass_vsetvl::df_post_optimization): Removed.
(pass_vsetvl::init): Removed.
(pass_vsetvl::done): Removed.
(pass_vsetvl::compute_probabilities): Removed.
(pass_vsetvl::lazy_vsetvl): Removed.
(pass_vsetvl::execute): Removed.
(make_pass_vsetvl): Removed.

---
 gcc/config/riscv/riscv-vsetvl.cc | 878 +++
 1 file changed, 203 insertions(+), 675 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 855edd6d0f5..06d02d25cb3 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3601,6 +3601,209 @@ pre_vsetvl::pre_global_vsetvl_info ()
 }
 }
 
+void
+pre_vsetvl::emit_vsetvl ()
+{
+  bool need_commit = false;
+
+  for (const bb_info *bb : crtl->ssa->bbs ())
+{
+  for (const auto &curr_info : get_block_info (bb).infos)
+   {
+ insn_info *insn = curr_info.get_insn ();
+ if (curr_info.delete_p ())
+   {
+ if (vsetvl_insn_p (insn->rtl ()))
+   remove_vsetvl_insn (curr_info);
+ continue;
+   }
+ else if (curr_info.valid_p ())
+   {
+ if (vsetvl_insn_p (insn->rtl ()))
+   {
+ const vsetvl_info temp = vsetvl_info (insn);
+ if (!(curr_info == temp))
+   {
+ if (dump_file)
+   {
+ fprintf (dump_file, "\n  Change vsetvl info from: ");
+ temp.dump (dump_file, "");
+ fprintf (dump_file, "  to: ");
+ curr_info.dump (dump_file, "");
+   }
+ change_vsetvl_insn (curr_info);
+   }
+   }
+ else
+   {
+ if (dump_file)
+   {
+ fprintf (dump_file,
+  "\n  Insert vsetvl info before insn %d: ",
+  insn->uid ());
+ curr_info.dump (dump_file, "");
+   }
+ insert_vsetvl_insn (EMIT_BEFORE, curr_info);
+   }
+   }
+   }
+}
+
+  for (const vsetvl_info &item : m_delete_list)
+{
+  gcc_assert (vsetvl_insn_p (item.get_insn ()->rtl ()));
+  remove_vsetvl_insn (item);
+}
+
+  /* Insert vsetvl insns as LCM suggests.  */
+  for (int ed = 0; ed < NUM_EDGES (m_edges); ed++)
+{
+  edge eg = INDEX_EDGE (m_edges, ed);
+  sbitmap i = m_insert[ed];
+  if (bitmap_count_bits (i) < 1)
+   continue;
+
+  if (bitmap_count_bits (i) > 1)
+   /* For code with infinite loop (e.g. pr61634.c), The data flow is
+  completely wrong.  */
+   continue;
+
+  gcc_assert (bitmap_count_bits (i) == 1);
+  unsigned expr_index = bitmap_first_set_bit (i);
+  const vsetvl_info &info = *m_exprs[expr_index];
+  gcc_assert (info.valid_p ());
+  if (dump_file)
+   {
+ fprintf (dump_file,
+  "\n  Insert vsetvl info at edge(bb %u -> bb %u): ",
+  eg->src->index, eg->dest->index);
+ info.dump (dump_file, "");
+   }
+  rtl_profile_for_edge (eg);
+  start_sequence ();
+
+  insert_vsetvl_insn (EMIT_DIRECT, info);
+  rtx_insn *rinsn = get_insns ();
+  end_sequence ();
+  default_rtl_profile ();
+
+  /* We should not get an abnormal edge here.  */
+  gcc_assert (!(eg->flags & EDGE_ABNORMAL));
+  need_commit = true;
+  insert_insn_on_edge (rinsn, eg);
+}
+
+  /* Insert vsetvl info that was not deleted after lift up.  */
+  for (const bb_info *bb : crtl->ssa->bbs ())
+{
+  const vsetvl_block_info &block_info = get_block_info (bb);
+  if (!block_info.has_info ())
+   continue;
+
+  const vsetvl_info &footer_info = block_info.get_exit_info ();
+
+  if (footer_info.delete_p ())
+   continue;
+
+  edge eg;
+  edge_iterator eg_iterator;
+  FOR_EACH_EDGE (eg, eg_iterator, bb->cfg_bb ()->succs)
+   {
+ gcc_assert (!(eg->flags & EDGE_ABNORMAL));
+ if (dump_file)
+   {
+ fprintf (
+   dump_file,
+   "\n  Insert missed vsetvl info at edge(bb %u -> bb %u): ",
+   eg->src->index, eg->dest->index);
+   

[PATCH V3 09/11] RISC-V: P9: Cleanup and reorganize helper functions

2023-10-19 Thread Lehua Ding
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (debug): Removed.
(bitmap_union_of_preds_with_entry): New.
(compute_reaching_defintion): New.
(vlmax_avl_p): New.
(enum vsetvl_type): Moved.
(enum emit_type): Moved.
(vlmul_to_str): Moved.
(vlmax_avl_insn_p): Moved.
(policy_to_str): Moved.
(loop_basic_block_p): Removed.
(valid_sew_p): Removed.
(vsetvl_insn_p): Moved.
(vsetvl_vtype_change_only_p): Removed.
(after_or_same_p): Removed.
(before_p): Removed.
(anticipatable_occurrence_p): Removed.
(available_occurrence_p): Removed.
(insn_should_be_added_p): Moved.
(get_all_sets): Moved.
(get_same_bb_set): Moved.
(gen_vsetvl_pat): Removed.
(emit_vsetvl_insn): Removed.
(eliminate_insn): Removed.
(calculate_vlmul): Moved.
(insert_vsetvl): Removed.
(get_max_int_sew): New.
(get_vl_vtype_info): Removed.
(get_max_float_sew): New.
(count_regno_occurrences): Moved.
(enum def_type): Moved.
(validate_change_or_fail): Moved.
(change_insn): Removed.
(get_all_real_uses): New.
(get_forward_read_vl_insn): Removed.
(get_backward_fault_first_load_insn): Removed.
(change_vsetvl_insn): Removed.
(avl_source_has_vsetvl_p): Moved.
(source_equal_p): Moved.
(same_equiv_note_p): Moved.
(calculate_sew): Moved.
(get_expr_id): New.
(get_regno): New.
(get_bb_index): New.
(has_no_uses): Moved.

---
 gcc/config/riscv/riscv-vsetvl.cc | 1153 ++
 1 file changed, 383 insertions(+), 770 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 06d02d25cb3..e136351aee5 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -18,60 +18,47 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
-/*  This pass is to Set VL/VTYPE global status for RVV instructions
-that depend on VL and VTYPE registers by Lazy code motion (LCM).
-
-Strategy:
-
--  Backward demanded info fusion within block.
-
--  Lazy code motion (LCM) based demanded info backward propagation.
-
--  RTL_SSA framework for def-use, PHI analysis.
-
--  Lazy code motion (LCM) for global VL/VTYPE optimization.
-
-Assumption:
-
--  Each avl operand is either an immediate (must be in range 0 ~ 31) or reg.
-
-This pass consists of 5 phases:
-
--  Phase 1 - compute VL/VTYPE demanded information within each block
-   by backward data-flow analysis.
-
--  Phase 2 - Emit vsetvl instructions within each basic block according to
-   demand, compute and save ANTLOC && AVLOC of each block.
-
--  Phase 3 - LCM Earliest-edge baseed VSETVL demand fusion.
-
--  Phase 4 - Lazy code motion including: compute local properties,
-   pre_edge_lcm and vsetvl insertion && delete edges for LCM results.
-
--  Phase 5 - Cleanup AVL operand of RVV instruction since it will not be
-   used any more and VL operand of VSETVL instruction if it is not used by
-   any non-debug instructions.
-
--  Phase 6 - DF based post VSETVL optimizations.
-
-Implementation:
-
--  The subroutine of optimize == 0 is simple_vsetvl.
-   This function simplily vsetvl insertion for each RVV
-   instruction. No optimization.
-
--  The subroutine of optimize > 0 is lazy_vsetvl.
-   This function optimize vsetvl insertion process by
-   lazy code motion (LCM) layering on RTL_SSA.
-
--  get_avl (), get_insn (), get_avl_source ():
-
-   1. get_insn () is the current instruction, find_access (get_insn
-   ())->def is the same as get_avl_source () if get_insn () demand VL.
-   2. If get_avl () is non-VLMAX REG, get_avl () == get_avl_source
-   ()->regno ().
-   3. get_avl_source ()->regno () is the REGNO that we backward propagate.
- */
+/* The values of the vl and vtype registers will affect the behavior of RVV
+   insns. That is, when we need to execute an RVV instruction, we need to set
+   the correct vl and vtype values by executing the vsetvl instruction before.
+   Executing the fewest number of vsetvl instructions while keeping the behavior
+   the same is the problem this pass is trying to solve. This vsetvl pass is
+   divided into 5 phases:
+
+ - Phase 1 (fuse local vsetvl infos): traverses each Basic Block, parses
+   each instruction in it that affects vl and vtype state and generates an
+   array of vsetvl_info objects. Then traverse the vsetvl_info array from
+   front to back and perform fusion according to the fusion rules. The fused
+   vsetvl infos are stored in the vsetvl_block_info object's `infos` field.
+
+ - Phase 2 (earliest fuse g

[PATCH V3 07/11] RISC-V: P7: Move earliest fuse and lcm code to pre_vsetvl class

2023-10-19 Thread Lehua Ding
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::earliest_fuse_vsetvl_info): 
New.
(pre_vsetvl::pre_global_vsetvl_info): New.
(pass_vsetvl::prune_expressions): Removed.
(pass_vsetvl::compute_local_properties): Removed.
(pass_vsetvl::earliest_fusion): Removed.
(pass_vsetvl::vsetvl_fusion): Removed.
(pass_vsetvl::can_refine_vsetvl_p): Removed.
(pass_vsetvl::refine_vsetvls): Removed.
(pass_vsetvl::cleanup_vsetvls): Removed.
(pass_vsetvl::commit_vsetvls): Removed.
(pass_vsetvl::pre_vsetvl): Removed.

---
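Reviewer note, not part of the patch: earliest_fuse_vsetvl_info feeds the
antin/antout/avout/kill bitmaps into compute_earliest and then fuses along the
edges whose earliest set is non-empty. As a reading aid, here is a small C++
sketch of the usual per-edge "earliest" predicate from lazy code motion.
ExprSet, kMaxExprs and earliest_on_edge are hypothetical names, and the formula
is my understanding of what the LCM helper evaluates, not something taken from
this patch.

```cpp
#include <bitset>
#include <cstddef>

constexpr std::size_t kMaxExprs = 64;   // hypothetical upper bound
using ExprSet = std::bitset<kMaxExprs>;

// Expressions whose earliest safe placement is the edge pred -> succ:
//   anticipated at the entry of succ, not already available out of pred,
//   and either killed in pred or not anticipated at the exit of pred
//   (so they cannot be hoisted any further up).
ExprSet earliest_on_edge(bool pred_is_entry, bool succ_is_exit,
                         const ExprSet &antin_succ,
                         const ExprSet &antout_pred,
                         const ExprSet &avout_pred,
                         const ExprSet &kill_pred) {
  if (pred_is_entry)
    return antin_succ;      // nothing is available before the entry block
  if (succ_is_exit)
    return ExprSet{};       // nothing is anticipated after the exit block
  return (antin_succ & ~avout_pred) & (kill_pred | ~antout_pred);
}
```

The patch then iterates over the set bits of each edge's result, which is why
edges with an empty earliest set are skipped up front.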
 gcc/config/riscv/riscv-vsetvl.cc | 1004 +++---
 1 file changed, 361 insertions(+), 643 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 27d47d7c039..855edd6d0f5 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2721,7 +2721,6 @@ public:
   }
 };
 
-
 void
 pre_vsetvl::compute_avl_def_data ()
 {
@@ -3241,6 +3240,367 @@ pre_vsetvl::fuse_local_vsetvl_info ()
 }
 
 
+bool
+pre_vsetvl::earliest_fuse_vsetvl_info ()
+{
+  compute_avl_def_data ();
+  compute_vsetvl_def_data ();
+  compute_lcm_local_properties ();
+
+  unsigned num_exprs = m_exprs.length ();
+  struct edge_list *m_edges = create_edge_list ();
+  unsigned num_edges = NUM_EDGES (m_edges);
+  sbitmap *antin
+= sbitmap_vector_alloc (last_basic_block_for_fn (cfun), num_exprs);
+  sbitmap *antout
+= sbitmap_vector_alloc (last_basic_block_for_fn (cfun), num_exprs);
+
+  sbitmap *earliest = sbitmap_vector_alloc (num_edges, num_exprs);
+
+  compute_available (m_avloc, m_kill, m_avout, m_avin);
+  compute_antinout_edge (m_antloc, m_transp, antin, antout);
+  compute_earliest (m_edges, num_exprs, antin, antout, m_avout, m_kill,
+   earliest);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+{
+  fprintf (dump_file, "\n  Compute LCM earliest insert data:\n\n");
+  fprintf (dump_file, "Expression List (%u):\n", num_exprs);
+  for (unsigned i = 0; i < num_exprs; i++)
+   {
+ const auto &info = *m_exprs[i];
+ fprintf (dump_file, "  Expr[%u]: ", i);
+ info.dump (dump_file, "");
+   }
+  fprintf (dump_file, "\nbitmap data:\n");
+  for (const bb_info *bb : crtl->ssa->bbs ())
+   {
+ unsigned int i = bb->index ();
+ fprintf (dump_file, "  BB %u:\n", i);
+ fprintf (dump_file, "avloc: ");
+ dump_bitmap_file (dump_file, m_avloc[i]);
+ fprintf (dump_file, "kill: ");
+ dump_bitmap_file (dump_file, m_kill[i]);
+ fprintf (dump_file, "antloc: ");
+ dump_bitmap_file (dump_file, m_antloc[i]);
+ fprintf (dump_file, "transp: ");
+ dump_bitmap_file (dump_file, m_transp[i]);
+
+ fprintf (dump_file, "avin: ");
+ dump_bitmap_file (dump_file, m_avin[i]);
+ fprintf (dump_file, "avout: ");
+ dump_bitmap_file (dump_file, m_avout[i]);
+ fprintf (dump_file, "antin: ");
+ dump_bitmap_file (dump_file, antin[i]);
+ fprintf (dump_file, "antout: ");
+ dump_bitmap_file (dump_file, antout[i]);
+   }
+  fprintf (dump_file, "\n");
+  fprintf (dump_file, "  earliest:\n");
+  for (unsigned ed = 0; ed < num_edges; ed++)
+   {
+ edge eg = INDEX_EDGE (m_edges, ed);
+
+ if (bitmap_empty_p (earliest[ed]))
+   continue;
+ fprintf (dump_file, "Edge(bb %u -> bb %u): ", eg->src->index,
+  eg->dest->index);
+ dump_bitmap_file (dump_file, earliest[ed]);
+   }
+  fprintf (dump_file, "\n");
+}
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+{
+  fprintf (dump_file, "Fused global info result:\n");
+}
+
+  bool changed = false;
+  for (unsigned ed = 0; ed < num_edges; ed++)
+{
+  sbitmap e = earliest[ed];
+  if (bitmap_empty_p (e))
+   continue;
+
+  unsigned int expr_index;
+  sbitmap_iterator sbi;
+  EXECUTE_IF_SET_IN_BITMAP (e, 0, expr_index, sbi)
+   {
+ vsetvl_info &curr_info = *m_exprs[expr_index];
+ if (!curr_info.valid_p ())
+   continue;
+
+ edge eg = INDEX_EDGE (m_edges, ed);
+ if (eg->probability == profile_probability::never ())
+   continue;
+ if (eg->src == ENTRY_BLOCK_PTR_FOR_FN (cfun)
+ || eg->dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
+   continue;
+
+ vsetvl_block_info &src_block_info = get_block_info (eg->src);
+ vsetvl_block_info &dest_block_info = get_block_info (eg->dest);
+
+ if (src_block_info.probability
+ == profile_probability::uninitialized ())
+   continue;
+
+ if (src_block_info.empty_p ())
+   {
+ vsetvl_info new_curr_info = curr_info;
+ new_curr_info.set_bb (crtl-

[PATCH V3 06/11] RISC-V: P6: Add computing reaching definition data flow

2023-10-19 Thread Lehua Ding
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::compute_avl_def_data): New.
(pre_vsetvl::compute_vsetvl_def_data): New.
(pre_vsetvl::compute_lcm_local_properties): New.

---
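Reviewer note, not part of the patch: the new functions build gen/kill sets per
block and run a reaching-definition data flow over (register, block) pairs. The
sketch below shows the generic fixed-point iteration in plain C++. The id
encoding regno * num_bbs + bb_index is inferred from the
`regno * num_bbs` kill range used in the diff and should be read as an
assumption, as should every name here except get_expr_id, get_regno and
get_bb_index, which only mirror the helper names from the series.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Bits = std::vector<bool>;

// Assumed encoding of a (block, register) definition as one bit index.
unsigned get_expr_id(unsigned bb_index, unsigned regno, unsigned num_bbs) {
  return regno * num_bbs + bb_index;
}
unsigned get_regno(unsigned expr_id, unsigned num_bbs) {
  return expr_id / num_bbs;
}
unsigned get_bb_index(unsigned expr_id, unsigned num_bbs) {
  return expr_id % num_bbs;
}

// Forward "may" data flow, iterated to a fixed point:
//   in[b]  = union of out[p] over all predecessors p of b
//   out[b] = gen[b] | (in[b] & ~kill[b])
// Bits already set in out[] (e.g. seeds for the entry block) are kept.
void reaching_definitions(const std::vector<std::vector<unsigned>> &preds,
                          const std::vector<Bits> &gen,
                          const std::vector<Bits> &kill,
                          std::vector<Bits> &in,
                          std::vector<Bits> &out) {
  const std::size_t num_bbs = preds.size();
  assert(gen.size() == num_bbs && kill.size() == num_bbs);
  assert(in.size() == num_bbs && out.size() == num_bbs);

  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 0; b < num_bbs; ++b) {
      const std::size_t num_exprs = in[b].size();
      for (unsigned p : preds[b])
        for (std::size_t e = 0; e < num_exprs; ++e)
          if (out[p][e])
            in[b][e] = true;
      for (std::size_t e = 0; e < num_exprs; ++e) {
        bool v = gen[b][e] || (in[b][e] && !kill[b][e]);
        if (v && !out[b][e]) {
          out[b][e] = true;
          changed = true;
        }
      }
    }
  }
}
```

With one id per (register, block) pair, a set bit in in[b] directly answers
"can the definition of register r made in block d reach block b", which is the
question the AVL checks need to ask.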
 gcc/config/riscv/riscv-vsetvl.cc | 395 +++
 1 file changed, 395 insertions(+)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index dad3d7c941e..27d47d7c039 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2722,6 +2722,401 @@ public:
 };
 
 
+void
+pre_vsetvl::compute_avl_def_data ()
+{
+  if (bitmap_empty_p (m_avl_regs))
+return;
+
+  unsigned num_regs = GP_REG_LAST + 1;
+  unsigned num_bbs = last_basic_block_for_fn (cfun);
+
+  sbitmap *avl_def_loc_temp = sbitmap_vector_alloc (num_bbs, num_regs);
+  for (const bb_info *bb : crtl->ssa->bbs ())
+{
+  bitmap_and (avl_def_loc_temp[bb->index ()], m_avl_regs,
+ m_reg_def_loc[bb->index ()]);
+
+  vsetvl_block_info &block_info = get_block_info (bb);
+  if (block_info.has_info ())
+   {
+ vsetvl_info &footer_info = block_info.get_exit_info ();
+ gcc_assert (footer_info.valid_p ());
+ if (footer_info.has_vl ())
+   bitmap_set_bit (avl_def_loc_temp[bb->index ()],
+   REGNO (footer_info.get_vl ()));
+   }
+}
+
+  if (m_avl_def_in)
+sbitmap_vector_free (m_avl_def_in);
+  if (m_avl_def_out)
+sbitmap_vector_free (m_avl_def_out);
+
+  unsigned num_exprs = num_bbs * num_regs;
+  sbitmap *avl_def_loc = sbitmap_vector_alloc (num_bbs, num_exprs);
+  sbitmap *m_kill = sbitmap_vector_alloc (num_bbs, num_exprs);
+  m_avl_def_in = sbitmap_vector_alloc (num_bbs, num_exprs);
+  m_avl_def_out = sbitmap_vector_alloc (num_bbs, num_exprs);
+
+  bitmap_vector_clear (avl_def_loc, num_bbs);
+  bitmap_vector_clear (m_kill, num_bbs);
+  bitmap_vector_clear (m_avl_def_out, num_bbs);
+
+  unsigned regno;
+  sbitmap_iterator sbi;
+  for (const bb_info *bb : crtl->ssa->bbs ())
+EXECUTE_IF_SET_IN_BITMAP (avl_def_loc_temp[bb->index ()], 0, regno, sbi)
+  {
+   bitmap_set_bit (avl_def_loc[bb->index ()],
+   get_expr_id (bb->index (), regno, num_bbs));
+   bitmap_set_range (m_kill[bb->index ()], regno * num_bbs, num_bbs);
+  }
+
+  basic_block entry = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+  EXECUTE_IF_SET_IN_BITMAP (m_avl_regs, 0, regno, sbi)
+bitmap_set_bit (m_avl_def_out[entry->index],
+   get_expr_id (entry->index, regno, num_bbs));
+
+  compute_reaching_defintion (avl_def_loc, m_kill, m_avl_def_in, m_avl_def_out);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+{
+  fprintf (dump_file,
+  "  Compute avl reaching defition data (num_bbs %d, num_regs "
+  "%d):\n\n",
+  num_bbs, num_regs);
+  fprintf (dump_file, "avl_regs: ");
+  dump_bitmap_file (dump_file, m_avl_regs);
+  fprintf (dump_file, "\nbitmap data:\n");
+  for (const bb_info *bb : crtl->ssa->bbs ())
+   {
+ unsigned int i = bb->index ();
+ fprintf (dump_file, "  BB %u:\n", i);
+ fprintf (dump_file, "avl_def_loc:");
+ unsigned expr_id;
+ sbitmap_iterator sbi;
+ EXECUTE_IF_SET_IN_BITMAP (avl_def_loc[i], 0, expr_id, sbi)
+   {
+ fprintf (dump_file, " (r%u,bb%u)", get_regno (expr_id, num_bbs),
+  get_bb_index (expr_id, num_bbs));
+   }
+ fprintf (dump_file, "\nkill:");
+ EXECUTE_IF_SET_IN_BITMAP (m_kill[i], 0, expr_id, sbi)
+   {
+ fprintf (dump_file, " (r%u,bb%u)", get_regno (expr_id, num_bbs),
+  get_bb_index (expr_id, num_bbs));
+   }
+ fprintf (dump_file, "\navl_def_in:");
+ EXECUTE_IF_SET_IN_BITMAP (m_avl_def_in[i], 0, expr_id, sbi)
+   {
+ fprintf (dump_file, " (r%u,bb%u)", get_regno (expr_id, num_bbs),
+  get_bb_index (expr_id, num_bbs));
+   }
+ fprintf (dump_file, "\navl_def_out:");
+ EXECUTE_IF_SET_IN_BITMAP (m_avl_def_out[i], 0, expr_id, sbi)
+   {
+ fprintf (dump_file, " (r%u,bb%u)", get_regno (expr_id, num_bbs),
+  get_bb_index (expr_id, num_bbs));
+   }
+ fprintf (dump_file, "\n");
+   }
+}
+
+  sbitmap_vector_free (avl_def_loc);
+  sbitmap_vector_free (m_kill);
+  sbitmap_vector_free (avl_def_loc_temp);
+
+  m_dem.set_avl_in_out_data (m_avl_def_in, m_avl_def_out);
+}
+
+void
+pre_vsetvl::compute_vsetvl_def_data ()
+{
+  m_vsetvl_def_exprs.truncate (0);
+  add_expr (m_vsetvl_def_exprs, m_unknow_info);
+  for (const bb_info *bb : crtl->ssa->bbs ())
+{
+  vsetvl_block_info &block_info = get_block_info (bb);
+  if (block_info.empty_p ())
+   continue;
+  vsetvl_info &footer_info = block_info.get_exit_info ();
+  gcc_assert (fo

[PATCH V3 11/11] RISC-V: P11: Adjust and add testcases

2023-10-19 Thread Lehua Ding
PR target/111037
PR target/111234
PR target/111725

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/scalar_move-1.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-23.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-46.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-84.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-89.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-95.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/pr109743-2.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/pr109773-1.c: Adjust.
* gcc.target/riscv/rvv/base/pr111037-1.c: Moved to...
* gcc.target/riscv/rvv/vsetvl/pr111037-1.c: ...here.
* gcc.target/riscv/rvv/base/pr111037-2.c: Moved to...
* gcc.target/riscv/rvv/vsetvl/pr111037-2.c: ...here.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-12.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vsetvl-13.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vsetvl-18.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-104.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-105.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-106.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-107.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-108.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-109.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr111037-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr111037-4.c: New test.

---
 .../gcc.target/riscv/rvv/base/scalar_move-1.c |  2 +-
 .../riscv/rvv/vsetvl/avl_single-104.c | 35 +++
 .../riscv/rvv/vsetvl/avl_single-105.c | 23 ++
 .../riscv/rvv/vsetvl/avl_single-106.c | 34 ++
 .../riscv/rvv/vsetvl/avl_single-107.c | 41 +
 .../riscv/rvv/vsetvl/avl_single-108.c | 41 +
 .../riscv/rvv/vsetvl/avl_single-109.c | 45 +++
 .../riscv/rvv/vsetvl/avl_single-23.c  |  7 +--
 .../riscv/rvv/vsetvl/avl_single-46.c  |  3 +-
 .../riscv/rvv/vsetvl/avl_single-84.c  |  5 +--
 .../riscv/rvv/vsetvl/avl_single-89.c  |  8 ++--
 .../riscv/rvv/vsetvl/avl_single-95.c  |  2 +-
 .../riscv/rvv/vsetvl/imm_bb_prop-1.c  |  7 +--
 .../gcc.target/riscv/rvv/vsetvl/pr109743-2.c  |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/pr109773-1.c  |  2 +-
 .../riscv/rvv/{base => vsetvl}/pr111037-1.c   |  0
 .../riscv/rvv/{base => vsetvl}/pr111037-2.c   |  0
 .../gcc.target/riscv/rvv/vsetvl/pr111037-3.c  | 16 +++
 .../gcc.target/riscv/rvv/vsetvl/pr111037-4.c  | 16 +++
 .../riscv/rvv/vsetvl/vlmax_back_prop-25.c | 10 ++---
 .../riscv/rvv/vsetvl/vlmax_back_prop-26.c | 10 ++---
 .../riscv/rvv/vsetvl/vlmax_conflict-12.c  |  1 -
 .../riscv/rvv/vsetvl/vlmax_conflict-3.c   |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-13.c   |  4 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-18.c   |  4 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-23.c   |  2 +-
 26 files changed, 288 insertions(+), 34 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-104.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-105.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-106.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-107.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-108.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-109.c
 rename gcc/testsuite/gcc.target/riscv/rvv/{base => vsetvl}/pr111037-1.c (100%)
 rename gcc/testsuite/gcc.target/riscv/rvv/{base => vsetvl}/pr111037-2.c (100%)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-4.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-1.c
index 18349132a88..c833d8989e9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-1.c
@@ -46,8 +46,8 @@ int32_t foo3 (int32_t *base, size_t vl)
 ** vl1re32\.v\tv[0-9]+,0\([a-x0-9]+\)
 ** vsetvli\tzero,[a-x0-9]+,e32,m1,t[au],m[au]
 ** vadd.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
-** vsetvli\tzero,[a-x0-9]+,e32,m2,t[au],m[au]
 ** vmv.x.s\t[a-x0-9]+,\s*v[0-9]+
+** vsetvli\tzero,[a-x0-9]+,e32,m2,t[au],m[au]
 ** vmv.v.x\tv[0-9]+,\s*[a-x0-9]+
 ** vmv.x.s\t[a-x0-9]+,\s*v[0-9]+
 ** ret
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-104.c 
b/gc

[PATCH V3 10/11] RISC-V: P10: Delete riscv-vsetvl.h and adjust riscv-vsetvl.def

2023-10-19 Thread Lehua Ding
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.def (DEF_INCOMPATIBLE_COND): Removed.
(DEF_SEW_LMUL_RULE): New.
(DEF_SEW_LMUL_FUSE_RULE): Removed.
(DEF_POLICY_RULE): New.
(DEF_UNAVAILABLE_COND): Removed.
(DEF_AVL_RULE): New.
(sew_lmul): New.
(ratio_only): New.
(sew_only): New.
(ge_sew): New.
(ratio_and_ge_sew): New.
(tail_mask_policy): New.
(tail_policy_only): New.
(mask_policy_only): New.
(ignore_policy): New.
(avl): New.
(non_zero_avl): New.
(ignore_avl): New.
* config/riscv/t-riscv: Removed.
* config/riscv/riscv-vsetvl.h: Removed.

---
 gcc/config/riscv/riscv-vsetvl.def | 641 +++---
 gcc/config/riscv/riscv-vsetvl.h   | 488 ---
 gcc/config/riscv/t-riscv  |   2 +-
 3 files changed, 155 insertions(+), 976 deletions(-)
 delete mode 100644 gcc/config/riscv/riscv-vsetvl.h

diff --git a/gcc/config/riscv/riscv-vsetvl.def 
b/gcc/config/riscv/riscv-vsetvl.def
index 709cc4ee0df..401d2c6f421 100644
--- a/gcc/config/riscv/riscv-vsetvl.def
+++ b/gcc/config/riscv/riscv-vsetvl.def
@@ -18,496 +18,163 @@ You should have received a copy of the GNU General Public 
License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
-#ifndef DEF_INCOMPATIBLE_COND
-#define DEF_INCOMPATIBLE_COND(AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, 
\
- GE_SEW1, TAIL_POLICTY1, MASK_POLICY1, AVL2,  \
- SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2,  \
- TAIL_POLICTY2, MASK_POLICY2, COND)
+/* DEF_XXX_RULE (prev_demand, next_demand, fused_demand, compatible_p,
+   available_p, fuse)
+   prev_demand: the prev vector insn's sew_lmul_type
+   next_demand: the next vector insn's sew_lmul_type
+   fused_demand: if them are compatible, change prev_info demand to the
+fused_demand after fuse prev_info and next_info
+   compatible_p: check if prev_demand and next_demand are compatible
+   available_p: check if prev_demand is available for next_demand
+   fuse: if them are compatible, how to modify prev_info  */
+
+#ifndef DEF_SEW_LMUL_RULE
+#define DEF_SEW_LMUL_RULE(prev_demand, next_demand, fused_demand,  
\
+ compatible_p, available_p, fuse)
 #endif
 
-#ifndef DEF_SEW_LMUL_FUSE_RULE
-#define DEF_SEW_LMUL_FUSE_RULE(DEMAND_SEW1, DEMAND_LMUL1, DEMAND_RATIO1,   
\
-  DEMAND_GE_SEW1, DEMAND_SEW2, DEMAND_LMUL2,  \
-  DEMAND_RATIO2, DEMAND_GE_SEW2, NEW_DEMAND_SEW,  \
-  NEW_DEMAND_LMUL, NEW_DEMAND_RATIO,  \
-  NEW_DEMAND_GE_SEW, NEW_SEW, NEW_VLMUL,  \
-  NEW_RATIO)
+#ifndef DEF_POLICY_RULE
+#define DEF_POLICY_RULE(prev_demand, next_demand, fused_demand, compatible_p,  
\
+   available_p, fuse)
 #endif
 
-#ifndef DEF_UNAVAILABLE_COND
-#define DEF_UNAVAILABLE_COND(AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, 
\
-TAIL_POLICTY1, MASK_POLICY1, AVL2, SEW2, LMUL2,   \
-RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \
-MASK_POLICY2, COND)
+#ifndef DEF_AVL_RULE
+#define DEF_AVL_RULE(prev_demand, next_demand, fused_demand, compatible_p, 
\
+available_p, fuse)
 #endif
 
-/* Case 1: Demand compatible AVL.  */
-DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY,
-  /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY,
-  /*NONZERO_AVL*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_ANY,
-  /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY,
-  /*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY,
-  /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY,
-  /*NONZERO_AVL*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_ANY,
-  /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY,
-  /*COND*/ incompatible_avl_p)
-
-/* Case 2: Demand same SEW.  */
-DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE,
-  /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY,
-  /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE,
-  /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY,
-  /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE,
-  /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY,
-  /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE,
-  /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY,
-  /*COND*/ different_sew_p)
-
-/* Case 3: Demand same LMUL.  */
-DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_A

Re: [PATCH V2 00/14] Refactor and cleanup vsetvl pass

2023-10-19 Thread Lehua Ding
_single-48.c   -O2 
scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-48.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-48.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O1 
scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O2 
scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-49.c   -Os 
scan-assembler-times vsetvli 2
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-65.c   -O2 
scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-65.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-65.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O0 scan-assembler-not 
vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O1 scan-assembler-not 
vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O2 scan-assembler-not 
vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -O3 -g 
scan-assembler-not vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/pr111037-2.c   -Os scan-assembler-not 
vsetvli
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-10.c   -O3 -g 
scan-assembler-times vsetvli 15
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-11.c   -O3 -g 
scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-12.c   -O3 -g 
scan-assembler-times vsetvli 9
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-13.c   -O3 -g 
scan-assembler-times vsetvli 9
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-14.c   -O3 -g 
scan-assembler-times vsetvli 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-15.c   -O3 -g 
scan-assembler-times vsetvli 4
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-16.c   -O3 -g 
scan-assembler-times vsetvli 15
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-18.c   -O3 -g 
scan-assembler-times vsetvli 3
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-19.c   -O3 -g 
scan-assembler-times vsetvli 15
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_single_block-9.c   -O3 -g 
scan-assembler-times vsetvli 15


On 10/17/23 04:34, Lehua Ding wrote:
This patch refactors and cleans up the vsetvl pass in order to make the code
easier to modify and understand. This patch does several things:

1. Introduce a virtual CFG for vsetvl infos. Phases 1, 2 and 3 only maintain
    and modify this virtual CFG. Phase 4 performs insertion, modification and
    deletion of vsetvl insns based on the virtual CFG. A basic block in the
    virtual CFG is called vsetvl_block_info and the vsetvl information inside
    it is called vsetvl_info.
2. Combine Phase 1 and 2 into a single Phase 1 and unify the demand system;
    this phase only fuses local vsetvl infos in the forward direction.
3. Refactor Phase 3, changing the logic for determining whether to uplift a
    vsetvl info to a pred basic block to a more unified check: whether the
    reaching vsetvl definitions (the reaching-in set) contain a vsetvl info
    compatible with it.
4. Place all modification operations to the RTL in Phase 4 and Phase 5.
    Phase 4 is responsible for inserting, modifying and deleting vsetvl
    instructions based on fully optimized vsetvl infos. Phase 5 removes the avl
    operand from the RVV instruction and removes the unused dest operand
    register from the vsetvl insns.

These modifications required some testcases to be updated. The reasons
for updating are summarized below:

1. more optimized
    vlmax_back_prop-25.c/vlmax_back_prop-26.c/vlmax_conflict-3.c/
    vlmax_conflict-12.c/vsetvl-13.c/vsetvl-23.c/
    avl_single-23.c/avl_single-89.c/avl_single-95.c/pr109773-1.c
2. less unnecessary fusion
    avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/vsetvl-18.c
3. local fuse direction (backward -> forward)
    scalar_move-1.c/
4. add some bugfix testcases.
    pr111037-3.c/pr111037-4.c
    avl_single-89.c

PR target/111037
PR target/111234
PR target/111725


Lehua Ding (14):
   RISC-V: P1: Refactor avl_info/vl_vtype_info/vector_insn_info
   RISC-V: P2: Refactor and cleanup demand system
   RISC-V: P3: Refactor class vector_infos_manager to pre_vsetvl
   RISC-V: P4: move method from class pass_

Re: [PATCH V3 00/11] Refactor and cleanup vsetvl pass

2023-10-19 Thread Lehua Ding

Okay, thanks anyway.

On 2023/10/19 16:38, Robin Dapp wrote:

Hi Lehua,

thanks for the extensive rework.  I'm going to let Juzhe handle the review
since it's his pass and he knows it best.  Delegated it to him in patchwork.

Regards
  Robin



--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai


[PATCH] Add more ForEachMacros to clang-format file

2023-06-02 Thread Lehua Ding
Hi,

This patch adds some missing ForEachMacros entries to the contrib/clang-format
file, which allows the clang-format tool to format GCC code correctly.

Best,
Lehua

---
 contrib/clang-format | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/contrib/clang-format b/contrib/clang-format
index 5d264aee3c6..8cfee99cd15 100644
--- a/contrib/clang-format
+++ b/contrib/clang-format
@@ -194,7 +194,15 @@ ForEachMacros: [
 'FOR_EACH_WIDER_MODE',
 'FOR_EXPR',
 'FOR_INIT_STMT',
-'FOR_SCOPE'
+'FOR_SCOPE',
+'EXECUTE_IF_SET_IN_BITMAP',
+'EXECUTE_IF_AND_IN_BITMAP',
+'EXECUTE_IF_AND_COMPL_IN_BITMAP',
+'EXECUTE_IF_SET_IN_REG_SET',
+'EXECUTE_IF_SET_IN_HARD_REG_SET',
+'EXECUTE_IF_AND_COMPL_IN_REG_SET',
+'EXECUTE_IF_AND_IN_REG_SET',
+'EXECUTE_IF_SET_IN_SPARSESET'
 ]
 IndentCaseLabels: false
 NamespaceIndentation: None
-- 
2.36.1



[PATCH] testsuite: fix the condition bug in tsvc s176

2023-06-08 Thread Lehua Ding
Hi,

This patch fixes a problem in the tsvc s176 function: its outer loop is
optimized away because `iterations/LEN_1D` evaluates to 0 under integer
division (where iterations is set to 1, LEN_1D is set to 32000 in tsvc.h).

This testcase passes on x86 and AArch64 systems.
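
To illustrate the truncation described above, here is a minimal sketch in C.
The helper outer_trip_count is hypothetical (it is not part of tsvc), and the
iteration count passed to it in the usage note is an assumption chosen for
illustration rather than a value quoted from tsvc.h.

```c
#include <assert.h>

/* Hypothetical helper, not part of tsvc: the trip count of an outer loop
   written as `nl < 4 * (scale * iterations / len_1d)`, using C integer
   division, which truncates toward zero.  */
static int
outer_trip_count (int scale, int iterations, int len_1d)
{
  assert (len_1d > 0);
  return 4 * (scale * iterations / len_1d);
}
```

With an assumed iteration count of 10000 and a length of 32000,
outer_trip_count (1, 10000, 32000) is 0, so the loop body never runs, while
outer_trip_count (10, 10000, 32000) is 12.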

Best,
Lehua

gcc/testsuite/ChangeLog:

* gcc.dg/vect/tsvc/vect-tsvc-s176.c: adjust iterations

---
 gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c 
b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c
index 79faf7fdb9e4..365e5205982b 100644
--- a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c
+++ b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c
@@ -14,7 +14,7 @@ real_t s176(struct args_t * func_args)
 initialise_arrays(__func__);
 
 int m = LEN_1D/2;
-for (int nl = 0; nl < 4*(iterations/LEN_1D); nl++) {
+for (int nl = 0; nl < 4*(10*iterations/LEN_1D); nl++) {
 for (int j = 0; j < (LEN_1D/2); j++) {
 for (int i = 0; i < m; i++) {
 a[i] += b[i+m-j-1] * c[j];
@@ -39,4 +39,4 @@ int main (int argc, char **argv)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail *-*-* } } } 
*/
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
-- 
2.36.1



Re: [PATCH] testsuite: fix the condition bug in tsvc s176

2023-06-09 Thread Lehua Ding
> It's odd that the checksum doesn't depend on the number of iterations done ...

This is because the difference between the calculated result (32063.902344) and
the expected result (32000.00) is small. The current check considers the result
correct as long as the `value/expected` ratio is between 0.99f and 1.01f. I'm
not sure whether this check is strict enough, but I should also update the
expected result to 32063.902344 (the same value as without vectorization).
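
A rough sketch of the kind of ratio check described above, in C. The function
name result_within_tolerance is hypothetical and the bounds are treated as
exclusive, which is an assumption; the real check in the tsvc harness may be
written differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the harness check: accept the result when
   value / expected lies strictly between 0.99 and 1.01.  */
static bool
result_within_tolerance (float value, float expected)
{
  assert (expected != 0.0f);
  float ratio = value / expected;
  return ratio > 0.99f && ratio < 1.01f;
}
```

Under this sketch 32063.902344 against an expectation of 32000 gives a ratio
of about 1.002, which is why the old expected value still passed.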

Best,
Lehua

gcc/testsuite/ChangeLog:

* gcc.dg/vect/tsvc/tsvc.h:
* gcc.dg/vect/tsvc/vect-tsvc-s176.c:

---
 gcc/testsuite/gcc.dg/vect/tsvc/tsvc.h   | 2 +-
 gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/tsvc/tsvc.h 
b/gcc/testsuite/gcc.dg/vect/tsvc/tsvc.h
index cd39c041903d..d910c384fc83 100644
--- a/gcc/testsuite/gcc.dg/vect/tsvc/tsvc.h
+++ b/gcc/testsuite/gcc.dg/vect/tsvc/tsvc.h
@@ -1164,7 +1164,7 @@ real_t get_expected_result(const char * name)
 } else if (!strcmp(name, "s175")) {
return 32009.023438f;
 } else if (!strcmp(name, "s176")) {
-   return 32000.f;
+   return 32063.902344f;
 } else if (!strcmp(name, "s211")) {
return 63983.308594f;
 } else if (!strcmp(name, "s212")) {
diff --git a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c 
b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c
index 79faf7fdb9e4..365e5205982b 100644
--- a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c
+++ b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c
@@ -14,7 +14,7 @@ real_t s176(struct args_t * func_args)
 initialise_arrays(__func__);
 
 int m = LEN_1D/2;
-for (int nl = 0; nl < 4*(iterations/LEN_1D); nl++) {
+for (int nl = 0; nl < 4*(10*iterations/LEN_1D); nl++) {
 for (int j = 0; j < (LEN_1D/2); j++) {
 for (int i = 0; i < m; i++) {
 a[i] += b[i+m-j-1] * c[j];
@@ -39,4 +39,4 @@ int main (int argc, char **argv)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail *-*-* } } } 
*/
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
-- 
2.36.1



Re: [PATCH] testsuite: fix the condition bug in tsvc s176

2023-06-09 Thread Lehua Ding
> I stitched together appropriate ChangeLog entries and pushed this to the
> trunk (I don't think Lehua has write access).

Thank you!


Best,
Lehua

[PATCH] RISC-V: Remove duplicate `#include "riscv-vector-switch.def"`

2023-06-13 Thread Lehua Ding
Hi,

This patch removes the duplicate `#include "riscv-vector-switch.def"` statement
and adds #undef for the ENTRY and TUPLE_ENTRY macros after the remaining include.
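
For readers unfamiliar with the pattern, here is a minimal self-contained
sketch in C of a list expanded once with both macros defined, followed by
#undef. A list macro stands in for the .def file so the example fits in one
file; all EX_/example_ names and the ratio values are made up for
illustration and do not come from riscv-v.cc.

```c
#include <assert.h>

/* Stand-in for a .def file: a list of records that only makes sense
   once ENTRY and TUPLE_ENTRY are defined.  Values are made up.  */
#define EXAMPLE_MODE_LIST \
  ENTRY (EX_MODE_A, 8) \
  ENTRY (EX_MODE_B, 4) \
  TUPLE_ENTRY (EX_MODE_C, 2, 16)

enum example_mode { EX_MODE_A, EX_MODE_B, EX_MODE_C, EX_NUM_MODES };

static int example_ratio[EX_NUM_MODES];

/* Fill the table by expanding the list exactly once with both macros
   defined up front, then undefine them so they cannot leak into later
   code.  */
static void
init_example_ratio (void)
{
#define ENTRY(MODE, RATIO) example_ratio[MODE] = RATIO;
#define TUPLE_ENTRY(MODE, NF, RATIO) example_ratio[MODE] = RATIO;
  EXAMPLE_MODE_LIST
#undef ENTRY
#undef TUPLE_ENTRY
}

/* Read one table slot, checking that the mode is in range.  */
static int
get_example_ratio (enum example_mode mode)
{
  assert (mode < EX_NUM_MODES);
  return example_ratio[mode];
}
```

Expanding the list a second time would only repeat the same assignments,
which is why a duplicate include of this kind is harmless but redundant.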

Best,
Lehua

---
 gcc/config/riscv/riscv-v.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index e1b85a5af91f..09c2abcbc623 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1210,7 +1210,6 @@ struct mode_vtype_group
   ratio_for_min_vlen64[MODE##mode] = RATIO_FOR_MIN_VLEN64; 
\
   vlmul_for_for_vlen128[MODE##mode] = VLMUL_FOR_MIN_VLEN128;   
\
   ratio_for_for_vlen128[MODE##mode] = RATIO_FOR_MIN_VLEN128;
-#include "riscv-vector-switch.def"
 #define TUPLE_ENTRY(MODE, REQUIREMENT, SUBPART_MODE, NF, VLMUL_FOR_MIN_VLEN32, 
\
RATIO_FOR_MIN_VLEN32, VLMUL_FOR_MIN_VLEN64,\
RATIO_FOR_MIN_VLEN64, VLMUL_FOR_MIN_VLEN128,   \
@@ -1224,6 +1223,8 @@ struct mode_vtype_group
   vlmul_for_for_vlen128[MODE##mode] = VLMUL_FOR_MIN_VLEN128;   
\
   ratio_for_for_vlen128[MODE##mode] = RATIO_FOR_MIN_VLEN128;
 #include "riscv-vector-switch.def"
+#undef ENTRY
+#undef TUPLE_ENTRY
   }
 };
 
-- 
2.36.3



[PATCH V2] RISC-V: Remove duplicate `#include "riscv-vector-switch.def"`

2023-06-13 Thread Lehua Ding
Hi,

This patch removes the duplicate `#include "riscv-vector-switch.def"` statement
and adds #undef for the ENTRY and TUPLE_ENTRY macros after the remaining include.

Best,
Lehua

gcc/ChangeLog:

* config/riscv/riscv-v.cc (struct mode_vtype_group): Remove duplicate 
#include.
(ENTRY): Undef.
(TUPLE_ENTRY): Undef.

---
 gcc/config/riscv/riscv-v.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index e1b85a5af91f..09c2abcbc623 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1210,7 +1210,6 @@ struct mode_vtype_group
   ratio_for_min_vlen64[MODE##mode] = RATIO_FOR_MIN_VLEN64; 
\
   vlmul_for_for_vlen128[MODE##mode] = VLMUL_FOR_MIN_VLEN128;   
\
   ratio_for_for_vlen128[MODE##mode] = RATIO_FOR_MIN_VLEN128;
-#include "riscv-vector-switch.def"
 #define TUPLE_ENTRY(MODE, REQUIREMENT, SUBPART_MODE, NF, VLMUL_FOR_MIN_VLEN32, 
\
RATIO_FOR_MIN_VLEN32, VLMUL_FOR_MIN_VLEN64,\
RATIO_FOR_MIN_VLEN64, VLMUL_FOR_MIN_VLEN128,   \
@@ -1224,6 +1223,8 @@ struct mode_vtype_group
   vlmul_for_for_vlen128[MODE##mode] = VLMUL_FOR_MIN_VLEN128;   
\
   ratio_for_for_vlen128[MODE##mode] = RATIO_FOR_MIN_VLEN128;
 #include "riscv-vector-switch.def"
+#undef ENTRY
+#undef TUPLE_ENTRY
   }
 };
 
-- 
2.36.3



Re: [PATCH V2] RISC-V: Remove duplicate `#include "riscv-vector-switch.def"`

2023-06-13 Thread Lehua Ding
> LGTM. 
> Thanks.
> Will merge it soon.


Thank you for such a prompt reply.
 
  

[PATCH] RISC-V: Fix PR 110119

2023-06-14 Thread Lehua Ding
Hi,

This patch fixes PR 110119.

The bug occurs when the vector register is set to a fixed length (with the
`--param=riscv-autovec-preference=fixed-vlmax` option):
TARGET_PASS_BY_REFERENCE thinks that variables of type vint32m1 can be passed
through two scalar registers, but when GCC calls FUNCTION_VALUE (which calls
riscv_get_arg_info internally) it returns NULL_RTX. These two functions are
inconsistent. The current treatment is to pass all vector arguments and return
values through the function stack; a new calling convention for vector
registers will be added in the future.
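
The mismatch can be pictured with a toy model in C. Everything below is a
simplified sketch, not the code in riscv.cc: the function names, the
8-byte word size and the 16-byte vector size are assumptions made for
illustration, and the model treats "passed by value" as requiring a register
assignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum arg_class { ARG_SCALAR, ARG_VECTOR };

#define EXAMPLE_WORD_BYTES 8 /* Assumed 64-bit scalar registers.  */

/* Toy model of the register-assignment side: vector modes get no
   registers, everything else fits if it is at most two words.  */
static bool
passed_in_regs_p (enum arg_class cls, size_t size)
{
  assert (size > 0);
  if (cls == ARG_VECTOR)
    return false;
  return size <= 2 * EXAMPLE_WORD_BYTES;
}

/* Toy model of the by-reference decision before the fix: size only.  */
static bool
pass_by_reference_before (enum arg_class cls, size_t size)
{
  (void) cls;
  return size > 2 * EXAMPLE_WORD_BYTES;
}

/* Toy model after the fix: vector modes always go by reference.  */
static bool
pass_by_reference_after (enum arg_class cls, size_t size)
{
  if (cls == ARG_VECTOR)
    return true;
  return size > 2 * EXAMPLE_WORD_BYTES;
}

/* In this model the two decisions agree when an argument that is not
   passed by reference has registers assigned to it.  */
static bool
hooks_agree (bool by_reference, bool in_regs)
{
  return by_reference || in_regs;
}
```

For an assumed 16-byte vector argument, the "before" model says by value
while no registers are assigned, so the two sides disagree; the "after"
model resolves that by always answering by reference for vector modes.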

Best,
Lehua

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_get_arg_info): Return NULL_RTX for 
vector mode
(riscv_pass_by_reference): Return true for vector mode

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/p110119-1.c: New test.
* gcc.target/riscv/rvv/base/p110119-2.c: New test.

---
 gcc/config/riscv/riscv.cc | 19 -
 .../gcc.target/riscv/rvv/base/p110119-1.c | 27 +++
 .../gcc.target/riscv/rvv/base/p110119-2.c | 27 +++
 3 files changed, 67 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/p110119-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/p110119-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index dd5361c2bd2a..be868c7b6127 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3915,13 +3915,13 @@ riscv_get_arg_info (struct riscv_arg_info *info, const 
CUMULATIVE_ARGS *cum,
   riscv_pass_in_vector_p (type);
 }
 
-  /* TODO: Currently, it will cause an ICE for --param
- riscv-autovec-preference=fixed-vlmax. So, we just return NULL_RTX here
- let GCC generate loads/stores. Ideally, we should either warn the user not
- to use an RVV vector type as function argument or support the calling
- convention directly.  */
-  if (riscv_v_ext_mode_p (mode))
+  /* All current vector arguments and return values are passed through the
+ function stack. Ideally, we should either warn the user not to use an RVV
+ vector type as function argument or support a calling convention
+ with better performance.  */
+  if (riscv_v_ext_mode_p (mode) || riscv_v_ext_tuple_mode_p (mode))
 return NULL_RTX;
+
   if (named)
 {
   riscv_aggregate_field fields[2];
@@ -4106,6 +4106,13 @@ riscv_pass_by_reference (cumulative_args_t cum_v, const 
function_arg_info &arg)
return false;
 }
 
+  /* All current vector arguments and return values are passed through the
+ function stack. Ideally, we should either warn the user not to use an RVV
+ vector type as function argument or support a calling convention
+ with better performance.  */
+  if (riscv_v_ext_mode_p (arg.mode) || riscv_v_ext_tuple_mode_p (arg.mode))
+return true;
+
   /* Pass by reference if the data do not fit in two integer registers.  */
   return !IN_RANGE (size, 0, 2 * UNITS_PER_WORD);
 }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-1.c
new file mode 100644
index ..3583e06f1a8d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv --param=riscv-autovec-preference=fixed-vlmax" 
} */
+/* { dg-skip-if "test rvv intrinsic" { *-*-* } { "*" } { "-march=rv*v*" } } */
+
+#include "riscv_vector.h"
+
+typedef int8_t vnx2qi __attribute__ ((vector_size (2)));
+
+__attribute__ ((noipa)) vnx2qi
+f_vnx2qi (int8_t a, int8_t b, int8_t *out)
+{
+  vnx2qi v = {a, b};
+  return v;
+}
+
+__attribute__ ((noipa)) vnx2qi
+f_vnx2qi_2 (vnx2qi a, int8_t *out)
+{
+  return a;
+}
+
+__attribute__ ((noipa)) vint32m1_t
+f_vint32m1 (int8_t * a, int8_t *out)
+{
+  vint32m1_t v = *(vint32m1_t*)a;
+  return v;
+}
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-2.c
new file mode 100644
index ..1d12a610b677
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gczve32x 
--param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-skip-if "test rvv intrinsic" { *-*-* } { "*" } { "-march=rv*v*" } } */
+
+#include 
+#include "riscv_vector.h"
+
+__attribute__ ((noipa)) vint32m1x3_t
+foo1 (int32_t *in, int vl)
+{
+  vint32m1x3_t v = __riscv_vlseg3e32_v_i32m1x3 (in, vl);
+  return v;
+}
+
+__attribute__ ((noipa)) void
+foo2 (vint32m1x3_t a, int32_t *out, int vl)
+{
+  __riscv_vsseg3e32_v_i32m1x3 (out, a, vl);
+}
+
+__attribute__ ((noipa)) vint32m1x3_t
+foo3 (vint32m1x3_t a, int32_t *out, int32_t *in, int vl)
+{
+  __riscv_vsseg3e32_v_i32m1x3 (out, a, vl);
+  vint32m1x3_t v = __riscv_vlseg3e32_v_i32m1x3 (in, vl);
+  return v;
+}
-- 
2.36.3



[PATCH] RISC-V: Ensure vector args and return use function stack to pass [PR110119]

2023-06-14 Thread Lehua Ding
Hi,

The bug occurs when the vector register is set to a fixed length (with the
`--param=riscv-autovec-preference=fixed-vlmax` option):
TARGET_PASS_BY_REFERENCE thinks that variables of type vint32m1 can be passed
through two scalar registers, but when GCC calls FUNCTION_VALUE (which calls
riscv_get_arg_info internally) it returns NULL_RTX. These two functions are
inconsistent. The current treatment is to pass all vector arguments and return
values through the function stack; a new calling convention for vector
registers will be added in the future.

Best,
Lehua

  PR target/110119

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_get_arg_info): Return NULL_RTX for 
vector mode
(riscv_pass_by_reference): Return true for vector mode

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/p110119-1.c: New test.
* gcc.target/riscv/rvv/base/p110119-2.c: New test.

---
 gcc/config/riscv/riscv.cc | 19 +-
 .../gcc.target/riscv/rvv/base/p110119-1.c | 26 +++
 .../gcc.target/riscv/rvv/base/p110119-2.c | 26 +++
 3 files changed, 65 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/p110119-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/p110119-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index dd5361c2bd2a..be868c7b6127 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3915,13 +3915,13 @@ riscv_get_arg_info (struct riscv_arg_info *info, const 
CUMULATIVE_ARGS *cum,
   riscv_pass_in_vector_p (type);
 }
 
-  /* TODO: Currently, it will cause an ICE for --param
- riscv-autovec-preference=fixed-vlmax. So, we just return NULL_RTX here
- let GCC generate loads/stores. Ideally, we should either warn the user not
- to use an RVV vector type as function argument or support the calling
- convention directly.  */
-  if (riscv_v_ext_mode_p (mode))
+  /* All current vector arguments and return values are passed through the
+ function stack. Ideally, we should either warn the user not to use an RVV
+ vector type as function argument or support a calling convention
+ with better performance.  */
+  if (riscv_v_ext_mode_p (mode) || riscv_v_ext_tuple_mode_p (mode))
 return NULL_RTX;
+
   if (named)
 {
   riscv_aggregate_field fields[2];
@@ -4106,6 +4106,13 @@ riscv_pass_by_reference (cumulative_args_t cum_v, const 
function_arg_info &arg)
return false;
 }
 
+  /* All current vector arguments and return values are passed through the
+ function stack. Ideally, we should either warn the user not to use an RVV
+ vector type as function argument or support a calling convention
+ with better performance.  */
+  if (riscv_v_ext_mode_p (arg.mode) || riscv_v_ext_tuple_mode_p (arg.mode))
+return true;
+
   /* Pass by reference if the data do not fit in two integer registers.  */
   return !IN_RANGE (size, 0, 2 * UNITS_PER_WORD);
 }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-1.c
new file mode 100644
index ..0edbb0626299
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv --param=riscv-autovec-preference=fixed-vlmax" 
} */
+
+#include "riscv_vector.h"
+
+typedef int8_t vnx2qi __attribute__ ((vector_size (2)));
+
+__attribute__ ((noipa)) vnx2qi
+f_vnx2qi (int8_t a, int8_t b, int8_t *out)
+{
+  vnx2qi v = {a, b};
+  return v;
+}
+
+__attribute__ ((noipa)) vnx2qi
+f_vnx2qi_2 (vnx2qi a, int8_t *out)
+{
+  return a;
+}
+
+__attribute__ ((noipa)) vint32m1_t
+f_vint32m1 (int8_t * a, int8_t *out)
+{
+  vint32m1_t v = *(vint32m1_t*)a;
+  return v;
+}
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-2.c
new file mode 100644
index ..b233ff1e9040
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/p110119-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gczve32x 
--param=riscv-autovec-preference=fixed-vlmax" } */
+
+#include 
+#include "riscv_vector.h"
+
+__attribute__ ((noipa)) vint32m1x3_t
+foo1 (int32_t *in, int vl)
+{
+  vint32m1x3_t v = __riscv_vlseg3e32_v_i32m1x3 (in, vl);
+  return v;
+}
+
+__attribute__ ((noipa)) void
+foo2 (vint32m1x3_t a, int32_t *out, int vl)
+{
+  __riscv_vsseg3e32_v_i32m1x3 (out, a, vl);
+}
+
+__attribute__ ((noipa)) vint32m1x3_t
+foo3 (vint32m1x3_t a, int32_t *out, int32_t *in, int vl)
+{
+  __riscv_vsseg3e32_v_i32m1x3 (out, a, vl);
+  vint32m1x3_t v = __riscv_vlseg3e32_v_i32m1x3 (in, vl);
+  return v;
+}
-- 
2.36.3



Re: [PATCH] RISC-V: Fix PR 110119

2023-06-14 Thread Lehua Ding
Resubmitted a new, more standardized patch (below is the new patch link),
thanks.


https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621683.html


  

Re: [PATCH] RISC-V: Ensure vector args and return use function stack to pass [PR110119]

2023-06-14 Thread Lehua Ding
Fixed all comments from Juzhe, thanks. Below is the new patch. Please use the
attachment if there is a problem with the format of the patch below.



PR 110119



gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_get_arg_info): Return NULL_RTX for 
vector mode
(riscv_pass_by_reference): Return true for vector mode




gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr110119-1.c: New test.
* gcc.target/riscv/rvv/base/pr110119-2.c: New test.



---
 gcc/config/riscv/riscv.cc | 17 
 .../gcc.target/riscv/rvv/base/pr110119-1.c| 26 +++
 .../gcc.target/riscv/rvv/base/pr110119-2.c| 26 +++
 3 files changed, 64 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index dd5361c2bd2a..e5ae4e81b7a5 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3915,13 +3915,13 @@ riscv_get_arg_info (struct riscv_arg_info *info, const 
CUMULATIVE_ARGS *cum,
   riscv_pass_in_vector_p (type);
 }
 
-  /* TODO: Currently, it will cause an ICE for --param
- riscv-autovec-preference=fixed-vlmax. So, we just return NULL_RTX here
- let GCC generate loads/stores. Ideally, we should either warn the user not
- to use an RVV vector type as function argument or support the calling
- convention directly.  */
+  /* All current vector arguments and return values are passed through the
+ function stack. Ideally, we should either warn the user not to use an RVV
+ vector type as function argument or support a calling convention
+ with better performance.  */
   if (riscv_v_ext_mode_p (mode))
 return NULL_RTX;
+
   if (named)
 {
   riscv_aggregate_field fields[2];
@@ -4106,6 +4106,13 @@ riscv_pass_by_reference (cumulative_args_t cum_v, const 
function_arg_info &arg)
return false;
 }
 
+  /* All current vector arguments and return values are passed through the
+ function stack. Ideally, we should either warn the user not to use an RVV
+ vector type as function argument or support a calling convention
+ with better performance.  */
+  if (riscv_v_ext_mode_p (arg.mode))
+return true;
+
   /* Pass by reference if the data do not fit in two integer registers.  */
   return !IN_RANGE (size, 0, 2 * UNITS_PER_WORD);
 }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-1.c
new file mode 100644
index ..0edbb0626299
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv --param=riscv-autovec-preference=fixed-vlmax" 
} */
+
+#include "riscv_vector.h"
+
+typedef int8_t vnx2qi __attribute__ ((vector_size (2)));
+
+__attribute__ ((noipa)) vnx2qi
+f_vnx2qi (int8_t a, int8_t b, int8_t *out)
+{
+  vnx2qi v = {a, b};
+  return v;
+}
+
+__attribute__ ((noipa)) vnx2qi
+f_vnx2qi_2 (vnx2qi a, int8_t *out)
+{
+  return a;
+}
+
+__attribute__ ((noipa)) vint32m1_t
+f_vint32m1 (int8_t * a, int8_t *out)
+{
+  vint32m1_t v = *(vint32m1_t*)a;
+  return v;
+}
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-2.c
new file mode 100644
index ..b233ff1e9040
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gczve32x 
--param=riscv-autovec-preference=fixed-vlmax" } */
+
+#include 

0001-RISC-V-Ensure-vector-args-and-return-use-function-st.patch
Description: Binary data


Re: [PATCH] RISC-V: Ensure vector args and return use function stack to pass [PR110119]

2023-06-14 Thread Lehua Ding
> so this is intended to fix the PR as well as unblock while we continue
> with the preliminary ABI separately?


Yes, and I will send the new prerelease vector calling convention later.


Best,
Lehua

[PATCH V2] RISC-V: Ensure vector args and return use function stack to pass [PR110119]

2023-06-14 Thread Lehua Ding
The V2 patch addresses the comments from Juzhe, thanks.

Hi,
 
The cause of this bug is that when the vector register length is fixed (with the
`--param=riscv-autovec-preference=fixed-vlmax` option), TARGET_PASS_BY_REFERENCE
decides that variables of type vint32m1 can be passed through two scalar
registers, but when GCC calls FUNCTION_VALUE (which calls riscv_get_arg_info
internally) it gets NULL_RTX back. The two hooks are inconsistent. The current
fix is to pass all vector arguments and return values through the function
stack; a new calling convention for vector registers will be added in the
future.
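
The mismatch described above can be sketched with a toy model. This is Python,
not GCC code: the function names only echo the hooks mentioned in this mail,
the 8-byte UNITS_PER_WORD is an assumed rv64 value, and "a0" stands in for any
register location.

```python
# Toy model of the inconsistency; not GCC code. All names are illustrative.
UNITS_PER_WORD = 8  # bytes per integer register, assumed rv64 value


def pass_by_reference_old(size, is_vector_mode):
    """Before the patch: only the size is checked, vector modes are not special."""
    return not (0 <= size <= 2 * UNITS_PER_WORD)


def pass_by_reference_new(size, is_vector_mode):
    """After the patch: every vector mode is passed by reference."""
    if is_vector_mode:
        return True
    return not (0 <= size <= 2 * UNITS_PER_WORD)


def function_value(is_vector_mode):
    """Models riscv_get_arg_info: no register location (None) for vector modes."""
    return None if is_vector_mode else "a0"


def hooks_agree(pass_by_reference, size, is_vector_mode):
    """The hooks disagree when a by-value object has no register location."""
    by_value = not pass_by_reference(size, is_vector_mode)
    return not (by_value and function_value(is_vector_mode) is None)
```

In this model a 16-byte fixed-length vector is treated as "by value" by the old
size check while the other hook offers no register for it; with the added
vector-mode check the two answers line up.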
 
Best,
Lehua

PR target/110119

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_get_arg_info): Return NULL_RTX for 
vector mode
(riscv_pass_by_reference): Return true for vector mode

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr110119-1.c: New test.
* gcc.target/riscv/rvv/base/pr110119-2.c: New test.

---
 gcc/config/riscv/riscv.cc | 17 
 .../gcc.target/riscv/rvv/base/pr110119-1.c| 26 +++
 .../gcc.target/riscv/rvv/base/pr110119-2.c| 26 +++
 3 files changed, 64 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index dd5361c2bd2a..e5ae4e81b7a5 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3915,13 +3915,13 @@ riscv_get_arg_info (struct riscv_arg_info *info, const 
CUMULATIVE_ARGS *cum,
   riscv_pass_in_vector_p (type);
 }
 
-  /* TODO: Currently, it will cause an ICE for --param
- riscv-autovec-preference=fixed-vlmax. So, we just return NULL_RTX here
- let GCC generate loads/stores. Ideally, we should either warn the user not
- to use an RVV vector type as function argument or support the calling
- convention directly.  */
+  /* All current vector arguments and return values are passed through the
+ function stack. Ideally, we should either warn the user not to use an RVV
+ vector type as function argument or support a calling convention
+ with better performance.  */
   if (riscv_v_ext_mode_p (mode))
 return NULL_RTX;
+
   if (named)
 {
   riscv_aggregate_field fields[2];
@@ -4106,6 +4106,13 @@ riscv_pass_by_reference (cumulative_args_t cum_v, const 
function_arg_info &arg)
return false;
 }
 
+  /* All current vector arguments and return values are passed through the
+ function stack. Ideally, we should either warn the user not to use an RVV
+ vector type as function argument or support a calling convention
+ with better performance.  */
+  if (riscv_v_ext_mode_p (arg.mode))
+return true;
+
   /* Pass by reference if the data do not fit in two integer registers.  */
   return !IN_RANGE (size, 0, 2 * UNITS_PER_WORD);
 }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-1.c
new file mode 100644
index ..f16502bcfeec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv --param=riscv-autovec-preference=fixed-vlmax" 
} */
+
+#include "riscv_vector.h"
+
+typedef int8_t vnx2qi __attribute__ ((vector_size (2)));
+
+__attribute__ ((noipa)) vnx2qi
+f_vnx2qi (int8_t a, int8_t b, int8_t *out)
+{
+  vnx2qi v = {a, b};
+  return v;
+}
+
+__attribute__ ((noipa)) vnx2qi
+f_vnx2qi_2 (vnx2qi a, int8_t *out)
+{
+  return a;
+}
+
+__attribute__ ((noipa)) vint32m1_t
+f_vint32m1 (int8_t *a, int8_t *out)
+{
+  vint32m1_t v = *(vint32m1_t *) a;
+  return v;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-2.c
new file mode 100644
index ..b233ff1e9040
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110119-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gczve32x 
--param=riscv-autovec-preference=fixed-vlmax" } */
+
+#include 
+#include "riscv_vector.h"
+
+__attribute__ ((noipa)) vint32m1x3_t
+foo1 (int32_t *in, int vl)
+{
+  vint32m1x3_t v = __riscv_vlseg3e32_v_i32m1x3 (in, vl);
+  return v;
+}
+
+__attribute__ ((noipa)) void
+foo2 (vint32m1x3_t a, int32_t *out, int vl)
+{
+  __riscv_vsseg3e32_v_i32m1x3 (out, a, vl);
+}
+
+__attribute__ ((noipa)) vint32m1x3_t
+foo3 (vint32m1x3_t a, int32_t *out, int32_t *in, int vl)
+{
+  __riscv_vsseg3e32_v_i32m1x3 (out, a, vl);
+  vint32m1x3_t v = __riscv_vlseg3e32_v_i32m1x3 (in, vl);
+  return v;
+}
-- 
2.36.3



Re: [PATCH] RISC-V: Ensure vector args and return use function stack to pass [PR110119]

2023-06-14 Thread Lehua Ding
> \ No newline at end of file
> Add newline for each test.



Addressed this comment; the V2 patch link is below.


https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621698.html
 
Best,
Lehua


  

Re:RE: [PATCH] RISC-V: Ensure vector args and return use function stack to pass [PR110119]

2023-06-14 Thread Lehua Ding
> Nit for test.
> +/* { dg-options "-march=rv64gczve32x
> +--param=riscv-autovec-preference=fixed-vlmax" } */
> To
> +/* { dg-options "-march=rv64gc_zve32x 
--param=riscv-autovec-preference=fixed-vlmax" } */
Fixed in the V2 patch 
(https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621698.html), thank you.


Best,
Lehua
 

[PATCH] RISC-V: Throw compilation error for unknown sub-extension or supervisor extension

2023-07-11 Thread Lehua Ding
Hi,

This tiny patch adds a check for extensions that start with 'z' or 's' in the
`-march` option. Currently such an unknown extension is passed on to the
assembler, which then reports an error. With this patch, the compiler itself
reports an error if an extension that starts with 'z' or 's' is not a standard
sub-extension or supervisor extension.

e.g.:

Running `riscv64-unknown-elf-gcc -march=rv64gcv_zvl128_s123 a.c` reports these
errors:

riscv64-unknown-elf-gcc: error: '-march=rv64gcv_zvl128_s123': extension 'zvl' 
starts with `z` but is not a standard sub-extension
riscv64-unknown-elf-gcc: error: '-march=rv64gcv_zvl128_s123': extension 's123' 
start with `s` but not a standard supervisor extension
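
The idea of the check can be sketched as follows. This is a Python sketch, not
the C++ in the patch: KNOWN_EXTENSIONS is a small hypothetical stand-in for the
real extension table, and the message wording is only approximate.

```python
# Hypothetical, tiny stand-in for the table of known extension names.
KNOWN_EXTENSIONS = {
    "i", "m", "a", "f", "d", "c", "v",
    "zicsr", "zifencei", "zmmul", "zvl128b",
    "svinval", "svnapot",
}


def check_extension(ext):
    """Return an error message for an unknown 'z'/'s' extension, else None."""
    if ext in KNOWN_EXTENSIONS:
        return None
    if ext.startswith("z"):
        return f"extension '{ext}' starts with 'z' but is not a standard sub-extension"
    if ext.startswith("s"):
        return (f"extension '{ext}' starts with 's' but is not a standard "
                "supervisor extension")
    return None  # other prefixes (e.g. 'x') are not checked in this sketch


def check_march_extensions(extensions):
    """Collect the error messages for a list of multi-letter extension names."""
    return [msg for msg in map(check_extension, extensions) if msg is not None]
```

For example, `check_march_extensions(["zvl128", "s123"])` yields one message per
name, while `["zmmul", "svnapot"]` yields none.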

Best,
Lehua

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (standard_extensions_p): New func.
(riscv_subset_list::add): Add check.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-3.c: Update -march.
* gcc.target/riscv/arch-5.c: Ditto.
* gcc.target/riscv/arch-8.c: Ditto.
* gcc.target/riscv/attribute-10.c: Ditto.
* gcc.target/riscv/attribute-9.c: Ditto.
* gcc.target/riscv/pr102957.c: Ditto.
* gcc.target/riscv/arch-22.cc: New test.

---
 gcc/common/config/riscv/riscv-common.cc   | 29 +++
 gcc/testsuite/gcc.target/riscv/arch-22.cc |  8 +
 gcc/testsuite/gcc.target/riscv/arch-3.c   |  2 +-
 gcc/testsuite/gcc.target/riscv/arch-5.c   |  2 +-
 gcc/testsuite/gcc.target/riscv/arch-8.c   |  2 +-
 gcc/testsuite/gcc.target/riscv/attribute-10.c |  2 +-
 gcc/testsuite/gcc.target/riscv/attribute-9.c  |  4 +--
 gcc/testsuite/gcc.target/riscv/pr102957.c |  2 ++
 8 files changed, 45 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-22.cc

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 6091d8f281b..df3c256c80c 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -518,6 +518,18 @@ subset_cmp (const std::string &a, const std::string &b)
 }
 }
 
+/* Return true if EXT is a standard extension.  */
+
+static bool
+standard_extensions_p (const char *ext)
+{
+  const riscv_ext_version *ext_ver;
+  for (ext_ver = &riscv_ext_version_table[0]; ext_ver->name != NULL; ++ext_ver)
+if (strcmp (ext, ext_ver->name) == 0)
+  return true;
+  return false;
+}
+
 /* Add new subset to list.  */
 
 void
@@ -546,6 +558,23 @@ riscv_subset_list::add (const char *subset, int 
major_version,
 
   return;
 }
+  else if (subset[0] == 'z' && !standard_extensions_p (subset))
+{
+  error_at (m_loc,
+   "%<-march=%s%>: extension %qs starts with `z` but is not a "
+   "standard sub-extension",
+   m_arch, subset);
+  return;
+}
+  else if (subset[0] == 's' && !standard_extensions_p (subset))
+{
+  error_at (
+   m_loc,
+   "%<-march=%s%>: extension %qs start with `s` but not a standard "
+   "supervisor extension",
+   m_arch, subset);
+  return;
+}
 
   riscv_subset_t *s = new riscv_subset_t ();
   riscv_subset_t *itr;
diff --git a/gcc/testsuite/gcc.target/riscv/arch-22.cc 
b/gcc/testsuite/gcc.target/riscv/arch-22.cc
new file mode 100644
index 000..f9d8b57cb20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/arch-22.cc
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128_z123_s123 -mabi=lp64d" } */
+int foo()
+{
+}
+/* { dg-error "extension 'zvl128' start with `z` but not a standard 
sub-extension" "" { target *-*-* } 0 } */
+/* { dg-error "extension 'z123' start with `z` but not a standard 
sub-extension" "" { target *-*-* } 0 } */
+/* { dg-error "extension 's123' start with `s` but not a standard supervisor 
extension" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/riscv/arch-3.c 
b/gcc/testsuite/gcc.target/riscv/arch-3.c
index 7aa945eca20..dee0fc6656d 100644
--- a/gcc/testsuite/gcc.target/riscv/arch-3.c
+++ b/gcc/testsuite/gcc.target/riscv/arch-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32isabc_xbar -mabi=ilp32" } */
+/* { dg-options "-march=rv32isvinval_xbar -mabi=ilp32" } */
 int foo()
 {
 }
diff --git a/gcc/testsuite/gcc.target/riscv/arch-5.c 
b/gcc/testsuite/gcc.target/riscv/arch-5.c
index 8258552214f..8bdaa9d17b2 100644
--- a/gcc/testsuite/gcc.target/riscv/arch-5.c
+++ b/gcc/testsuite/gcc.target/riscv/arch-5.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32i_zfoo_sabc_xbar -mabi=ilp32" } */
+/* { dg-options "-march=rv32i_zmmul_svnapot_xbar -mabi=ilp32" } */
 int foo()
 {
 }
diff --git a/gcc/testsuite/gcc.target/riscv/arch-8.c 
b/gcc/testsuite/gcc.target/riscv/arch-8.c
index 1b9e51b0e12..ef557aeb673 100644
--- a/gcc/testsuite/gcc.target/riscv/arch-8.c
+++ b/gcc/testsuite/gcc.target/riscv/arch-8.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32id_zicsr_zifence -mabi=ilp

[PATCH] mklog: Add --append option to auto add generate ChangeLog to patch file

2023-07-11 Thread Lehua Ding
Hi,

This tiny patch adds an --append option to mklog.py that adds the generated
ChangeLog to the corresponding patch file. With this option there is no need
to manually copy the generated ChangeLog into the patch file. e.g.:

Running `mklog.py -a /path/to/this/patch` will add the generated ChangeLog

```
contrib/ChangeLog:

* mklog.py:
```

at the right place in the /path/to/this/patch file.
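
A standalone sketch of the insertion logic follows, for readers who want to try
it outside mklog.py. `insert_changelog` and the state names are hypothetical;
the regular expression is the one used in the patch. One deliberate difference:
as far as the posted diff shows, a `---` line that is not followed by a
diffstat line is not written back, whereas this sketch re-emits it.

```python
import re

# Matches a diffstat line such as " contrib/mklog.py | 27 ++-".
DIFFSTAT_RE = re.compile(r"\s[^\s]+\s+\|\s\d+\s[+\-]+\n")


def insert_changelog(patch_lines, changelog):
    """Insert CHANGELOG before the first '---' line followed by a diffstat line."""
    result = []
    state = "search"  # "search" -> "seen_dashes" -> "done"
    for line in patch_lines:
        if state == "seen_dashes":
            if DIFFSTAT_RE.match(line):
                # Real start of the diffstat: emit ChangeLog, then the held '---'.
                result += [changelog, "---\n", line]
                state = "done"
                continue
            # False alarm: put the held-back '---' line back and keep searching.
            result.append("---\n")
            state = "search"
        if state == "search" and line == "---\n":
            state = "seen_dashes"  # hold this line until the next one is seen
        else:
            result.append(line)
    if state == "seen_dashes":
        result.append("---\n")  # file ended right after a '---' line
    return result
```

Reading the patch with `f.readlines()`, passing the list through this function
and writing it back with `f.writelines()` gives the same overall effect as the
`-a` option described above.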

Best,
Lehua

contrib/ChangeLog:

* mklog.py: Add --append option.

---
 contrib/mklog.py | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 777212c98d7..26230b9b4f2 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -358,6 +358,8 @@ if __name__ == '__main__':
  'file')
 parser.add_argument('--update-copyright', action='store_true',
 help='Update copyright in ChangeLog files')
+parser.add_argument('-a', '--append', action='store_true',
+help='Append the generate ChangeLog to the patch file')
 args = parser.parse_args()
 if args.input == '-':
 args.input = None
@@ -370,7 +372,30 @@ if __name__ == '__main__':
 else:
 output = generate_changelog(data, args.no_functions,
 args.fill_up_bug_titles, args.pr_numbers)
-if args.changelog:
+if args.append:
+if (not args.input):
+raise Exception("`-a or --append` option not support standard 
input")
+lines = []
+with open(args.input, 'r', newline='\n') as f:
+# 1 -> not find the possible start of diff log
+# 2 -> find the possible start of diff log
+# 3 -> finish add ChangeLog to the patch file
+maybe_diff_log = 1
+for line in f:
+if maybe_diff_log == 1 and line == "---\n":
+maybe_diff_log = 2
+elif maybe_diff_log == 2 and \
+ re.match("\s[^\s]+\s+\|\s\d+\s[+\-]+\n", line):
+lines += [output, "---\n", line]
+maybe_diff_log = 3
+else:
+# the possible start is not the true start.
+if maybe_diff_log == 2:
+maybe_diff_log = 1
+lines.append(line)
+with open(args.input, "w") as f:
+f.writelines(lines)
+elif args.changelog:
 lines = open(args.changelog).read().split('\n')
 start = list(takewhile(skip_line_in_changelog, lines))
 end = lines[len(start):]
-- 
2.36.1



Re: [PATCH] mklog: Add --append option to auto add generate ChangeLog to patch file

2023-07-12 Thread Lehua Ding
Committed to the trunk, thanks Jeff.
 
 
-- Original --
From:  "Jeff Law"

[PATCH V2] RISC-V: Throw compilation error for unknown sub-extension or supervisor extension

2023-07-13 Thread Lehua Ding
Hi,

This tiny patch adds a check for extensions that start with 'z' or 's' in the
`-march` option. Currently such an unknown extension is passed on to the
assembler, which then reports an error. With this patch, the compiler itself
reports an error if an extension that starts with 'z' or 's' is not a standard
sub-extension or supervisor extension. The patch also makes two extra changes.
The first is to reduce repeated errors, which are currently reported at least
twice. The second is to report as many errors as possible.

e.g.:

Running `riscv64-unknown-elf-gcc -march=rv64gvcw_zvl128_s123_x123 -mabi=lp64d a.c`
reports these errors:

riscv64-unknown-elf-gcc: error: '-march=rv64gcv_zvl128_s123': ISA string is not 
in canonical order. 'c'
riscv64-unknown-elf-gcc: error: '-march=rv64gcv_zvl128_s123': extension 'w' is 
unsupported standard single letter extension
riscv64-unknown-elf-gcc: error: '-march=rv64gcv_zvl128_s123': extension 
'zvl128' start with `z` but is unsupported standard extension
riscv64-unknown-elf-gcc: error: '-march=rv64gcv_zvl128_s123': extension 's123' 
start with `s` but is unsupported standard supervisor extension
riscv64-unknown-elf-gcc: error: '-march=rv64gcv_zvl128_s123': extension 'x123' 
start with `x` but is unsupported non-standard extension

Best,
Lehua

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_supported_std_ext): Init.
(standard_extensions_p): Add check.
(riscv_subset_list::add): Just return NULL if it failed before.
(riscv_subset_list::parse_std_ext): Continue parse when find a error
(riscv_subset_list::parse): Just return NULL if it failed before.
* config/riscv/riscv-subset.h (class riscv_subset_list): Add field.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-2.c: Update -march.
* gcc.target/riscv/arch-3.c: Ditto.
* gcc.target/riscv/arch-5.c: Ditto.
* gcc.target/riscv/arch-8.c: Ditto.
* gcc.target/riscv/attribute-10.c: Ditto.
* gcc.target/riscv/attribute-18.c: Ditto.
* gcc.target/riscv/attribute-19.c: Ditto.
* gcc.target/riscv/attribute-8.c: Ditto.
* gcc.target/riscv/attribute-9.c: Ditto.
* gcc.target/riscv/pr102957.c: Ditto.
* gcc.target/riscv/arch-22.cc: New test.

---
 gcc/common/config/riscv/riscv-common.cc   | 68 +++
 gcc/config/riscv/riscv-subset.h   |  5 ++
 gcc/testsuite/gcc.target/riscv/arch-2.c   |  2 +-
 gcc/testsuite/gcc.target/riscv/arch-22.cc | 11 +++
 gcc/testsuite/gcc.target/riscv/arch-3.c   |  2 +-
 gcc/testsuite/gcc.target/riscv/arch-5.c   |  2 +-
 gcc/testsuite/gcc.target/riscv/arch-8.c   |  2 +-
 gcc/testsuite/gcc.target/riscv/attribute-10.c |  2 +-
 gcc/testsuite/gcc.target/riscv/attribute-18.c |  4 +-
 gcc/testsuite/gcc.target/riscv/attribute-19.c |  4 +-
 gcc/testsuite/gcc.target/riscv/attribute-8.c  |  4 +-
 gcc/testsuite/gcc.target/riscv/attribute-9.c  |  4 +-
 gcc/testsuite/gcc.target/riscv/pr102957.c |  2 +
 13 files changed, 87 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-22.cc

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 6091d8f281b..9de7c54269e 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -311,6 +311,8 @@ static const char *riscv_tunes[] =
 
 static const char *riscv_supported_std_ext (void);
 
+bool riscv_subset_list::parse_failed = false;
+
 static riscv_subset_list *current_subset_list = NULL;
 
 const riscv_subset_list *riscv_current_subset_list ()
@@ -518,6 +520,18 @@ subset_cmp (const std::string &a, const std::string &b)
 }
 }
 
+/* Return true if EXT is a standard extension.  */
+
+static bool
+standard_extensions_p (const char *ext)
+{
+  const riscv_ext_version *ext_ver;
+  for (ext_ver = &riscv_ext_version_table[0]; ext_ver->name != NULL; ++ext_ver)
+if (strcmp (ext, ext_ver->name) == 0)
+  return true;
+  return false;
+}
+
 /* Add new subset to list.  */
 
 void
@@ -546,6 +560,38 @@ riscv_subset_list::add (const char *subset, int 
major_version,
 
   return;
 }
+  else if (strlen (subset) == 1 && !standard_extensions_p (subset))
+{
+  error_at (m_loc,
+   "%<-march=%s%>: extension %qs is unsupported standard single "
+   "letter extension",
+   m_arch, subset);
+  return;
+}
+  else if (subset[0] == 'z' && !standard_extensions_p (subset))
+{
+  error_at (m_loc,
+   "%<-march=%s%>: extension %qs starts with `z` but is "
+   "unsupported standard extension",
+   m_arch, subset);
+  return;
+}
+  else if (subset[0] == 's' && !standard_extensions_p (subset))
+{
+  error_at (m_loc,
+   "%<-march=%s%>: extension %qs start with `s` but is "
+   "unsupported standard supervisor extension",
+   m_arch, subset);
+  return;
+  

Re: [PATCH] RISC-V: Throw compilation error for unknown sub-extension or supervisor extension

2023-07-13 Thread Lehua Ding
Thanks for the review. I uploaded version V2, which addresses Kito's comments,
along with two changes. The first is to reduce repeated errors, which are
currently reported at least twice. The second is to report as many errors as
possible.


V2 URL: https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624377.html


Best,
Lehua
 
-- Original --
From:  "Kito Cheng"

https://github.com/riscv-non-isa/riscv-c-api-doc/blob/master/riscv-c-api.md#architecture-extension-test-macro
[2] https://github.com/riscv/riscv-isa-manual/blob/main/src/naming.adoc#additional-standard-extension-names

[PATCH] RISC-V: Ensure all implied extensions are included[PR110696]

2023-07-17 Thread Lehua Ding
Hi,

This patch fixes PR target/110696 by recursively adding all implied extensions.
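
The recursion can be sketched like this. It is a Python sketch, not the C++ in
the patch: IMPLIED is a small illustrative excerpt, not the real
riscv_implied_info table, and the function names are hypothetical.

```python
# Illustrative excerpt of an implication table: extension -> directly implied.
IMPLIED = {
    "zvl4096b": ["zvl2048b"],
    "zvl2048b": ["zvl1024b"],
    "zvl1024b": ["zvl512b"],
}


def add_implied(ext, subsets):
    """Add everything EXT implies to SUBSETS, following implications recursively."""
    for implied in IMPLIED.get(ext, []):
        if implied in subsets:
            continue  # skip if the implied extension is already present
        subsets.append(implied)
        add_implied(implied, subsets)  # the implied extension may imply more


def check_implied(subsets):
    """Return True if every implied extension of every subset is present."""
    return all(implied in subsets
               for ext in subsets
               for implied in IMPLIED.get(ext, []))
```

Starting from `["zvl4096b"]`, `add_implied("zvl4096b", subsets)` walks the whole
chain down to `zvl512b`, and `check_implied(subsets)` then holds, which mirrors
the assertion the patch adds after the implied-extension pass.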

Best,
Lehua

PR target/110696

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc 
(riscv_subset_list::handle_implied_ext): recur add all implied extensions.
(riscv_subset_list::check_implied_ext): Add new method.
(riscv_subset_list::parse): Call checker check_implied_ext.
* config/riscv/riscv-subset.h: Add new method.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/attribute-20.c: New test.
* gcc.target/riscv/pr110696.c: New test.

---
 gcc/common/config/riscv/riscv-common.cc   | 33 +--
 gcc/config/riscv/riscv-subset.h   |  3 +-
 gcc/testsuite/gcc.target/riscv/attribute-20.c |  7 
 gcc/testsuite/gcc.target/riscv/pr110696.c |  7 
 4 files changed, 46 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-20.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr110696.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 28c8f0c1489..19075c0b241 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -949,14 +949,14 @@ riscv_subset_list::parse_std_ext (const char *p)
 
 /* Check any implied extensions for EXT.  */
 void
-riscv_subset_list::handle_implied_ext (riscv_subset_t *ext)
+riscv_subset_list::handle_implied_ext (const char *ext)
 {
   const riscv_implied_info_t *implied_info;
   for (implied_info = &riscv_implied_info[0];
implied_info->ext;
++implied_info)
 {
-  if (strcmp (ext->name.c_str (), implied_info->ext) != 0)
+  if (strcmp (ext, implied_info->ext) != 0)
continue;
 
   /* Skip if implied extension already present.  */
@@ -966,6 +966,9 @@ riscv_subset_list::handle_implied_ext (riscv_subset_t *ext)
   /* Version of implied extension will get from current ISA spec
 version.  */
   add (implied_info->implied_ext, true);
+
+  /* Recursively add implied extension by implied_info->implied_ext.  */
+  handle_implied_ext (implied_info->implied_ext);
 }
 
   /* For RISC-V ISA version 2.2 or earlier version, zicsr and zifence is
@@ -980,6 +983,27 @@ riscv_subset_list::handle_implied_ext (riscv_subset_t *ext)
 }
 }
 
+/* Check that all implied extensions are included.  */
+bool
+riscv_subset_list::check_implied_ext ()
+{
+  riscv_subset_t *itr;
+  for (itr = m_head; itr != NULL; itr = itr->next)
+{
+  const riscv_implied_info_t *implied_info;
+  for (implied_info = &riscv_implied_info[0]; implied_info->ext;
+  ++implied_info)
+   {
+ if (strcmp (itr->name.c_str(), implied_info->ext) != 0)
+   continue;
+
+ if (!lookup (implied_info->implied_ext))
+   return false;
+   }
+}
+  return true;
+}
+
 /* Check any combine extensions for EXT.  */
 void
 riscv_subset_list::handle_combine_ext ()
@@ -1194,9 +1218,12 @@ riscv_subset_list::parse (const char *arch, location_t 
loc)
 
   for (itr = subset_list->m_head; itr != NULL; itr = itr->next)
 {
-  subset_list->handle_implied_ext (itr);
+  subset_list->handle_implied_ext (itr->name.c_str ());
 }
 
+  /* Make sure all implied extensions are included. */
+  gcc_assert (subset_list->check_implied_ext ());
+
   subset_list->handle_combine_ext ();
 
   if (subset_list->lookup ("zfinx") && subset_list->lookup ("f"))
diff --git a/gcc/config/riscv/riscv-subset.h b/gcc/config/riscv/riscv-subset.h
index 92e4fb31692..84a7a82db63 100644
--- a/gcc/config/riscv/riscv-subset.h
+++ b/gcc/config/riscv/riscv-subset.h
@@ -67,7 +67,8 @@ private:
   const char *parse_multiletter_ext (const char *, const char *,
 const char *);
 
-  void handle_implied_ext (riscv_subset_t *);
+  void handle_implied_ext (const char *);
+  bool check_implied_ext ();
   void handle_combine_ext ();
 
 public:
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-20.c 
b/gcc/testsuite/gcc.target/riscv/attribute-20.c
new file mode 100644
index 000..f7d0b29b71c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/attribute-20.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl65536b -mabi=lp64d" } */
+int foo()
+{
+}
+
+/* { dg-final { scan-assembler ".attribute arch, 
\"rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl1024b1p0_zvl128b1p0_zvl16384b1p0_zvl2048b1p0_zvl256b1p0_zvl32768b1p0_zvl32b1p0_zvl4096b1p0_zvl512b1p0_zvl64b1p0_zvl65536b1p0_zvl8192b1p0\""
 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/pr110696.c 
b/gcc/testsuite/gcc.target/riscv/pr110696.c
new file mode 100644
index 000..a630f04e74f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr110696.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl4096b -mabi=lp64d" } */
+int foo()
+{
+}
+
+/* { dg-final { scan-assembl

Re: [PATCH] RISC-V: Ensure all implied extensions are included[PR110696]

2023-07-17 Thread Lehua Ding
Committed to the trunk, thank you.
 
 
-- Original --
From:  "Kito Cheng"

Re:Fw: [PATCH V2] RTL_SSA: Relax PHI_MODE in phi_setup

2023-07-17 Thread Lehua Ding
Committed to the trunk, thanks Richard and Juzhe.


1. Bootstrap and regression tests pass on the i386 target (run by Pan).
2. No new failing testcases on the AArch64 target.


Best,
Lehua


-- Original --
From:  "Richard Sandiford"



[PATCH] RISC-V: Remove testcase that cannot be compiled because VLEN limitation

2023-07-17 Thread Lehua Ding
Hi,

Since a later patch
(https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624689.html)
forbids VLEN > 4096, the testcase attribute-20.c is no longer needed. This is
obvious.

Best,
Lehua

gcc/testsuite/ChangeLog:

* gcc.target/riscv/attribute-20.c: Removed.

---
 gcc/testsuite/gcc.target/riscv/attribute-20.c | 7 ---
 1 file changed, 7 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/riscv/attribute-20.c

diff --git a/gcc/testsuite/gcc.target/riscv/attribute-20.c 
b/gcc/testsuite/gcc.target/riscv/attribute-20.c
deleted file mode 100644
index f7d0b29b71c..000
--- a/gcc/testsuite/gcc.target/riscv/attribute-20.c
+++ /dev/null
@@ -1,7 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-march=rv64gcv_zvl65536b -mabi=lp64d" } */
-int foo()
-{
-}
-
-/* { dg-final { scan-assembler ".attribute arch, 
\"rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl1024b1p0_zvl128b1p0_zvl16384b1p0_zvl2048b1p0_zvl256b1p0_zvl32768b1p0_zvl32b1p0_zvl4096b1p0_zvl512b1p0_zvl64b1p0_zvl65536b1p0_zvl8192b1p0\""
 } } */
-- 
2.36.3



Re: [PATCH] RISC-V: Remove testcase that cannot be compiled because VLEN limitation

2023-07-18 Thread Lehua Ding
Committed to the trunk, thank you.

-- Original --
From:  "juzhe.zh...@rivai.ai"
 

[PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Lehua Ding
Hi,

This patch fixes a testcase that fails when RISC-V GCC is built with
-mcmodel=medany as the default. With medany, the stack_save_restore.c testcase
fails because the s3 register is no longer used in the assembly (so
__riscv_save/restore_3 is called instead of __riscv_save/restore_4). Explicitly
adding -mcmodel=medlow solves this problem.

Best,
Lehua

gcc/testsuite/ChangeLog:

* gcc.target/riscv/stack_save_restore.c: Add -mcmodel=medlow

---
 gcc/testsuite/gcc.target/riscv/stack_save_restore.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/riscv/stack_save_restore.c 
b/gcc/testsuite/gcc.target/riscv/stack_save_restore.c
index 522e706cfbf..a2430783474 100644
--- a/gcc/testsuite/gcc.target/riscv/stack_save_restore.c
+++ b/gcc/testsuite/gcc.target/riscv/stack_save_restore.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32imafc -mabi=ilp32f -msave-restore -O2 
-fno-schedule-insns -fno-schedule-insns2 -fno-unroll-loops -fno-peel-loops 
-fno-lto" } */
+/* { dg-options "-march=rv32imafc -mabi=ilp32f -msave-restore -O2 
-fno-schedule-insns -fno-schedule-insns2 -fno-unroll-loops -fno-peel-loops 
-fno-lto -mcmodel=medlow" } */
 /* { dg-final { check-function-bodies "**" "" } } */
 
 char my_getchar();
-- 
2.36.1



Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Lehua Ding
Hi Robin,


> Wouldn't you rather want to adjust the test to not check for one register
> number but 3 or 4 instead?

I think the purpose of this testcase is to check whether the modifications to
the stack frame are as expected, so it is necessary to specify exactly whether
three or four registers are saved. But I think we need to add another testcase
that uses the other option, -mcmodel=medany, for coverage.


> There might be future changes in default behavior
> that would invalidate the test as well.

Because -mcmodel is explicitly specified in the testcase, future changes
to the default value of -mcmodel will not cause the test case to fail.


Best,
Lehua

Re: [PATCH V2] RISC-V: Enable SLP un-order reduction

2023-07-18 Thread Lehua Ding
Committed to the trunk, thanks Robin.

Re: [PATCH] RISC-V: Dynamic adjust size of VLA vector according to TARGET_MIN_VLEN

2023-07-18 Thread Lehua Ding
> LGTM, thanks:)


Committed to the trunk, thanks Kito and Juzhe.

Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Lehua Ding
Hi Robin,

> In general I'm fine with this small change of course, I just wonder if
> the testcase is not brittle anyway. From what I can tell the respective
> change is independent of the actual number of registers so maybe it's enough
> to not compare the fully body but just make sure the addis are not present?
> That way, the test could also work for -march=rv64 (which saves one
> register less anyway regardless of mcmodel - but the change still helps)
> or maybe even with instruction scheduling.  Would you mind checking this
> still?

I think you are right. I would like to remove the `-mcmodel=medany` option and
relax the assertion from `__riscv_save/restore_4` to
`__riscv_save/restore_(3|4)` so that this testcase is not brittle under any
-mcmodel. I'm also going to add another testcase that uses -march=rv64imafc
(I don't know how to run both -march=rv32imafc and -march=rv64imafc on the
same testcase).

Removing the scheduling options would change the order of the asserted
assembly, so I would rather keep them, because the order may differ between
microarchitectures.

Best,
Lehua

V2 patch:

gcc/testsuite/ChangeLog:

* gcc.target/riscv/stack_save_restore.c: Moved to...
* gcc.target/riscv/stack_save_restore_2.c: ...here.
* gcc.target/riscv/stack_save_restore_1.c: New test.

---
 .../gcc.target/riscv/stack_save_restore_1.c   | 40 +++
 ..._save_restore.c => stack_save_restore_2.c} |  6 +--
 2 files changed, 43 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/stack_save_restore_1.c
 rename gcc/testsuite/gcc.target/riscv/{stack_save_restore.c => stack_save_restore_2.c} (90%)

diff --git a/gcc/testsuite/gcc.target/riscv/stack_save_restore_1.c b/gcc/testsuite/gcc.target/riscv/stack_save_restore_1.c
new file mode 100644
index 000..255ce5f40c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/stack_save_restore_1.c
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64imafc -mabi=lp64f -msave-restore -O2 -fno-schedule-insns -fno-schedule-insns2 -fno-unroll-loops -fno-peel-loops -fno-lto" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+char my_getchar();
+float getf();
+
+/*
+** bar:
+** call t0,__riscv_save_(3|4)
+** addi sp,sp,-2032
+** ...
+** li  t0,-12288
+** add sp,sp,t0
+** ...
+** li  t0,12288
+** add sp,sp,t0
+** ...
+** addi sp,sp,2032
+** tail __riscv_restore_(3|4)
+*/
+int bar()
+{
+  float volatile farray[3568];
+
+  float sum = 0;
+  float f1 = getf();
+  float f2 = getf();
+  float f3 = getf();
+  float f4 = getf();
+
+  for (int i = 0; i < 3568; i++)
+  {
+farray[i] = my_getchar() * 1.2;
+sum += farray[i];
+  }
+
+  return sum + f1 + f2 + f3 + f4;
+}
+
diff --git a/gcc/testsuite/gcc.target/riscv/stack_save_restore.c b/gcc/testsuite/gcc.target/riscv/stack_save_restore_2.c
similarity index 90%
rename from gcc/testsuite/gcc.target/riscv/stack_save_restore.c
rename to gcc/testsuite/gcc.target/riscv/stack_save_restore_2.c
index 522e706cfbf..4ce5e0118a4 100644
--- a/gcc/testsuite/gcc.target/riscv/stack_save_restore.c
+++ b/gcc/testsuite/gcc.target/riscv/stack_save_restore_2.c
@@ -6,8 +6,8 @@ char my_getchar();
 float getf();
 
 /*
-**bar:
-** call t0,__riscv_save_4
+** bar:
+** call t0,__riscv_save_(3|4)
 ** addi sp,sp,-2032
 ** ...
 ** li  t0,-12288
@@ -17,7 +17,7 @@ float getf();
 ** add sp,sp,t0
 ** ...
 ** addi sp,sp,2032
-** tail __riscv_restore_4
+** tail __riscv_restore_(3|4)
 */
 int bar()
 {
-- 
2.36.3



Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-19 Thread Lehua Ding
Committed V2 patch, thank you so much.





Re: [PATCH V2] RISC-V: Throw compilation error for unknown sub-extension or supervisor extension

2023-07-19 Thread Lehua Ding
Committed to the trunk, thank you so much.

[PATCH] mklog: fix bugs of --append option

2023-07-19 Thread Lehua Ding
Hi,

This small patch fixes two bugs in the --append option of mklog.py.
First, the regexp was not accurate enough to find the top of the
diffstat area. Second, when a `---` line turns out not to be the true
start, it must be added back to the patch file.
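
To illustrate the first bug, here is a minimal sketch that runs the old and the new pattern from the diff below against two diffstat lines. The sample lines are made up for illustration, and the assumption is that the mismatch comes from git padding the change count with more than one space when the counts have different widths.

```python
import re

# Both patterns are copied from the patch, written here as raw strings.
OLD_DIFFSTAT = re.compile(r"\s[^\s]+\s+\|\s\d+\s[+\-]+\n")
NEW_DIFFSTAT = re.compile(r"\s[^\s]+\s+\|\s+\d+\s[\+\-]+\n")

# Hypothetical diffstat lines: one space vs. two spaces before the count.
single_pad = " contrib/mklog.py | 3 ++-\n"
multi_pad = " gcc/config/riscv/riscv.cc |  12 +-\n"


def matches(pattern, line):
    """Return True when the pattern matches at the start of the line."""
    return pattern.match(line) is not None


for line in (single_pad, multi_pad):
    print(repr(line), matches(OLD_DIFFSTAT, line), matches(NEW_DIFFSTAT, line))
```

Only the new pattern accepts the padded line, which is consistent with the change from `\s` to `\s+` after the `|`.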

contrib/ChangeLog:

* mklog.py: Fix regexp and add missing `---`.

---
 contrib/mklog.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 26230b9b4f2..bd81c5ba92c 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -385,12 +385,13 @@ if __name__ == '__main__':
 if maybe_diff_log == 1 and line == "---\n":
 maybe_diff_log = 2
 elif maybe_diff_log == 2 and \
- re.match("\s[^\s]+\s+\|\s\d+\s[+\-]+\n", line):
+ re.match("\s[^\s]+\s+\|\s+\d+\s[\+\-]+\n", line):
 lines += [output, "---\n", line]
 maybe_diff_log = 3
 else:
 # the possible start is not the true start.
 if maybe_diff_log == 2:
+lines.append("---\n")
 maybe_diff_log = 1
 lines.append(line)
 with open(args.input, "w") as f:
-- 
2.36.1



[PATCH 1/3] RISC-V: Part-1: Select suitable vector registers for vector type args and returns

2023-07-20 Thread Lehua Ding
I have posted below the vector register calling convention rules from the
proposal[1]:

v0 is used to pass the first vector mask argument to a function, and to return
a vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the remaining vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL gives the number of
vector registers in the vector register group and requires that the first
vector register number in the group be a multiple of it. For example, the LMUL
of `vint64m8_t` is 8, so the v8-v15 vector register group can be allocated to
this type, but v9-v16 cannot, because the v9 register number is not a multiple
of 8. If LMUL is less than 1, it is treated as 1. If the type is a vector mask
type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or the remaining vector mask arguments, starting
from the v8 register, if a vector register group between v8-v23 that has not
been allocated can be found and its first register number is a multiple of
LMUL, then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.
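
The allocation rules above can be sketched in a few lines of Python. This is only an illustration of the rules as quoted, not the GCC implementation: the function name, the `(name, kind, lmul, nfields)` encoding and the constants 8 and 23 (taken from the v8-v23 range) are assumptions made for the example.

```python
V_ARG_FIRST, V_ARG_LAST = 8, 23  # v8-v23, per the rules quoted above


def allocate_vector_args(args):
    """Map each argument name to a list of vector register numbers,
    or to None when the argument has to be passed by reference.

    args: iterable of (name, kind, lmul, nfields), where kind is
    "mask", "data" or "tuple" (a hypothetical encoding for this sketch).
    """
    free = set(range(V_ARG_FIRST, V_ARG_LAST + 1))
    v0_free = True
    placed = {}
    for name, kind, lmul, nfields in args:
        if kind == "mask" and v0_free:
            # Rule 1: the first mask argument goes to v0.
            v0_free = False
            placed[name] = [0]
            continue
        # Masks and fractional LMUL values are treated as LMUL = 1.
        group = 1 if kind == "mask" or lmul < 1 else int(lmul)
        count = group * (nfields if kind == "tuple" else 1)
        placed[name] = None  # pass by reference unless a slot is found
        # The search restarts at v8 for every argument.
        for start in range(V_ARG_FIRST, V_ARG_LAST + 1):
            regs = range(start, start + count)
            if start % group == 0 and all(r in free for r in regs):
                free.difference_update(regs)
                placed[name] = list(regs)
                break
    return placed


# void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)
foo = allocate_vector_args([
    ("a", "data", 1, 1),
    ("b", "data", 2, 1),
    ("c", "data", 1, 1),
])
print(foo)  # {'a': [8], 'b': [10, 11], 'c': [9]}
```

Because the search restarts at v8 each time, `c` lands in v9 even though `b` already occupies v10-v11, matching the example in the NOTE.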

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/389

gcc/ChangeLog:

* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
(riscv_init_cumulative_args): Setup variant_cc field.
(riscv_vector_type_p): New function for checking vector type.
(riscv_hard_regno_nregs): Hoist declare.
(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
(riscv_get_arg_info): Support vector cc.
(riscv_function_arg_advance): Update cum.
(riscv_pass_by_reference): Handle vector args.
(riscv_v_abi): New function to return the vector ABI.
(riscv_return_value_is_vector_type_p): New function to check vector returns.
(riscv_arguments_is_vector_type_p): New function to check vector arguments.
(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
(V_ARG_FIRST): Ditto.
(V_ARG_LAST): Ditto.
(enum riscv_cc): Define all RISCV_CC variants.
* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
 

[PATCH 0/3] RISC-V: Add an experimental vector calling convention

2023-07-20 Thread Lehua Ding
Hi RISC-V folks,

This patch series implements the proposed RISC-V vector calling convention[1];
the feature is enabled with the `--param=riscv-vector-abi` option. Currently,
all vector type arguments and return values are passed by reference. With this
series, they can be passed in vector registers. Only the vector types defined
in the RISC-V Vector Extension Intrinsic Document[2] are supported for now.
GNU-extension vector types are not supported yet because no corresponding
proposal has been presented.

The proposal introduces a new calling convention variant. Functions that follow
this variant must follow the vector register convention below.

| Name    | ABI Mnemonic | Meaning                | Preserved across calls?
===========================================================================
| v0      |              | Argument register      | No
| v1-v7   |              | Callee-saved registers | Yes
| v8-v23  |              | Argument registers     | No
| v24-v31 |              | Callee-saved registers | Yes
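
As a small illustration of the table, the following sketch classifies a vector register number as preserved or not and derives a bitmask of the callee-saved set. The helper name and the bitmask representation are assumptions for this example; they are not taken from the patches.

```python
def preserved_across_calls(regno):
    """Return True when vector register v<regno> is callee-saved
    under the convention in the table above."""
    if not 0 <= regno <= 31:
        raise ValueError(f"not a vector register number: {regno}")
    return 1 <= regno <= 7 or 24 <= regno <= 31


# Names of all callee-saved vector registers, in register order.
callee_saved = [f"v{n}" for n in range(32) if preserved_across_calls(n)]

# One bit per register, bit n set when v<n> is callee-saved.
callee_saved_mask = sum(1 << n for n in range(32) if preserved_across_calls(n))

print(callee_saved)
print(hex(callee_saved_mask))  # 0xff0000fe
```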

If a function follows this vector calling convention, then the function symbol
must be annotated with the .variant_cc directive[3] (used to indicate that it
uses a calling convention variant).

The implementation is split into three parts, each corresponding to a
sub-patch.

- Part-1: Select suitable vector registers for vector type arguments and return
  values according to the proposal.
- Part-2: Allocate frame area for callee-saved vector registers and save/restore
  them in prologue and epilogue.
- Part-3: Generate the .variant_cc directive for vector functions in assembly
  code.

Best,
Lehua

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/389
[2] https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#type-system
[3] https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md#pseudo-ops

Lehua Ding (3):
  RISC-V: Part-1: Select suitable vector registers for vector type args
and returns
  RISC-V: Part-2: Save/Restore vector registers which need to be
preversed
  RISC-V: Part-3: Output .variant_cc directive for vector function

 gcc/config/riscv/riscv-protos.h   |   4 +
 gcc/config/riscv/riscv-sr.cc  |  12 +-
 gcc/config/riscv/riscv-vector-builtins.cc |  10 +
 gcc/config/riscv/riscv.cc | 510 --
 gcc/config/riscv/riscv.h  |  40 ++
 gcc/config/riscv/riscv.md |  43 +-
 gcc/config/riscv/riscv.opt|   5 +
 .../riscv/rvv/base/abi-call-args-1-run.c  | 127 +
 .../riscv/rvv/base/abi-call-args-1.c  | 197 +++
 .../riscv/rvv/base/abi-call-args-2-run.c  |  34 ++
 .../riscv/rvv/base/abi-call-args-2.c  |  27 +
 .../riscv/rvv/base/abi-call-args-3-run.c  | 260 +
 .../riscv/rvv/base/abi-call-args-3.c  | 116 
 .../riscv/rvv/base/abi-call-args-4-run.c  | 145 +
 .../riscv/rvv/base/abi-call-args-4.c  | 111 
 .../riscv/rvv/base/abi-call-error-1.c |  11 +
 .../riscv/rvv/base/abi-call-return-run.c  | 127 +
 .../riscv/rvv/base/abi-call-return.c  | 197 +++
 .../riscv/rvv/base/abi-call-variant_cc.c  |  39 ++
 .../rvv/base/abi-callee-saved-1-fixed-1.c |  85 +++
 .../rvv/base/abi-callee-saved-1-fixed-2.c |  85 +++
 .../riscv/rvv/base/abi-callee-saved-1.c   |  87 +++
 .../riscv/rvv/base/abi-callee-saved-2.c   | 117 
 23 files changed, 2327 insertions(+), 62 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-1-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-2-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-3-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-4-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-error-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-return-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-return.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-variant_cc.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-2.c

-- 
2.36.3



[PATCH 3/3] RISC-V: Part-3: Output .variant_cc directive for vector function

2023-07-20 Thread Lehua Ding
Functions that follow the vector calling convention variant need to be
annotated with the .variant_cc directive, according to the RISC-V Assembly
Programmer's Manual[1] and the RISC-V ELF Specification[2].

[1] https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md#pseudo-ops
[2] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc#dynamic-linking
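
As a rough picture of the assembly this patch aims to produce for a function definition, here is a small Python model of the emission order used by riscv_declare_function_name in the diff below: the .variant_cc annotation first, then the type directive, then the label. The helper is hypothetical, and the exact spelling of the `.type` line is an assumption about what ASM_OUTPUT_TYPE_DIRECTIVE prints on ELF targets.

```python
def declare_function_name(name, uses_vector_cc):
    """Return the assembly text that introduces a function symbol.

    uses_vector_cc stands in for the RISCV_CC_V check in the patch.
    """
    lines = []
    if uses_vector_cc:
        # Annotate the symbol before it is declared and defined.
        lines.append(f"\t.variant_cc\t{name}")
    lines.append(f"\t.type\t{name}, @function")  # assumed ELF spelling
    lines.append(f"{name}:")
    return "\n".join(lines) + "\n"


print(declare_function_name("foo", uses_vector_cc=True), end="")
print(declare_function_name("bar", uses_vector_cc=False), end="")
```

Functions that use the standard calling convention get no annotation, so their output is unchanged.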

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_declare_function_name): Add protos.
(riscv_asm_output_alias): Ditto.
(riscv_asm_output_external): Ditto.
* config/riscv/riscv.cc (riscv_asm_output_variant_cc):  Output 
.variant_cc directive for vector function.
(riscv_declare_function_name): Ditto.
(riscv_asm_output_alias): Ditto.
(riscv_asm_output_external): Ditto.
* config/riscv/riscv.h (ASM_DECLARE_FUNCTION_NAME): Implement 
ASM_DECLARE_FUNCTION_NAME.
(ASM_OUTPUT_DEF_FROM_DECLS): Implement ASM_OUTPUT_DEF_FROM_DECLS.
(ASM_OUTPUT_EXTERNAL): Implement ASM_OUTPUT_EXTERNAL.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-call-variant_cc.c: New test.

---
 gcc/config/riscv/riscv-protos.h   |  3 ++
 gcc/config/riscv/riscv.cc | 48 +++
 gcc/config/riscv/riscv.h  | 15 ++
 .../riscv/rvv/base/abi-call-variant_cc.c  | 39 +++
 4 files changed, 105 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-variant_cc.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 52e15e1b5d6..eb62eb46f55 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -101,6 +101,9 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
 extern void riscv_split_doubleword_move (rtx, rtx);
 extern const char *riscv_output_move (rtx, rtx);
 extern const char *riscv_output_return ();
+extern void riscv_declare_function_name (FILE *, const char *, tree);
+extern void riscv_asm_output_alias (FILE *, const tree, const tree);
+extern void riscv_asm_output_external (FILE *, const tree, const char *);
 
 #ifdef RTX_CODE
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 1ca3ed42d40..c8879659f1f 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6740,6 +6740,54 @@ riscv_issue_rate (void)
   return tune_param->issue_rate;
 }
 
+/* Output .variant_cc for function symbol which follows vector calling
+   convention.  */
+
+static void
+riscv_asm_output_variant_cc (FILE *stream, const tree decl, const char *name)
+{
+  if (TREE_CODE (decl) == FUNCTION_DECL)
+{
+  riscv_cc cc = (riscv_cc) fndecl_abi (decl).id ();
+  if (cc == RISCV_CC_V)
+   {
+ fprintf (stream, "\t.variant_cc\t");
+ assemble_name (stream, name);
+ fprintf (stream, "\n");
+   }
+}
+}
+
+/* Implement ASM_DECLARE_FUNCTION_NAME.  */
+
+void
+riscv_declare_function_name (FILE *stream, const char *name, tree fndecl)
+{
+  riscv_asm_output_variant_cc (stream, fndecl, name);
+  ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
+  ASM_OUTPUT_LABEL (stream, name);
+}
+
+/* Implement ASM_OUTPUT_DEF_FROM_DECLS.  */
+
+void
+riscv_asm_output_alias (FILE *stream, const tree decl, const tree target)
+{
+  const char *name = XSTR (XEXP (DECL_RTL (decl), 0), 0);
+  const char *value = IDENTIFIER_POINTER (target);
+  riscv_asm_output_variant_cc (stream, decl, name);
+  ASM_OUTPUT_DEF (stream, name, value);
+}
+
+/* Implement ASM_OUTPUT_EXTERNAL.  */
+
+void
+riscv_asm_output_external (FILE *stream, tree decl, const char *name)
+{
+  default_elf_asm_output_external (stream, decl, name);
+  riscv_asm_output_variant_cc (stream, decl, name);
+}
+
 /* Auxiliary function to emit RISC-V ELF attribute. */
 static void
 riscv_emit_attribute ()
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index b24b240dd75..1820593bab5 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -1021,6 +1021,21 @@ while (0)
 
 #define ASM_COMMENT_START "#"
 
+/* Add output .variant_cc directive for specific function definition.  */
+#undef ASM_DECLARE_FUNCTION_NAME
+#define ASM_DECLARE_FUNCTION_NAME(STR, NAME, DECL) \
+  riscv_declare_function_name (STR, NAME, DECL)
+
+/* Add output .variant_cc directive for specific alias definition.  */
+#undef ASM_OUTPUT_DEF_FROM_DECLS
+#define ASM_OUTPUT_DEF_FROM_DECLS(STR, DECL, TARGET) \
+  riscv_asm_output_alias (STR, DECL, TARGET)
+
+/* Add output .variant_cc directive for specific extern function.  */
+#undef ASM_OUTPUT_EXTERNAL
+#define ASM_OUTPUT_EXTERNAL(STR, DECL, NAME) \
+  riscv_asm_output_external (STR, DECL, NAME)
+
 #undef SIZE_TYPE
 #define SIZE_TYPE (POINTER_SIZE == 64 ? "long unsigned int" : "unsigned int")
 
diff --git a/gcc/testsuite/gcc.target/riscv/r

[PATCH 2/3] RISC-V: Part-2: Save/Restore vector registers which need to be preversed

2023-07-20 Thread Lehua Ding
Functions that follow the vector calling convention variant have callee-saved
vector registers, while functions that follow the standard calling convention
do not. We therefore need to know which convention a callee follows so that we
can tell GCC exactly which vector registers the callee will clobber. To do so,
I encode the callee's calling convention into the call rtx patterns, as AArch64
does. According to my analysis, the old operands 2 and 3 of the call patterns,
which were copied from the MIPS target, are unused, so they are removed.

gcc/ChangeLog:

* config/riscv/riscv-sr.cc (riscv_remove_unneeded_save_restore_calls): Pass riscv_cc.
* config/riscv/riscv.cc (struct riscv_frame_info): Add new fields.
(riscv_frame_info::reset): Reset new fields.
(riscv_call_tls_get_addr): Pass riscv_cc.
(riscv_function_arg): Return riscv_cc for call pattern.
(riscv_insn_callee_abi): Implement TARGET_INSN_CALLEE_ABI.
(riscv_save_reg_p): Add vector callee-saved check.
(riscv_save_libcall_count): Add vector save area.
(riscv_compute_frame_info): Ditto.
(riscv_restore_reg): Update for type change.
(riscv_for_each_saved_v_reg): New function save vector registers.
(riscv_first_stack_step): Handle function with vector callee-saved registers.
(riscv_expand_prologue): Ditto.
(riscv_expand_epilogue): Ditto.
(riscv_output_mi_thunk): Pass riscv_cc.
(TARGET_INSN_CALLEE_ABI): Implement TARGET_INSN_CALLEE_ABI.
* config/riscv/riscv.md: Add CALLEE_CC operand for call pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-1.c: New test.
* gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-2.c: New test.
* gcc.target/riscv/rvv/base/abi-callee-saved-1.c: New test.
* gcc.target/riscv/rvv/base/abi-callee-saved-2.c: New test.

---
 gcc/config/riscv/riscv-sr.cc  |  12 +-
 gcc/config/riscv/riscv.cc | 222 +++---
 gcc/config/riscv/riscv.md |  43 +++-
 .../rvv/base/abi-callee-saved-1-fixed-1.c |  85 +++
 .../rvv/base/abi-callee-saved-1-fixed-2.c |  85 +++
 .../riscv/rvv/base/abi-callee-saved-1.c   |  87 +++
 .../riscv/rvv/base/abi-callee-saved-2.c   | 117 +
 7 files changed, 606 insertions(+), 45 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-2.c

diff --git a/gcc/config/riscv/riscv-sr.cc b/gcc/config/riscv/riscv-sr.cc
index 7248f04d68f..e6e17685df5 100644
--- a/gcc/config/riscv/riscv-sr.cc
+++ b/gcc/config/riscv/riscv-sr.cc
@@ -447,12 +447,18 @@ riscv_remove_unneeded_save_restore_calls (void)
   && !SIBCALL_REG_P (REGNO (target)))
 return;
 
+  /* Extract RISCV CC from the UNSPEC rtx.  */
+  rtx unspec = XVECEXP (callpat, 0, 1);
+  gcc_assert (GET_CODE (unspec) == UNSPEC
+ && XINT (unspec, 1) == UNSPEC_CALLEE_CC);
+  riscv_cc cc = (riscv_cc) INTVAL (XVECEXP (unspec, 0, 0));
   rtx sibcall = NULL;
   if (set_target != NULL)
-sibcall
-  = gen_sibcall_value_internal (set_target, target, const0_rtx);
+sibcall = gen_sibcall_value_internal (set_target, target, const0_rtx,
+ gen_int_mode (cc, SImode));
   else
-sibcall = gen_sibcall_internal (target, const0_rtx);
+sibcall
+  = gen_sibcall_internal (target, const0_rtx, gen_int_mode (cc, SImode));
 
   rtx_insn *before_call = PREV_INSN (call);
   remove_insn (call);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 73e9f2001e6..1ca3ed42d40 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -101,6 +101,9 @@ struct GTY(())  riscv_frame_info {
   /* Likewise FPR X.  */
   unsigned int fmask;
 
+  /* Likewise for vector registers.  */
+  unsigned int vmask;
+
   /* How much the GPR save/restore routines adjust sp (or 0 if unused).  */
   unsigned save_libcall_adjustment;
 
@@ -108,6 +111,10 @@ struct GTY(())  riscv_frame_info {
   poly_int64 gp_sp_offset;
   poly_int64 fp_sp_offset;
 
+  /* Top and bottom offsets of vector save areas from frame bottom.  */
+  poly_int64 v_sp_offset_top;
+  poly_int64 v_sp_offset_bottom;
+
   /* Offset of virtual frame pointer from stack pointer/frame bottom */
   poly_int64 frame_pointer_offset;
 
@@ -243,7 +250,7 @@ unsigned riscv_stack_boundary;
 /* If non-zero, this is an offset to be added to SP to redefine the CFA
when restoring the FP register from the stack.  Only valid when generating
the epilogue.  */
-static int epilogue_cfa_sp_offset;
+static poly_int64 epilogue_cfa_sp_offset;
 
 /* Which tuning parameters to use.  */
 static const struct riscv_tune_param *tune_param;

Re: [PATCH] cleanup: Change condition order

2023-07-21 Thread Lehua Ding
Committed, thanks Richard.


Bootstrap and regression passed.




-- Original --
From: "Richard Biener"
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625067.html
> 
> The LEN MASK variants should be checked before the MASK variants,
> so this patch reorders all such conditions to check LEN MASK first.
> 
> This is the last cleanup patch.
> 
> Bootstrap and regression testing are in progress.

OK.

> gcc/ChangeLog:
> 
>	* tree-vect-stmts.cc (check_load_store_for_partial_vectors): Change condition order.
>	(vectorizable_operation): Ditto.
> 
> ---
>  gcc/tree-vect-stmts.cc | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index d5b4f020332..2fe856db9ab 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -1635,17 +1635,17 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype,
>        internal_fn len_ifn = (is_load
>                               ? IFN_MASK_LEN_GATHER_LOAD
>                               : IFN_MASK_LEN_SCATTER_STORE);
> -      if (internal_gather_scatter_fn_supported_p (ifn, vectype,
> +      if (internal_gather_scatter_fn_supported_p (len_ifn, vectype,
>                                                    gs_info->memory_type,
>                                                    gs_info->offset_vectype,
>                                                    gs_info->scale))
> -        vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype,
> -                               scalar_mask);
> -      else if (internal_gather_scatter_fn_supported_p (len_ifn, vectype,
> +        vect_record_loop_len (loop_vinfo, lens, nvectors, vectype, 1);
> +      else if (internal_gather_scatter_fn_supported_p (ifn, vectype,
>                                                         gs_info->memory_type,
>                                                         gs_info->offset_vectype,
>                                                         gs_info->scale))
> -        vect_record_loop_len (loop_vinfo, lens, nvectors, vectype, 1);
> +        vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype,
> +                               scalar_mask);
>        else
>          {
>            if (dump_enabled_p ())
> @@ -6596,16 +6596,16 @@ vectorizable_operation (vec_info *vinfo,
>            && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
>            && mask_out_inactive)
>          {
> -          if (cond_fn != IFN_LAST
> -              && direct_internal_fn_supported_p (cond_fn, vectype,
> +          if (cond_len_fn != IFN_LAST
> +              && direct_internal_fn_supported_p (cond_len_fn, vectype,
>                                                   OPTIMIZE_FOR_SPEED))
> -            vect_record_loop_mask (loop_vinfo, masks, ncopies * vec_num,
> -                                   vectype, NULL);
> -          else if (cond_len_fn != IFN_LAST
> -                   && direct_internal_fn_supported_p (cond_len_fn, vectype,
> -                                                      OPTIMIZE_FOR_SPEED))
>              vect_record_loop_len (loop_vinfo, lens, ncopies * vec_num, vectype,
>                                    1);
> +          else if (cond_fn != IFN_LAST
> +                   && direct_internal_fn_supported_p (cond_fn, vectype,
> +                                                      OPTIMIZE_FOR_SPEED))
> +            vect_record_loop_mask (loop_vinfo, masks, ncopies * vec_num,
> +                                   vectype, NULL);
>            else
>              {
>                if (dump_enabled_p ())
> 

-- 
Richard Biener 

Re: [PATCH] cleanup: make all cond_len_* and mask_len_* consistent on the order of mask and len

2023-07-21 Thread Lehua Ding
Committed, thanks Richard.


Bootstrap and regression passed.





-- Original --
From: "Richard Biener"
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625121.html
> 
> Hi, Richard and Richi.
> 
> This patch is to align the order of mask and len.
> 
> Currently, according to this piece of code:
> 
>   if (final_len && final_mask)
>     call = gimple_build_call_internal (
>              IFN_LEN_MASK_GATHER_LOAD, 7, dataref_ptr,
>              vec_offset, scale, zero, final_mask, final_len,
>              bias);
> 
> the order of mask and len is {mask, len, bias}: "mask" comes before "len".
> The reason for this order is that we want to reuse the current code for
> MASK_GATHER_LOAD/MASK_SCATTER_STORE.
> 
> The same holds for COND_LEN_*: we want to reuse the code for COND_*.
> 
> Reusing the existing MASK_* or COND_* code lets us avoid changing too much
> and keeps the code elegant and easy to read and maintain.
> 
> To avoid any confusion in auto-vectorization patterns that include both
> mask and len, this patch aligns the order of mask and len to
> {mask, len, bias} in both the GIMPLE IR and the RTL patterns.
> 
> Bootstrap and regression testing are in progress.

OK.

> gcc/ChangeLog:
> 
>* config/riscv/autovec.md: Align order of mask and len.
>* config/riscv/riscv-v.cc (expand_load_store): Ditto.
>(expand_gather_scatter): Ditto.
>* doc/md.texi: Ditto.
>* internal-fn.cc (add_len_and_mask_args): Ditto.
>(add_mask_and_len_args): Ditto.
>(expand_partial_load_optab_fn): Ditto.
>(expand_partial_store_optab_fn): Ditto.
>(expand_scatter_store_optab_fn): Ditto.
>(expand_gather_load_optab_fn): Ditto.
>(internal_fn_len_index): Ditto.
>(internal_fn_mask_index): Ditto.
>(internal_len_load_store_bias): Ditto.
>* tree-vect-stmts.cc (vectorizable_store): Ditto.
>(vectorizable_load): Ditto.
> 
> ---
>  gcc/config/riscv/autovec.md | 96 ++---
>  gcc/config/riscv/riscv-v.cc | 12 ++---
>  gcc/doc/md.texi             | 36 +++---
>  gcc/internal-fn.cc          | 50 +--
>  gcc/tree-vect-stmts.cc      |  8 ++--
>  5 files changed, 101 insertions(+), 101 deletions(-)
> 
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 7eb96d42c18..d899922586a 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -25,9 +25,9 @@
>  (define_expand "mask_len_load

Re: [PATCH] mklog: Add --append option to auto add generate ChangeLog to patch file

2023-07-21 Thread Lehua Ding
Hi Martin,


> this patch caused flake8 to complain about contrib/mklog.py:
> 
> $ flake8 contrib/mklog.py
> contrib/mklog.py:377:80: E501 line too long (85 > 79 characters)
> contrib/mklog.py:388:26: E127 continuation line over-indented for visual indent
> contrib/mklog.py:388:36: W605 invalid escape sequence '\s'
> contrib/mklog.py:388:40: W605 invalid escape sequence '\s'
> contrib/mklog.py:388:44: W605 invalid escape sequence '\s'
> contrib/mklog.py:388:47: W605 invalid escape sequence '\|'
> contrib/mklog.py:388:49: W605 invalid escape sequence '\s'
> contrib/mklog.py:388:51: W605 invalid escape sequence '\d'
> contrib/mklog.py:388:54: W605 invalid escape sequence '\s'
> contrib/mklog.py:388:58: W605 invalid escape sequence '\-'
> 
> Can you please have a look and ideally fix the issues?


Thank you for pointing this out.
I will fix these format errors in a follow-up patch[1].
I tried to fix the following error but couldn't find a way.
Do you know how to fix it?



contrib/mklog.py:388:26: E127 continuation line over-indented for visual indent


Best,
Lehua


[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624880.html
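
For reference, one common way to address both kinds of report is sketched below; it is not necessarily the fix that was eventually applied. The names `DIFFSTAT_RE` and `is_diffstat_start` are hypothetical. A raw string avoids the W605 invalid-escape warnings, and wrapping the condition in parentheses, with the continuation aligned to the first operand, should avoid E127.

```python
import re

# Raw string: backslashes reach the regex engine untouched (avoids W605).
DIFFSTAT_RE = re.compile(r"\s[^\s]+\s+\|\s+\d+\s[+-]+\n")


def is_diffstat_start(state, line):
    """Return True when state is 2 and line looks like a diffstat entry."""
    # Parentheses replace the backslash continuation; the second operand
    # lines up with the first one inside the bracket.
    return (state == 2
            and DIFFSTAT_RE.match(line) is not None)


print(is_diffstat_start(2, " contrib/mklog.py | 3 ++-\n"))
```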

Re: [PATCH] mklog: Add --append option to auto add generate ChangeLog to patch file

2023-07-21 Thread Lehua Ding
> I am no python expert but the following seems to work:


Thank you so much, it works for me.


Lehua

Re: [PATCH] mklog: Add --append option to auto add generate ChangeLog to patch file

2023-07-21 Thread Lehua Ding
Hi Martin,


By the way, is there a standard format required for these Python files?
I see that other Python files show similar format errors when checked
with flake8. If there is one, it seems worth configuring a hook on the
git server to run this check.


Best,
Lehua

Re: [PATCH] mklog: Add --append option to auto add generate ChangeLog to patch file

2023-07-21 Thread Lehua Ding
Hi Martin,


Thank you for telling me about the Python code format specification.
I have no idea how to add checks for pushed commits.
In any case, I will first make sure I don't introduce new format errors myself.


Best,
Lehua

[PATCH V5] VECT: Support floating-point in-order reduction for length loop control

2023-07-22 Thread Lehua Ding
From: Ju-Zhe Zhong 

PS: Submitted on behalf of Juzhe Zhong

Hi, Richard and Richi.

This patch supports floating-point in-order reduction for length loop control.

Consider the following case:

float foo (float *__restrict a, int n)
{
  float result = 1.0;
  for (int i = 0; i < n; i++)
   result += a[i];
  return result;
}

When compiled **without** -ffast-math on ARM SVE, we end up with:

loop_mask = WHILE_ULT
result = MASK_FOLD_LEFT_PLUS (...loop_mask...)

For RVV, we don't use length loop control instead of mask:

So, with this patch, we expect to see:

loop_len = SELECT_VL
result = MASK_LEN_FOLD_LEFT_PLUS (...loop_len...)
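
To make the intended semantics concrete, here is a small Python model of the length-controlled loop. It is only a sketch: `min (vf, remaining)` stands in for SELECT_VL, the all-ones mask mirrors what the patch builds for the length-only case, and Python floats are doubles rather than C `float`. The point is that elements are still accumulated strictly left to right, so the result equals that of the scalar loop.

```python
def mask_len_fold_left_plus(acc, vec, mask, length, bias=0):
    """Add the first (length + bias) active elements of vec to acc, in order."""
    for i in range(length + bias):
        if mask[i]:
            acc += vec[i]
    return acc


def foo(a, vf=4):
    """Model of the vectorized loop for the scalar `foo` above."""
    result = 1.0
    i = 0
    while i < len(a):
        loop_len = min(vf, len(a) - i)      # stand-in for SELECT_VL
        chunk = a[i:i + vf]
        chunk += [0.0] * (vf - len(chunk))  # lanes beyond loop_len are ignored
        all_ones = [True] * vf              # length-only control: mask is all ones
        result = mask_len_fold_left_plus(result, chunk, all_ones, loop_len)
        i += loop_len
    return result


def foo_scalar(a):
    """The original scalar loop, for comparison."""
    result = 1.0
    for x in a:
        result += x
    return result


data = [1e16, 1.0, -1e16, 1.0, 3.5, 2.25, -7.0]
print(foo(data) == foo_scalar(data))
```

Any vectorization factor gives the same answer here, which is the property that lets the reduction be vectorized without -ffast-math.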

gcc/ChangeLog:

* tree-vect-loop.cc (get_masked_reduction_fn): Add mask_len_fold_left_plus.
(vectorize_fold_left_reduction): Ditto.
(vectorizable_reduction): Ditto.
(vect_transform_reduction): Ditto.

---
 gcc/tree-vect-loop.cc | 41 -
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index d036a7d4480..dba509b6f37 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -6800,11 +6800,13 @@ static internal_fn
 get_masked_reduction_fn (internal_fn reduc_fn, tree vectype_in)
 {
   internal_fn mask_reduc_fn;
+  internal_fn mask_len_reduc_fn;
 
   switch (reduc_fn)
 {
 case IFN_FOLD_LEFT_PLUS:
   mask_reduc_fn = IFN_MASK_FOLD_LEFT_PLUS;
+  mask_len_reduc_fn = IFN_MASK_LEN_FOLD_LEFT_PLUS;
   break;
 
 default:
@@ -6814,6 +6816,9 @@ get_masked_reduction_fn (internal_fn reduc_fn, tree 
vectype_in)
   if (direct_internal_fn_supported_p (mask_reduc_fn, vectype_in,
  OPTIMIZE_FOR_SPEED))
 return mask_reduc_fn;
+  if (direct_internal_fn_supported_p (mask_len_reduc_fn, vectype_in,
+ OPTIMIZE_FOR_SPEED))
+return mask_len_reduc_fn;
   return IFN_LAST;
 }
 
@@ -6834,7 +6839,8 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
   gimple *reduc_def_stmt,
   tree_code code, internal_fn reduc_fn,
   tree ops[3], tree vectype_in,
-  int reduc_index, vec_loop_masks *masks)
+  int reduc_index, vec_loop_masks *masks,
+  vec_loop_lens *lens)
 {
   class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   tree vectype_out = STMT_VINFO_VECTYPE (stmt_info);
@@ -6896,8 +6902,18 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
 {
   gimple *new_stmt;
   tree mask = NULL_TREE;
+  tree len = NULL_TREE;
+  tree bias = NULL_TREE;
   if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
mask = vect_get_loop_mask (loop_vinfo, gsi, masks, vec_num, vectype_in, 
i);
+  if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
+   {
+ len = vect_get_loop_len (loop_vinfo, gsi, lens, vec_num, vectype_in,
+  i, 1);
+ signed char biasval = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo);
+ bias = build_int_cst (intQI_type_node, biasval);
+ mask = build_minus_one_cst (truth_type_for (vectype_in));
+   }
 
   /* Handle MINUS by adding the negative.  */
   if (reduc_fn != IFN_LAST && code == MINUS_EXPR)
@@ -6917,7 +6933,10 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
 the preceding operation.  */
   if (reduc_fn != IFN_LAST || (mask && mask_reduc_fn != IFN_LAST))
{
- if (mask && mask_reduc_fn != IFN_LAST)
+ if (mask && len && mask_reduc_fn == IFN_MASK_LEN_FOLD_LEFT_PLUS)
+   new_stmt = gimple_build_call_internal (mask_reduc_fn, 5, reduc_var,
+  def0, mask, len, bias);
+ else if (mask && mask_reduc_fn == IFN_MASK_FOLD_LEFT_PLUS)
new_stmt = gimple_build_call_internal (mask_reduc_fn, 3, reduc_var,
   def0, mask);
  else
@@ -7979,6 +7998,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   else if (loop_vinfo && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
 {
   vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
+  vec_loop_lens *lens = &LOOP_VINFO_LENS (loop_vinfo);
   internal_fn cond_fn = get_conditional_internal_fn (op.code, op.type);
 
   if (reduction_type != FOLD_LEFT_REDUCTION
@@ -8006,8 +8026,17 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
  LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
}
   else
-   vect_record_loop_mask (loop_vinfo, masks, ncopies * vec_num,
-  vectype_in, NULL);
+   {
+ internal_fn mask_reduc_fn
+   = get_masked_reduction_fn (reduc_fn, vectype_in);
+
+ if (mask_reduc_fn == IFN_MASK_LEN_FOLD_LEFT_PLUS)
+   vect_record_loop_len (loop_vinfo, lens, ncopies * vec_num,

Re: Re: [PATCH V2] VECT: Support floating-point in-order reduction for length loop control

2023-07-22 Thread Lehua Ding
Hi Richard,


Bootstrap and regression tests passed on x86, and
no new test cases fail on AArch64 with the V5 patch:


https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625293.html


Is the V5 patch OK for trunk?


Best,
Lehua

Re:[PATCH V5] VECT: Support floating-point in-order reduction for length loop control

2023-07-24 Thread Lehua Ding
Committed V5 to the trunk, thanks Richard.


Best,
Lehua

Re: [PATCH V2] RISC-V: Support in-order floating-point reduction

2023-07-24 Thread Lehua Ding
Committed to the trunk, thanks Kito and Robin.

Best,
Lehua

Re: [PATCH] mklog: fix bugs of --append option

2023-07-25 Thread Lehua Ding
Hi,

Gentle Ping.

I've sent a V2 patch, included below, which additionally fixes the Python code
format errors that Martin reported. Thanks.

Best,
Lehua

contrib/ChangeLog:

* mklog.py: Fix bugs.
---
 contrib/mklog.py | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 26230b9b4f2..0abefcd9374 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -374,7 +374,8 @@ if __name__ == '__main__':
 args.fill_up_bug_titles, args.pr_numbers)
 if args.append:
 if (not args.input):
-raise Exception("`-a or --append` option not support standard 
input")
+raise Exception("`-a or --append` option not support standard "
+"input")
 lines = []
 with open(args.input, 'r', newline='\n') as f:
 # 1 -> not find the possible start of diff log
@@ -384,13 +385,14 @@ if __name__ == '__main__':
 for line in f:
 if maybe_diff_log == 1 and line == "---\n":
 maybe_diff_log = 2
-elif maybe_diff_log == 2 and \
- re.match("\s[^\s]+\s+\|\s\d+\s[+\-]+\n", line):
+elif (maybe_diff_log == 2 and
+  re.match(r"\s[^\s]+\s+\|\s+\d+\s[+\-]+\n", line)):
 lines += [output, "---\n", line]
 maybe_diff_log = 3
 else:
 # the possible start is not the true start.
 if maybe_diff_log == 2:
+lines.append("---\n")
 maybe_diff_log = 1
 lines.append(line)
 with open(args.input, "w") as f:
-- 
2.36.1
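
The regex change in this patch (`\s\d+` becoming `\s+\d+`) can be checked in isolation. The sketch below is only an illustration: the helper name `is_diffstat_line` and the two sample lines are made up for the example, and both patterns are written as raw strings here.

```python
import re

# Pattern before the patch: exactly one whitespace character after '|'.
OLD_PATTERN = r"\s[^\s]+\s+\|\s\d+\s[+\-]+\n"
# Pattern after the patch: one or more whitespace characters after '|'.
NEW_PATTERN = r"\s[^\s]+\s+\|\s+\d+\s[+\-]+\n"


def is_diffstat_line(line, pattern=NEW_PATTERN):
    """Return True if `line` looks like a ' file | N +++---' diffstat line."""
    return re.match(pattern, line) is not None


# Hypothetical sample lines in the style of `git format-patch` output.
narrow = " contrib/mklog.py | 8 +---\n"
padded = " gcc/config/riscv/riscv.h  |  40 ++\n"
```

With multi-file patches the counts are right-aligned, so padding appears after the `|`. The old pattern accepts only `narrow`; the new one accepts both sample lines.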



Re: PING^2 [PATCH] mklog: fix bugs of --append option

2023-08-03 Thread Lehua Ding
Gentle PING^2, thanks!

Re: [PATCH 2/3] RISC-V: Part-2: Save/Restore vector registers which need to be preversed

2023-08-07 Thread Lehua Ding
Hi Kito,

> > +machine_mode m1_mode = TARGET_VECTOR_ELEN_64
> > +? (TARGET_MIN_VLEN >= 128 ? VNx2DImode : VNx1DImode) 
> > +: VNx1SImode;

> This should be updated since Juzhe has updated the mode system :P

Yes, thanks for the reminder.

> > @@ -5907,7 +6057,7 @@ riscv_expand_epilogue (int style) 
> > Start off by assuming that no registers need to be restored.*/ 
> >struct riscv_frame_info *frame = &cfun->machine->frame; 
> >unsigned mask = frame->mask; 
> > -HOST_WIDE_INT step2 = 0; 
> > +poly_int64 step2 = 0; 

> I saw we check `step2.to_constant () > 0` later, does it mean step2 is
> always a scalar rather than a poly number?
> If so, I would suggest keeping HOST_WIDE_INT if possible.

step2 is reduced by `riscv_for_each_saved_v_reg (step2, riscv_restore_reg, false);`
before the `step2.to_constant () > 0` check. After `riscv_for_each_saved_v_reg`,
step2 must be a constant. Before that call, however, step2 may be a poly number
if any length-agnostic vector registers need to be saved.
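
A toy model may make the point clearer. This is a loose sketch, not GCC's `poly_int64`: the class `Poly` and the byte counts are hypothetical, and the value is modelled as `c0 + c1 * x` where `x` is unknown at compile time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Poly:
    """Hypothetical stand-in for a poly number: c0 + c1 * x, x unknown."""
    c0: int
    c1: int = 0

    def __sub__(self, other):
        return Poly(self.c0 - other.c0, self.c1 - other.c1)

    def is_constant(self):
        return self.c1 == 0

    def to_constant(self):
        # Only valid once the length-dependent part has been removed.
        assert self.is_constant(), "value still depends on the vector length"
        return self.c0


# Made-up frame: 32 constant bytes plus a vector save area of 16 * x bytes.
step2 = Poly(32, 16)
vector_save_area = Poly(0, 16)

# Restoring the vector registers removes the length-dependent part.
after_vector_restore = step2 - vector_save_area
```

In this model `step2.is_constant ()` is false before the vector registers are restored and true afterwards, which is why the variable needs the poly type even though the later check uses a constant.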

Best,
Lehua

Re: [PATCH] tree-optimization/110897 - Fix missed vectorization of shift on both RISC-V and aarch64

2023-08-07 Thread Lehua Ding
Committed to trunk, thanks Richard and Juzhe.

Re: [PATCH] RISC-V: Support VLS shift vectorization

2023-08-07 Thread Lehua Ding
Committed to the trunk, thanks Kito and Juzhe.

Re: [PATCH] RISC-V: Support neg VLS auto-vectorization

2023-08-07 Thread Lehua Ding
Committed to the trunk, thanks Kito and Juzhe.

[PATCH] RISC-V: Fix error combine of pred_mov pattern

2023-08-08 Thread Lehua Ding
Hi,

This patch fixes PR110943, in which wrong code is generated because the combine
pass incorrectly combines some pred_mov patterns. Consider this code:

```
#include <riscv_vector.h>

void foo9 (void *base, void *out, size_t vl)
{
int64_t scalar = *(int64_t*)(base + 100);
vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
*(vint64m2_t*)out = v;
}
```

RTL before combine pass:

```
(insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
(if_then_else:RVVM2DI (unspec:RVVMF32BI [
(const_vector:RVVMF32BI repeat [
(const_int 1 [0x1])
])
(const_int 1 [0x1])
(const_int 2 [0x2]) repeated x2
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(const_vector:RVVM2DI repeat [
(const_int 0 [0])
])
(unspec:RVVM2DI [
(reg:SI 0 zero)
] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di})
(insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t 
*)out_4(D)]+0 S[32, 32] A128])
(reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 
{*movrvvm2di_whole})
```

RTL after combine pass:
```
(insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 
S[32, 32] A128])
(if_then_else:RVVM2DI (unspec:RVVMF32BI [
(const_vector:RVVMF32BI repeat [
(const_int 1 [0x1])
])
(const_int 1 [0x1])
(const_int 2 [0x2]) repeated x2
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(const_vector:RVVM2DI repeat [
(const_int 0 [0])
])
(unspec:RVVM2DI [
(reg:SI 0 zero)
] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di})
```

This combine changes the semantics of insn 14. I refined the condition of the
@pred_mov pattern to be more restrictive. Is it OK for trunk?
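
The effect of the refined condition can be modelled in a few lines of Python. This is a simplified sketch: operands are reduced to the strings "reg", "mem" and "const", the TARGET_VECTOR check is omitted, and the boolean arguments are assumed stand-ins for `CONST_VECTOR_P (operands[1])` and `satisfies_constraint_Wc1 (operands[1])`.

```python
def old_condition(dest, src, mask_is_const_vector):
    """Model of the old insn condition: any memory operand is enough."""
    return dest == "mem" or src == "mem" or mask_is_const_vector


def new_condition(dest, src, mask_is_wc1):
    """Model of the refined condition: only load, store, or reg-dest moves."""
    return ((dest == "reg" and src == "mem")
            or (dest == "mem" and src == "reg")
            or (dest == "reg" and mask_is_wc1))


# The combined insn 14 above: memory destination, constant-vector source,
# constant mask operand.
combined = ("mem", "const", True)
# The original insn 11: register destination, same source and mask.
original = ("reg", "const", True)
```

Under this model the old condition accepts the combined insn, while the new one rejects it and still accepts the original register-destination insn.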

Best,
Lehua


PR target/110943

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc 
(function_expander::function_expander):
  force_reg mem operand.
* config/riscv/vector.md: Refine condition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Update.
* gcc.target/riscv/rvv/base/pr110943.c: New test.

---
 gcc/config/riscv/riscv-vector-builtins.cc |  8 -
 gcc/config/riscv/vector.md|  5 +--
 .../gcc.target/riscv/rvv/base/pr110943.c  | 33 +++
 .../riscv/rvv/base/zvfhmin-intrinsic.c| 10 +++---
 4 files changed, 48 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 528dca7ae85..cd40fb2060f 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3471,7 +3471,13 @@ function_expander::function_expander (const 
function_instance &instance,
 exp (exp_in), target (target_in), opno (0)
 {
   if (!function_returns_void_p ())
-create_output_operand (&m_ops[opno++], target, TYPE_MODE (TREE_TYPE 
(exp)));
+{
+  if (target != NULL_RTX && MEM_P (target))
+   /* Use force_reg to prevent illegal mem-to-mem pattern on -O0.  */
+   target = force_reg (GET_MODE (target), target);
+  create_output_operand (&m_ops[opno++], target,
+TYPE_MODE (TREE_TYPE (exp)));
+}
 }
 
 /* Take argument ARGNO from EXP's argument list and convert it into
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index e56a2bf4bed..f0484b1162c 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1509,8 +1509,9 @@
  (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
   (match_operand:V_VLS 3 "vector_move_operand"   "m, m, m,
vr,vr,vr, viWc0, viWc0")
   (match_operand:V_VLS 2 "vector_merge_operand"  "0,vu,vu,
vu,vu, 0,vu, 0")))]
-  "TARGET_VECTOR && (MEM_P (operands[0]) || MEM_P (operands[3])
-   || CONST_VECTOR_P (operands[1]))"
+  "TARGET_VECTOR && ((register_operand (operands[0], mode) && MEM_P 
(operands[3])) ||
+ (MEM_P (operands[0]) && register_operand (operands[3], 
mode)) ||
+ (register_operand (operands[0], mode) && 
satisfies_constraint_Wc1 (operands[1])))"
   "@
vle.v\t%0,%3%p1
vle.v\t%0,%3
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c
new file mode 100644
index 000..8a6c00fc94d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c
@@ -0,0 +1,33 @@

Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern

2023-08-08 Thread Lehua Ding
Hi Jeff,


> The pattern's operand 0 explicitly allows MEMs as do the constraints.
> So forcing the operand into a register just seems like it's papering
> over the real problem.


The force_reg code was added to address a problem that appears once the
erroneous combine is fixed. The more restrictive pattern condition forbids the
mem->mem pattern, which is produced at -O0. I think the implementation forgot
to do this force_reg operation during intrinsic expansion. The problem was not
exposed before because the reload pass converts mem->mem into mem->reg; reg->mem
based on the constraints.


> I wonder if we should just remove the memory destination from this
> pattern.  Ultimately isn't that case just trying to optimize a constant
> store into memory -- perhaps we just need a distinct pattern for that.
> We generally try to avoid that for movXX patterns, but this seems a bit
> different.


This pattern, like the scalar mov pattern, needs to block the mem->mem case.
I think defining the mem->reg, reg->mem and reg->reg patterns in the
same insn is more readable. How do you feel about that?
And there's another `*mov

[PATCH V2 0/3] RISC-V: Add an experimental vector calling convention

2023-08-10 Thread Lehua Ding
Hi RISC-V folks,

This patch implements the proposed RISC-V vector calling convention[1]. The
feature can be enabled with the `--param=riscv-vector-abi` option. Currently,
all vector type arguments and return values are passed by reference. With this
patch, these arguments and return values can be passed in vector registers.
Only the vector types defined in the RISC-V Vector Extension Intrinsic
Document[2] are supported for now. GNU-ext vector types are not yet supported
because no corresponding proposal has been presented.

The proposal introduces a new calling convention variant. Functions that follow
this variant need to follow the vector register convention below.

| Name    | ABI Mnemonic | Meaning                | Preserved across calls? |
=============================================================================
| v0      |              | Argument register      | No                      |
| v1-v7   |              | Callee-saved registers | Yes                     |
| v8-v23  |              | Argument registers     | No                      |
| v24-v31 |              | Callee-saved registers | Yes                     |

If a function follows this vector calling convention, then the function symbol
must be annotated with the .variant_cc directive[3] (used to indicate that it is
a calling convention variant).

This implementation is split into three parts, each corresponding to a
sub-patch.

- Part-1: Select suitable vector registers for vector type arguments and return
  values according to the proposal.
- Part-2: Allocate frame area for callee-saved vector registers and save/restore
  them in prologue and epilogue.
- Part-3: Generate .variant_cc directive for vector function in assembly code.

Best,
Lehua

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/389
[2] https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#type-system
[3] https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md#pseudo-ops

Lehua Ding (3):
  RISC-V: Part-1: Select suitable vector registers for vector type args
and returns
  RISC-V: Part-2: Save/Restore vector registers which need to be
preversed
  RISC-V: Part-3: Output .variant_cc directive for vector function

 gcc/config/riscv/riscv-protos.h   |   4 +
 gcc/config/riscv/riscv-sr.cc  |  12 +-
 gcc/config/riscv/riscv-vector-builtins.cc |  10 +
 gcc/config/riscv/riscv.cc | 505 --
 gcc/config/riscv/riscv.h  |  40 ++
 gcc/config/riscv/riscv.md |  43 +-
 gcc/config/riscv/riscv.opt|   5 +
 .../riscv/rvv/base/abi-call-args-1-run.c  | 127 +
 .../riscv/rvv/base/abi-call-args-1.c  | 197 +++
 .../riscv/rvv/base/abi-call-args-2-run.c  |  34 ++
 .../riscv/rvv/base/abi-call-args-2.c  |  27 +
 .../riscv/rvv/base/abi-call-args-3-run.c  | 260 +
 .../riscv/rvv/base/abi-call-args-3.c  | 116 
 .../riscv/rvv/base/abi-call-args-4-run.c  | 145 +
 .../riscv/rvv/base/abi-call-args-4.c  | 111 
 .../riscv/rvv/base/abi-call-error-1.c |  11 +
 .../riscv/rvv/base/abi-call-return-run.c  | 127 +
 .../riscv/rvv/base/abi-call-return.c  | 197 +++
 .../riscv/rvv/base/abi-call-variant_cc.c  |  39 ++
 .../rvv/base/abi-callee-saved-1-fixed-1.c |  85 +++
 .../rvv/base/abi-callee-saved-1-fixed-2.c |  85 +++
 .../riscv/rvv/base/abi-callee-saved-1.c   |  87 +++
 .../riscv/rvv/base/abi-callee-saved-2.c   | 117 
 23 files changed, 2322 insertions(+), 62 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-1-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-2-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-3-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-4-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-args-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-error-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-return-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-return.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-variant_cc.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-2.c

-- 
2.36.3



[PATCH V2 1/3] RISC-V: Part-1: Select suitable vector registers for vector type args and returns

2023-08-10 Thread Lehua Ding
Below are the vector register calling convention rules from the proposal[1]:

v0 is used to pass the first vector mask argument to a function, and to return
a vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the remaining vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL gives the number of
vector registers in the vector register group, and the first vector register
number in the group must be a multiple of it. For example, the LMUL of
`vint64m8_t` is 8, so the v8-v15 vector register group can be allocated to
this type, but v9-v16 cannot because the v9 register number is not a multiple
of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type,
its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or remaining vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.
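
The allocation rules above can be sketched as a small Python model. This is only an illustration of the quoted rules, not the code in riscv.cc: the function name `allocate_vector_args` and the `(kind, lmul, nfields)` tuples are hypothetical, and `None` is used to mean "pass by reference".

```python
V_ARG_FIRST, V_ARG_LAST = 8, 23  # v8-v23 hold vector arguments


def allocate_vector_args(args):
    """Assign vector registers to arguments.

    `args` is a list of (kind, lmul, nfields) tuples, with kind one of
    "mask", "data" or "tuple". Returns one entry per argument: a list of
    register numbers, or None when the argument is passed by reference.
    """
    used = set()
    first_mask_seen = False
    result = []
    for kind, lmul, nfields in args:
        if kind == "mask" and not first_mask_seen:
            first_mask_seen = True
            result.append([0])  # rule 1: first mask argument goes in v0
            continue
        # Masks and fractional LMUL are treated as LMUL = 1.
        group = 1 if kind == "mask" or lmul < 1 else int(lmul)
        nregs = group * (nfields if kind == "tuple" else 1)
        placed = None
        # The search restarts at v8 for every argument; v8 is a multiple of
        # 1, 2, 4 and 8, so stepping by `group` keeps the start aligned.
        for first in range(V_ARG_FIRST, V_ARG_LAST + 1, group):
            regs = list(range(first, first + nregs))
            if regs[-1] <= V_ARG_LAST and used.isdisjoint(regs):
                placed = regs
                break
        if placed is not None:
            used.update(placed)
        result.append(placed)
    return result


# void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)
foo_regs = allocate_vector_args([("data", 1, 1), ("data", 2, 1), ("data", 1, 1)])
```

For the `foo` example from the note, the model assigns v8 to `a`, v10-v11 to `b` and v9 to `c`, matching the text. A third `vint64m8_t` argument would get `None`, since only two aligned groups of eight fit in v8-v23.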

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/389

gcc/ChangeLog:

* config/riscv/riscv-protos.h (builtin_type_p): New function for 
checking vector type.
* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
(riscv_init_cumulative_args): Setup variant_cc field.
(riscv_vector_type_p): New function for checking vector type.
(riscv_hard_regno_nregs): Hoist declare.
(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
(riscv_get_arg_info): Support vector cc.
(riscv_function_arg_advance): Update cum.
(riscv_pass_by_reference): Handle vector args.
(riscv_v_abi): New function return vector abi.
(riscv_return_value_is_vector_type_p): New function for checking vector
returns.
(riscv_arguments_is_vector_type_p): New function for checking vector
arguments.
(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
(V_ARG_FIRST): Ditto.
(V_ARG_LAST): Ditto.
(enum riscv_cc): Define all RISCV_CC variants.
* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

---
 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-vector-builtins.cc |  10 +
 gcc/config/riscv/riscv.cc | 235 ++--
 gcc/config/riscv/riscv.h  |  25 ++
 gcc/config/riscv/riscv.opt|   5 +
 .../riscv/rvv/base/abi-call-args-1-run.c  | 127 +
 .../riscv/rvv/base/abi-call-args-1.c  | 197 +
 .../riscv/rvv/base/abi-call-args-2-run.c  |  34 +++
 .../riscv/rvv/base/abi-call-args-2.c  |  27 ++
 .../riscv/rvv/base/abi-call-args-3-run.c  | 260 ++
 .../riscv/rvv/base/abi-call-args-3.c  | 116 
 .../riscv/rvv/base/abi-call-a

[PATCH V2 2/3] RISC-V: Part-2: Save/Restore vector registers which need to be preversed

2023-08-10 Thread Lehua Ding
Functions that follow the vector calling convention variant have callee-saved
vector registers, but functions that follow the standard calling convention do
not. We need to know which convention the callee follows so that we can tell GCC
exactly which vector registers the callee will clobber. So, like AArch64, I
encode the callee's calling convention information into the call rtx patterns.
According to my analysis, the old operands 2 and 3 of the call patterns, which
were copied from the MIPS target, are unused, so they are removed.
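
As a rough illustration of why the callee's convention matters, the sketch below computes which vector registers a call may clobber. It is a hypothetical model, not the TARGET_INSN_CALLEE_ABI hook: the string names "standard" and "vector" are made up, and the register ranges come from the table in the cover letter (v1-v7 and v24-v31 callee-saved under the vector variant).

```python
ALL_VREGS = set(range(32))  # v0-v31

# Callee-saved vector registers under the vector calling convention variant.
VECTOR_CC_PRESERVED = set(range(1, 8)) | set(range(24, 32))


def call_clobbered_vregs(callee_cc):
    """Return the vector register numbers a call may clobber.

    `callee_cc` is "standard" or "vector" (hypothetical names for the
    two conventions discussed above).
    """
    if callee_cc == "vector":
        return ALL_VREGS - VECTOR_CC_PRESERVED
    # Standard convention: no vector register is callee-saved.
    return set(ALL_VREGS)
```

A caller that keeps a live value in v24 across a call has to spill it when the callee uses the standard convention, but not when the callee follows the vector variant.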

gcc/ChangeLog:

* config/riscv/riscv-sr.cc (riscv_remove_unneeded_save_restore_calls): 
Pass riscv_cc.
* config/riscv/riscv.cc (struct riscv_frame_info): Add new fields.
(riscv_frame_info::reset): Reset new fields.
(riscv_call_tls_get_addr): Pass riscv_cc.
(riscv_function_arg): Return riscv_cc for call pattern.
(riscv_insn_callee_abi): Implement TARGET_INSN_CALLEE_ABI.
(riscv_save_reg_p): Add vector callee-saved check.
(riscv_save_libcall_count): Add vector save area.
(riscv_compute_frame_info): Ditto.
(riscv_restore_reg): Update for type change.
(riscv_for_each_saved_v_reg): New function save vector registers.
(riscv_first_stack_step): Handle function with vector callee-saved registers.
(riscv_expand_prologue): Ditto.
(riscv_expand_epilogue): Ditto.
(riscv_output_mi_thunk): Pass riscv_cc.
(TARGET_INSN_CALLEE_ABI): Implement TARGET_INSN_CALLEE_ABI.
* config/riscv/riscv.md: Add CALLEE_CC operand for call pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-1.c: New test.
* gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-2.c: New test.
* gcc.target/riscv/rvv/base/abi-callee-saved-1.c: New test.
* gcc.target/riscv/rvv/base/abi-callee-saved-2.c: New test.
---
 gcc/config/riscv/riscv-sr.cc  |  12 +-
 gcc/config/riscv/riscv.cc | 222 +++---
 gcc/config/riscv/riscv.md |  43 +++-
 .../rvv/base/abi-callee-saved-1-fixed-1.c |  85 +++
 .../rvv/base/abi-callee-saved-1-fixed-2.c |  85 +++
 .../riscv/rvv/base/abi-callee-saved-1.c   |  87 +++
 .../riscv/rvv/base/abi-callee-saved-2.c   | 117 +
 7 files changed, 606 insertions(+), 45 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-callee-saved-2.c

diff --git a/gcc/config/riscv/riscv-sr.cc b/gcc/config/riscv/riscv-sr.cc
index 7248f04d68f..e6e17685df5 100644
--- a/gcc/config/riscv/riscv-sr.cc
+++ b/gcc/config/riscv/riscv-sr.cc
@@ -447,12 +447,18 @@ riscv_remove_unneeded_save_restore_calls (void)
   && !SIBCALL_REG_P (REGNO (target)))
 return;
 
+  /* Extract RISCV CC from the UNSPEC rtx.  */
+  rtx unspec = XVECEXP (callpat, 0, 1);
+  gcc_assert (GET_CODE (unspec) == UNSPEC
+ && XINT (unspec, 1) == UNSPEC_CALLEE_CC);
+  riscv_cc cc = (riscv_cc) INTVAL (XVECEXP (unspec, 0, 0));
   rtx sibcall = NULL;
   if (set_target != NULL)
-sibcall
-  = gen_sibcall_value_internal (set_target, target, const0_rtx);
+sibcall = gen_sibcall_value_internal (set_target, target, const0_rtx,
+ gen_int_mode (cc, SImode));
   else
-sibcall = gen_sibcall_internal (target, const0_rtx);
+sibcall
+  = gen_sibcall_internal (target, const0_rtx, gen_int_mode (cc, SImode));
 
   rtx_insn *before_call = PREV_INSN (call);
   remove_insn (call);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index aa6b46d7611..09c9e09e83a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -108,6 +108,9 @@ struct GTY(())  riscv_frame_info {
   /* Likewise FPR X.  */
   unsigned int fmask;
 
+  /* Likewise for vector registers.  */
+  unsigned int vmask;
+
   /* How much the GPR save/restore routines adjust sp (or 0 if unused).  */
   unsigned save_libcall_adjustment;
 
@@ -115,6 +118,10 @@ struct GTY(())  riscv_frame_info {
   poly_int64 gp_sp_offset;
   poly_int64 fp_sp_offset;
 
+  /* Top and bottom offsets of vector save areas from frame bottom.  */
+  poly_int64 v_sp_offset_top;
+  poly_int64 v_sp_offset_bottom;
+
   /* Offset of virtual frame pointer from stack pointer/frame bottom */
   poly_int64 frame_pointer_offset;
 
@@ -265,7 +272,7 @@ unsigned riscv_stack_boundary;
 /* If non-zero, this is an offset to be added to SP to redefine the CFA
when restoring the FP register from the stack.  Only valid when generating
the epilogue.  */
-static int epilogue_cfa_sp_offset;
+static poly_int64 epilogue_cfa_sp_offset;
 
 /* Which tuning parameters to use.  */
 static const struct riscv_tune_param *tune_param;

[PATCH V2 3/3] RISC-V: Part-3: Output .variant_cc directive for vector function

2023-08-10 Thread Lehua Ding
Functions that follow the vector calling convention variant need to be annotated
with the .variant_cc directive according to the RISC-V Assembly Programmer's
Manual[1] and the RISC-V ELF Specification[2].

[1] https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md#pseudo-ops
[2] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc#dynamic-linking
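
The intended output order for a function definition can be sketched as follows. This is a hypothetical Python model of the assembly text only: the function name `declare_function_name` and the flag `uses_vector_cc` are made up, and the exact `.type` line is an assumption about what ASM_OUTPUT_TYPE_DIRECTIVE prints on ELF targets.

```python
def declare_function_name(name, uses_vector_cc):
    """Return the assembly lines that introduce a function definition."""
    lines = []
    if uses_vector_cc:
        # Mark the symbol as following a calling convention variant.
        lines.append(f"\t.variant_cc\t{name}")
    # Assumed ELF-style type directive, followed by the function label.
    lines.append(f"\t.type\t{name}, @function")
    lines.append(f"{name}:")
    return "\n".join(lines) + "\n"


vector_fn = declare_function_name("foo", uses_vector_cc=True)
plain_fn = declare_function_name("bar", uses_vector_cc=False)
```

Only functions that follow the vector calling convention get the extra directive; everything else is emitted as before.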

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_declare_function_name): Add protos.
(riscv_asm_output_alias): Ditto.
(riscv_asm_output_external): Ditto.
* config/riscv/riscv.cc (riscv_asm_output_variant_cc):  Output 
.variant_cc directive for vector function.
(riscv_declare_function_name): Ditto.
(riscv_asm_output_alias): Ditto.
(riscv_asm_output_external): Ditto.
* config/riscv/riscv.h (ASM_DECLARE_FUNCTION_NAME): Implement 
ASM_DECLARE_FUNCTION_NAME.
(ASM_OUTPUT_DEF_FROM_DECLS): Implement ASM_OUTPUT_DEF_FROM_DECLS.
(ASM_OUTPUT_EXTERNAL): Implement ASM_OUTPUT_EXTERNAL.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-call-variant_cc.c: New test.
---
 gcc/config/riscv/riscv-protos.h   |  3 ++
 gcc/config/riscv/riscv.cc | 48 +++
 gcc/config/riscv/riscv.h  | 15 ++
 .../riscv/rvv/base/abi-call-variant_cc.c  | 39 +++
 4 files changed, 105 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/abi-call-variant_cc.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 4a67297173d..260ba2f9a49 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -101,6 +101,9 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
 extern void riscv_split_doubleword_move (rtx, rtx);
 extern const char *riscv_output_move (rtx, rtx);
 extern const char *riscv_output_return ();
+extern void riscv_declare_function_name (FILE *, const char *, tree);
+extern void riscv_asm_output_alias (FILE *, const tree, const tree);
+extern void riscv_asm_output_external (FILE *, const tree, const char *);
 
 #ifdef RTX_CODE
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool 
*invert_ptr = 0);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 09c9e09e83a..83cf7a5da82 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -7013,6 +7013,54 @@ riscv_issue_rate (void)
   return tune_param->issue_rate;
 }
 
+/* Output .variant_cc for function symbol which follows vector calling
+   convention.  */
+
+static void
+riscv_asm_output_variant_cc (FILE *stream, const tree decl, const char *name)
+{
+  if (TREE_CODE (decl) == FUNCTION_DECL)
+{
+  riscv_cc cc = (riscv_cc) fndecl_abi (decl).id ();
+  if (cc == RISCV_CC_V)
+   {
+ fprintf (stream, "\t.variant_cc\t");
+ assemble_name (stream, name);
+ fprintf (stream, "\n");
+   }
+}
+}
+
+/* Implement ASM_DECLARE_FUNCTION_NAME.  */
+
+void
+riscv_declare_function_name (FILE *stream, const char *name, tree fndecl)
+{
+  riscv_asm_output_variant_cc (stream, fndecl, name);
+  ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
+  ASM_OUTPUT_LABEL (stream, name);
+}
+
+/* Implement ASM_OUTPUT_DEF_FROM_DECLS.  */
+
+void
+riscv_asm_output_alias (FILE *stream, const tree decl, const tree target)
+{
+  const char *name = XSTR (XEXP (DECL_RTL (decl), 0), 0);
+  const char *value = IDENTIFIER_POINTER (target);
+  riscv_asm_output_variant_cc (stream, decl, name);
+  ASM_OUTPUT_DEF (stream, name, value);
+}
+
+/* Implement ASM_OUTPUT_EXTERNAL.  */
+
+void
+riscv_asm_output_external (FILE *stream, tree decl, const char *name)
+{
+  default_elf_asm_output_external (stream, decl, name);
+  riscv_asm_output_variant_cc (stream, decl, name);
+}
+
 /* Auxiliary function to emit RISC-V ELF attribute. */
 static void
 riscv_emit_attribute ()
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index b24b240dd75..1820593bab5 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -1021,6 +1021,21 @@ while (0)
 
 #define ASM_COMMENT_START "#"
 
+/* Add output .variant_cc directive for specific function definition.  */
+#undef ASM_DECLARE_FUNCTION_NAME
+#define ASM_DECLARE_FUNCTION_NAME(STR, NAME, DECL) 
\
+  riscv_declare_function_name (STR, NAME, DECL)
+
+/* Add output .variant_cc directive for specific alias definition.  */
+#undef ASM_OUTPUT_DEF_FROM_DECLS
+#define ASM_OUTPUT_DEF_FROM_DECLS(STR, DECL, TARGET)   
\
+  riscv_asm_output_alias (STR, DECL, TARGET)
+
+/* Add output .variant_cc directive for specific extern function.  */
+#undef ASM_OUTPUT_EXTERNAL
+#define ASM_OUTPUT_EXTERNAL(STR, DECL, NAME)   
\
+  riscv_asm_output_external (STR, DECL, NAME)
+
 #undef SIZE_TYPE
 #define SIZE_TYPE (POINTER_SIZE == 64 ? "long unsigned int" : "unsigned int")
 
diff --git a/gcc/testsu

Re: [PATCH 1/3] RISC-V: Part-1: Select suitable vector registers for vector type args and returns

2023-08-10 Thread Lehua Ding
Thanks so much to Kito for the online and offline comments.
I have uploaded V2 patches that address all of the comments.


https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626935.html


Best,
Lehua






Re: [PATCH V2 0/3] RISC-V: Add an experimental vector calling convention

2023-08-10 Thread Lehua Ding
Hi Richard,


Thanks for the review.


> Just to mention at some point you want to think about the OpenMP SIMD ABI which
> includes a mangling scheme but would also open up to have different
> calling conventions.
> So please keep that usage case in mind, possibly allowing the vector
> calling convention to be required for this.


Thanks for the reminder. A new function attribute `riscv_vector_cc` will be
introduced later to specify that a function adheres to the vector calling
convention.

> Also note there's 'inbranch' variants which
> require passing
> a mask - your table above doesn't list any mask registers (in case
> those exist in RISC-V).


Separate mask registers do not exist in RISC-V;
mask arguments share vector registers.

Best,
Lehua

[PATCH V2] RISC-V: Fix error combine of pred_mov pattern

2023-08-10 Thread Lehua Ding
Hi,

This patch fixes PR110943, in which wrong code is generated. The cause is an
incorrect combine involving a pred_mov pattern. Consider this code:

```
#include <riscv_vector.h>

void foo9 (void *base, void *out, size_t vl)
{
    int64_t scalar = *(int64_t*)(base + 100);
    vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
    *(vint64m2_t*)out = v;
}
```

RTL before combine pass:

```
(insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
        (if_then_else:RVVM2DI (unspec:RVVMF32BI [
                    (const_vector:RVVMF32BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (const_int 1 [0x1])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (const_vector:RVVM2DI repeat [
                    (const_int 0 [0])
                ])
            (unspec:RVVM2DI [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di})
(insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
        (reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole})
```

RTL after combine pass:
```
(insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
        (if_then_else:RVVM2DI (unspec:RVVMF32BI [
                    (const_vector:RVVMF32BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (const_int 1 [0x1])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (const_vector:RVVM2DI repeat [
                    (const_int 0 [0])
                ])
            (unspec:RVVM2DI [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di})
```

This combine changes the semantics of insn 14. I refined the condition of the
@pred_mov pattern to be more restrictive. Is it OK for trunk?
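
One way to read "changes the semantics": in the dump before combine, insn 14
is a whole-register store (`*movrvvm2di_whole`) with no vl operand, while the
combined insn 14 is a `pred_mov` that lists `vl` and `vtype` among its
operands. The toy C model below is only a sketch of why a vl-dependent store
is not interchangeable with a whole-register store. It is not GCC code or a
hardware model; `VLMAX = 4`, `store_whole`, `store_with_vl` and
`count_written` are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: a "vector register" of VLMAX 64-bit elements.
   VLMAX = 4 is an arbitrary illustrative value, not an RVV constant.  */
enum { VLMAX = 4 };

/* Models a whole-register store: copies every element and ignores vl.  */
static void store_whole (int64_t *mem, const int64_t *vreg)
{
  memcpy (mem, vreg, VLMAX * sizeof (int64_t));
}

/* Models a vl-predicated store: writes only the first vl elements.  */
static void store_with_vl (int64_t *mem, const int64_t *vreg, size_t vl)
{
  assert (vl <= VLMAX);
  for (size_t i = 0; i < vl; i++)
    mem[i] = vreg[i];
}

/* Counts how many elements of mem no longer hold the sentinel value.  */
static size_t count_written (const int64_t *mem, int64_t sentinel)
{
  size_t n = 0;
  for (size_t i = 0; i < VLMAX; i++)
    if (mem[i] != sentinel)
      n++;
  return n;
}
```

With a destination pre-filled with a sentinel and vl set to 1, `store_whole`
overwrites all four elements while `store_with_vl` overwrites only the first,
so the two stores leave memory in different states.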

Best,
Lehua

PR target/110943

gcc/ChangeLog:

* config/riscv/predicates.md (vector_const_int_or_double_0_operand):
New.
* config/riscv/riscv-vector-builtins.cc
(function_expander::function_expander): force_reg mem operand.
* config/riscv/vector.md (@pred_mov): Wrapper.
(*pred_mov): Remove imm -> reg pattern.
(*pred_broadcast_imm): Add imm -> reg pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Update.
* gcc.target/riscv/rvv/base/pr110943.c: New test.

---
 gcc/config/riscv/predicates.md            |  5 +
 gcc/config/riscv/riscv-vector-builtins.cc |  8 +-
 gcc/config/riscv/vector.md                | 97 +++
 .../gcc.target/riscv/rvv/base/pr110943.c  | 33 +++
 .../riscv/rvv/base/zvfhmin-intrinsic.c    | 10 +-
 5 files changed, 104 insertions(+), 49 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 9db28c2def7..f2e406c718a 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -295,6 +295,11 @@
   (ior (match_operand 0 "register_operand")
(match_operand 0 "const_int_operand")))
 
+(define_predicate "vector_const_int_or_double_0_operand"
+  (and (match_code "const_vector")
+   (match_test "satisfies_constraint_vi (op)
+|| satisfies_constraint_Wc0 (op)")))
+
 (define_predicate "vector_move_operand"
   (ior (match_operand 0 "nonimmediate_operand")
(and (match_code "const_vector")
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index abab06c00ed..2da542585a8 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3471,7 +3471,13 @@ function_expander::function_expander (const function_instance &instance,
 exp (exp_in), target (target_in), opno (0)
 {
   if (!function_returns_void_p ())
-    create_output_operand (&m_ops[opno++], target, TYPE_MODE (TREE_TYPE (exp)));
+    {
+      if (target != NULL_RTX && MEM_P (target))
+        /* Use force_reg to prevent illegal mem-to-mem pattern on -O0.  */
+        target = force_reg (GET_MODE (target), target);
+      create_output_operand (&m_ops[opno++], target,
+                             TYPE_MODE (TREE_TYPE (exp)));
+    }
 }
 
 /* Take argument ARGNO from EXP's argument list and convert it into
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index cf37b472930..508a3074080 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1446,69 +1446,60 @@
 ;; - 15.1 Vector Mask-Register Logical Instructions
 ;; 
--

Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern

2023-08-10 Thread Lehua Ding
Hi Jeff,

After reconsidering, I think the split of the pattern you mentioned
makes sense to me. I have split the `@pred_mov` [...]
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626981.html


Best,
Lehua

[PATCH] RISC-V: Revert the convert from vmv.s.x to vmv.v.i

2023-08-11 Thread Lehua Ding
Hi,

This patch reverts the conversion from vmv.s.x to vmv.v.i and adds a new
pattern to optimize the special case where the scalar operand is zero.

Currently, a broadcast pattern whose scalar operand is an imm is converted
from vmv.s.x to vmv.v.i, and its mask operand is converted from 00..01 to
11..11. The conversion has both advantages and disadvantages, listed below.
After discussing them offline with Juzhe, we chose not to do this transform.

Before (vmv.s.x):

  Advantages: The vsetvli info required by vmv.s.x has better compatibility,
  since vmv.s.x only demands SEW and whether VL is zero or non-zero. That
  means there are more opportunities to combine it with other vsetvl infos
  in the vsetvl pass.

  Disadvantages: For a non-zero scalar imm, one more `li rd, imm` instruction
  is needed.

After (vmv.v.i):

  Advantages: No `li rd, imm` instruction is needed, since vmv.v.i supports
  an imm operand.

  Disadvantages: The inverse of the advantages of Before. Worse compatibility
  means more vsetvl instructions are needed.
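
The compatibility argument above can be sketched with a toy C model. This is
a simplified illustration under stated assumptions, not the data structures
or rules of GCC's vsetvl pass: `struct vstate`, `struct demand`,
`demand_vmv_s_x`, `demand_vmv_v_i`, `satisfied` and `extra_vsetvl` are all
hypothetical names, and the rule "vmv.s.x needs only SEW and a non-zero vl"
is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified vl/vtype state left by the last vsetvl.  */
struct vstate
{
  unsigned vl;
  unsigned sew;
  unsigned lmul;
};

/* What one instruction requires from the current state.  */
struct demand
{
  unsigned sew;
  bool exact;     /* true: vl and lmul must match; false: vl != 0 suffices */
  unsigned vl;
  unsigned lmul;
};

/* Scalar move into element 0: only SEW and a non-zero vl matter.  */
static struct demand demand_vmv_s_x (unsigned sew)
{
  struct demand d = { sew, false, 0, 0 };
  return d;
}

/* Immediate broadcast: modelled as needing the exact vl and LMUL.  */
static struct demand demand_vmv_v_i (unsigned sew, unsigned lmul, unsigned vl)
{
  assert (vl != 0);
  struct demand d = { sew, true, vl, lmul };
  return d;
}

/* True if the current state already meets the demand.  */
static bool satisfied (struct vstate s, struct demand d)
{
  if (s.sew != d.sew)
    return false;
  if (!d.exact)
    return s.vl != 0;
  return s.vl == d.vl && s.lmul == d.lmul;
}

/* Number of additional vsetvl instructions this demand costs: 0 or 1.  */
static unsigned extra_vsetvl (struct vstate s, struct demand d)
{
  return satisfied (s, d) ? 0 : 1;
}
```

For a state such as `{ vl = 4, sew = 32, lmul = 1 }` (standing in for the
e32, m1 state that the reduction needs, with 4 as an assumed VLMAX),
`demand_vmv_s_x (32)` costs no extra vsetvl, while `demand_vmv_v_i (32, 1, 1)`
costs one.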

Consider the C code below and the asm generated after auto-vectorization.
There is an extra insn (vsetivli zero, 1, e32, m1, ta, ma) after vmv.s.x is
converted to vmv.v.i.

```
int foo1(int* restrict a, int* restrict b, int *restrict c, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++)
      sum += a[i] * b[i];

    return sum;
}
```

asm (Before):

```
foo1:
        ble     a3,zero,.L7
        vsetvli a2,zero,e32,m1,ta,ma
        vmv.v.i v1,0
.L6:
        vsetvli a5,a3,e32,m1,tu,ma
        slli    a4,a5,2
        sub     a3,a3,a5
        vle32.v v2,0(a0)
        vle32.v v3,0(a1)
        add     a0,a0,a4
        add     a1,a1,a4
        vmacc.vv        v1,v3,v2
        bne     a3,zero,.L6
        vsetvli a2,zero,e32,m1,ta,ma
        vmv.s.x v2,zero
        vredsum.vs      v1,v1,v2
        vmv.x.s a0,v1
        ret
.L7:
        li      a0,0
        ret
```

asm (After):

```
foo1:
        ble     a3,zero,.L4
        vsetvli a2,zero,e32,m1,ta,ma
        vmv.v.i v1,0
.L3:
        vsetvli a5,a3,e32,m1,tu,ma
        slli    a4,a5,2
        sub     a3,a3,a5
        vle32.v v2,0(a0)
        vle32.v v3,0(a1)
        add     a0,a0,a4
        add     a1,a1,a4
        vmacc.vv        v1,v3,v2
        bne     a3,zero,.L3
        vsetivli        zero,1,e32,m1,ta,ma
        vmv.v.i v2,0
        vsetvli a2,zero,e32,m1,ta,ma
        vredsum.vs      v1,v1,v2
        vmv.x.s a0,v1
        ret
.L4:
        li      a0,0
        ret
```

Best,
Lehua

Co-Authored-By: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/predicates.md (vector_const_0_operand): New.
* config/riscv/vector.md (*pred_broadcast_zero): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/scalar_move-5.c: Update.
* gcc.target/riscv/rvv/base/scalar_move-6.c: Ditto.

---
 gcc/config/riscv/predicates.md                |  4 ++
 gcc/config/riscv/vector.md                    | 43 +--
 .../gcc.target/riscv/rvv/base/scalar_move-5.c | 20 +++--
 .../gcc.target/riscv/rvv/base/scalar_move-6.c | 22 --
 4 files changed, 70 insertions(+), 19 deletions(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index f2e406c718a..c102489d979 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -300,6 +300,10 @@
(match_test "satisfies_constraint_vi (op)
 || satisfies_constraint_Wc0 (op)")))
 
+(define_predicate "vector_const_0_operand"
+  (and (match_code "const_vector")
+   (match_test "satisfies_constraint_Wc0 (op)")))
+
 (define_predicate "vector_move_operand"
   (ior (match_operand 0 "nonimmediate_operand")
(and (match_code "const_vector")
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 508a3074080..4d98ab6f7e8 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1719,23 +1719,24 @@
  (match_operand:V_VLS 2 "vector_merge_operand")))]
   "TARGET_VECTOR"
 {
-  /* Handle vmv.s.x instruction which has memory scalar.  */
-  if (satisfies_constraint_Wdm (operands[3]) || riscv_vector::simm5_p (operands[3])
-  || rtx_equal_p (operands[3], CONST0_RTX (mode)))
+  /* Handle vmv.s.x instruction (Wb1 mask) which has memory scalar.  */
+  if (satisfies_constraint_Wdm (operands[3]))
 {
   if (satisfies_constraint_Wb1 (operands[1]))
-{
-  // Case 1: vmv.s.x (TA) ==> vlse.v (TA)
-  if (satisfies_constraint_vu (operands[2]))
-operands[1] = CONSTM1_RTX (mode);
-  else if (GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (Pmode))
-{
- // Case 2: vmv.s.x (TU) ==> andi vl + vlse.v (TU) in RV32 system.
+   {
+ /* Case 1: vmv.s.x (TA, x == memory) ==> vlse.v (TA)  */
+ if (satisfies_constraint_vu (operands[2]))
+   operands[1] = CONSTM1_RTX (mode);
+ else if (GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (Pmode))
+   {
+ /* Case 2: vmv.s.x (TU, x == memory) ==>
+  vl = 0 o
