Hi, Richard. I have rebase to trunk and send the updated patch for "decrement IV support": https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619115.html
Thanks. juzhe.zh...@rivai.ai From: Richard Sandiford Date: 2023-05-22 16:00 To: juzhe.zhong CC: gcc-patches; rguenther; pan2.li Subject: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong <juzhe.zh...@rivai.ai> > > Address comments from Richard that splits the patch of fixing multiple-rgroup > handling of length counting elements. > > This patch is fixing issue of handling multiple-rgroup of length is counting > elements > > Before this patch, multiple rgroup run fail: > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > > After this patch, These tests are all passed. Thanks, looks great. A couple of minor comments below: > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc > index 905145ae97b..a13d6f5e898 100644 > --- a/gcc/tree-vect-loop.cc > +++ b/gcc/tree-vect-loop.cc > @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, > vec_loop_lens *lens, > rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS. */ > The new parameters need to be documented. How about: /* Given a complete set of lengths LENS, extract length number INDEX for an rgroup that operates on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS. Return a value that contains FACTOR multipled by the number of elements that should be processed. Insert any set-up statements before GSI. */ > tree > -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens, > - unsigned int nvectors, unsigned int index) > +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi, > + vec_loop_lens *lens, unsigned int nvectors, tree vectype, > + unsigned int index, unsigned int factor) > { > rgroup_controls *rgl = &(*lens)[nvectors - 1]; > bool use_bias_adjusted_len = > @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, > vec_loop_lens *lens, > > if (use_bias_adjusted_len) > return rgl->bias_adjusted_ctrl; > + else if (rgl->factor == 1 && factor == 1) > + { > + tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); > + tree loop_len = rgl->controls[index]; > + poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type); > + poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype); > + if (maybe_ne (nunits1, nunits2)) > + { > + /* A loop len for data type X can be reused for data type Y > + if X has N times more elements than Y and if Y's elements > + are N times bigger than X's. */ > + gcc_assert (multiple_p (nunits1, nunits2)); > + factor = exact_div (nunits1, nunits2).to_constant (); > + gimple_seq seq = NULL; > + loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len, > + build_int_cst (iv_type, factor)); > + if (seq) > + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); > + } > + return loop_len; > + } > else > return rgl->controls[index]; This looks right, but I think it'd be clearer to rearrange things slightly: if (use_bias_adjusted_len) return rgl->bias_adjusted_ctrl; tree loop_len = rgl->controls[index]; if (rgl->factor == 1 && factor == 1) { poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type); poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype); if (maybe_ne (nunits1, nunits2)) { /* A loop len for data type X can be reused for data type Y if X has N times more elements than Y and if Y's elements are N times bigger than X's. */ gcc_assert (multiple_p (nunits1, nunits2)); factor = exact_div (nunits1, nunits2).to_constant (); tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); gimple_seq seq = NULL; loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len, build_int_cst (iv_type, factor)); if (seq) gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); } } return loop_len; There's no change to the individual statements here, just the control flow. This shares the default return statement and makes it clearer that no special handling is needed when the number of elements are equal. It also only computes iv_type when needed. The patch is OK for trunk with those changes, thanks. Once it's pushed, could you post the updated decrementing IV patch? Richard