On Tue, Jul 23, 2024 at 1:03 AM Edwin Lu <e...@rivosinc.com> wrote: > > Hi Richard, > > On 5/31/2024 1:48 AM, Richard Biener wrote: > > On Thu, May 30, 2024 at 2:11 AM Patrick O'Neill <patr...@rivosinc.com> > > wrote: > >> > >> From: Greg McGary <g...@rivosinc.com> > > > > Still a NACK. If remain ends up zero then > > > > /* Try to use a single smaller load when we are about > > to load excess elements compared to the unrolled > > scalar loop. */ > > if (known_gt ((vec_num * j + i + 1) * nunits, > > (group_size * vf - gap))) > > { > > poly_uint64 remain = ((group_size * vf - gap) > > - (vec_num * j + i) * > > nunits); > > if (known_ge ((vec_num * j + i + 1) * nunits > > - (group_size * vf - gap), nunits)) > > /* DR will be unused. */ > > ltype = NULL_TREE; > > > > needs to be re-formulated so that the combined conditions make sure > > this doesn't happen. The outer known_gt should already ensure that > > remain > 0. For correctness that should possibly be maybe_gt though. > >
Putting the list back in the loop and CCing Richard S. > I'm currently looking into this patch and am trying to figure out what > is going on. Stepping through gdb, I see that remain == {coeffs = {0, > 2}} and nunits == {coeffs = {2, 2}} (the outer known_gt returned true > with known_gt({coeffs = {8, 8}}, {coeffs = {6, 8}})). > > From what I understand, this falls under the umbrella of 0 <= remain < > nunits. The divide by zero error is because of the 0 <= remain which is > coming from the constant_multiple_p function in poly-int.h where it > performs the modulus NCa(a.coeffs[0]) % NCb(b.coeffs[0]). > (https://github.com/gcc-mirror/gcc/blob/master/gcc/poly-int.h#L1970-L1971) > > > > if (known_ge ((vec_num * j + i + 1) * nunits > > - (group_size * vf - gap), > nunits)) > > /* DR will be unused. */ > > ltype = NULL_TREE; > > This if condition is a bit suspicious to me though. I'm seeing that it's > evaluating known_ge({coeffs = {2, 0}}, {coeffs = {2, 2}}) which is > returning false. Should it be maybe_ge instead? No, we can only not emit a load if we know it won't be used, not if it eventually cannot be used. > After running some > tests, to me it looks like it doesn't vectorize quite as often; however, > I'm not fully sure what else to do when the coeffs can potentially be > equal to 0. > > Should it even be possible for there to be a {coeffs = {0, n}} > situation? My understanding of how poly_ints are used for representing > vectorization is that the first coefficient is the number of elements > needed to make the minimum supported vector size. That is, if vector > lengths are 128 bits, element size is 32 bits, coeff[0] should be > minimum of 4. Is this understanding correct? I was told n can be negative, but nunits.coeff[0] should be non-zero. What is j and i when the divisor is zero? > > >> gcc/ChangeLog: > >> * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent > >> divide-by-zero. > >> * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove > >> dg-ice. > >> --- > >> No changes in v3. Depends on the risc-v backend option added in patch 1 to > >> trigger the ICE. > >> --- > >> gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c | 1 - > >> gcc/tree-vect-stmts.cc | 3 ++- > >> 2 files changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c > >> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c > >> index dfbe09f01a1..79d03612a22 100644 > >> --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c > >> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c > >> @@ -1,6 +1,5 @@ > >> /* { dg-do compile } */ > >> /* { dg-options "-march=rv64gcv -mabi=lp64d -mrvv-vector-bits=scalable > >> -O3 -mno-autovec-segment" } */ > >> -/* { dg-ice "Floating point exception" } */ > >> > >> enum e { c, d }; > >> enum g { f }; > >> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > >> index 4219ad832db..34f5736ba00 100644 > >> --- a/gcc/tree-vect-stmts.cc > >> +++ b/gcc/tree-vect-stmts.cc > >> @@ -11558,7 +11558,8 @@ vectorizable_load (vec_info *vinfo, > >> - (vec_num * j + i) * nunits); > >> /* remain should now be > 0 and < nunits. */ > >> unsigned num; > >> - if (constant_multiple_p (nunits, remain, &num)) > >> + if (known_gt (remain, 0) > >> + && constant_multiple_p (nunits, remain, > >> &num)) > >> { > >> tree ptype; > >> new_vtype > >> -- > >> 2.43.2 > >> > > > Edwin >