On Tue, Jul 23, 2024 at 1:03 AM Edwin Lu <e...@rivosinc.com> wrote:
>
> Hi Richard,
>
> On 5/31/2024 1:48 AM, Richard Biener wrote:
> > On Thu, May 30, 2024 at 2:11 AM Patrick O'Neill <patr...@rivosinc.com> 
> > wrote:
> >>
> >> From: Greg McGary <g...@rivosinc.com>
> >
> > Still a NACK.  If remain ends up zero then
> >
> >                      /* Try to use a single smaller load when we are about
> >                         to load excess elements compared to the unrolled
> >                         scalar loop.  */
> >                      if (known_gt ((vec_num * j + i + 1) * nunits,
> >                                         (group_size * vf - gap)))
> >                        {
> >                          poly_uint64 remain = ((group_size * vf - gap)
> >                                                - (vec_num * j + i) * 
> > nunits);
> >                          if (known_ge ((vec_num * j + i + 1) * nunits
> >                                        - (group_size * vf - gap), nunits))
> >                            /* DR will be unused.  */
> >                            ltype = NULL_TREE;
> >
> > needs to be re-formulated so that the combined conditions make sure
> > this doesn't happen.  The outer known_gt should already ensure that
> > remain > 0.  For correctness that should possibly be maybe_gt though.
> >

Putting the list back in the loop and CCing Richard S.

> I'm currently looking into this patch and am trying to figure out what
> is going on. Stepping through gdb, I see that remain == {coeffs = {0,
> 2}} and nunits == {coeffs = {2, 2}} (the outer known_gt returned true
> with known_gt({coeffs = {8, 8}}, {coeffs = {6, 8}})).
>
>  From what I understand, this falls under the umbrella of 0 <= remain <
> nunits. The divide by zero error is because of the 0 <= remain which is
> coming from the constant_multiple_p function in poly-int.h where it
> performs the modulus NCa(a.coeffs[0]) % NCb(b.coeffs[0]).
> (https://github.com/gcc-mirror/gcc/blob/master/gcc/poly-int.h#L1970-L1971)
>
>
>  >                          if (known_ge ((vec_num * j + i + 1) * nunits
>  >                                        - (group_size * vf - gap),
> nunits))
>  >                            /* DR will be unused.  */
>  >                            ltype = NULL_TREE;
>
> This if condition is a bit suspicious to me though. I'm seeing that it's
> evaluating known_ge({coeffs = {2, 0}}, {coeffs = {2, 2}}) which is
> returning false. Should it be maybe_ge instead?

No, we can only not emit a load if we know it won't be used, not if
it eventually cannot be used.

> After running some
> tests, to me it looks like it doesn't vectorize quite as often; however,
> I'm not fully sure what else to do when the coeffs can potentially be
> equal to 0.
>
> Should it even be possible for there to be a {coeffs = {0, n}}
> situation? My understanding of how poly_ints are used for representing
> vectorization is that the first coefficient is the number of elements
> needed to make the minimum supported vector size. That is, if vector
> lengths are 128 bits, element size is 32 bits, coeff[0] should be
> minimum of 4. Is this understanding correct?

I was told n can be negative, but nunits.coeff[0] should be non-zero.

What is j and i when the divisor is zero?


>
> >> gcc/ChangeLog:
> >>          * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent 
> >> divide-by-zero.
> >>          * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove 
> >> dg-ice.
> >> ---
> >> No changes in v3. Depends on the risc-v backend option added in patch 1 to
> >> trigger the ICE.
> >> ---
> >>   gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c | 1 -
> >>   gcc/tree-vect-stmts.cc                                  | 3 ++-
> >>   2 files changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c 
> >> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c
> >> index dfbe09f01a1..79d03612a22 100644
> >> --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c
> >> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c
> >> @@ -1,6 +1,5 @@
> >>   /* { dg-do compile } */
> >>   /* { dg-options "-march=rv64gcv -mabi=lp64d -mrvv-vector-bits=scalable 
> >> -O3 -mno-autovec-segment" } */
> >> -/* { dg-ice "Floating point exception" } */
> >>
> >>   enum e { c, d };
> >>   enum g { f };
> >> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> >> index 4219ad832db..34f5736ba00 100644
> >> --- a/gcc/tree-vect-stmts.cc
> >> +++ b/gcc/tree-vect-stmts.cc
> >> @@ -11558,7 +11558,8 @@ vectorizable_load (vec_info *vinfo,
> >>                                   - (vec_num * j + i) * nunits);
> >>                              /* remain should now be > 0 and < nunits.  */
> >>                              unsigned num;
> >> -                           if (constant_multiple_p (nunits, remain, &num))
> >> +                           if (known_gt (remain, 0)
> >> +                               && constant_multiple_p (nunits, remain, 
> >> &num))
> >>                                {
> >>                                  tree ptype;
> >>                                  new_vtype
> >> --
> >> 2.43.2
> >>
> >
> Edwin
>

Reply via email to