Hi,

> -----Original Message-----
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: Tuesday, June 30, 2020 10:50 PM
> To: Richard Sandiford <richard.sandif...@arm.com>
> Cc: Richard Biener <richard.guent...@gmail.com>; Yangfei (Felix)
> <felix.y...@huawei.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182
> 
> On June 30, 2020 4:23:03 PM GMT+02:00, Richard Sandiford
> <richard.sandif...@arm.com> wrote:
> >Richard Biener <rguent...@suse.de> writes:
> >> On Tue, 30 Jun 2020, Richard Sandiford wrote:
> >>
> >>> Richard Biener <richard.guent...@gmail.com> writes:
> >>> > On Tue, Jun 30, 2020 at 2:18 PM Richard Sandiford
> >>> > <richard.sandif...@arm.com> wrote:
> >>> >>
> >>> >> "Yangfei (Felix)" <felix.y...@huawei.com> writes:
> >>> >> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> >b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> >>> >> > index e050db1a2e4..ea39fcac0e0 100644
> >>> >> > --- a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> >>> >> > +++ b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> >>> >> > @@ -1,6 +1,7 @@
> >>> >> >  /* { dg-do compile } */
> >>> >> >  /* { dg-additional-options "-O3" } */
> >>> >> >  /* { dg-additional-options "-mavx2" { target { i?86-*-*
> >x86_64-*-* } } } */
> >>> >> > +/* { dg-additional-options "-march=armv8.2-a+sve
> >-fno-vect-cost-model" { target aarch64*-*-* } } */
> >>> >> >
> >>> >> >  typedef struct {
> >>> >> >      unsigned short mprr_2[5][16][16];
> >>> >>
> >>> >> This test is useful for Advanced SIMD too, so I think we should
> >continue
> >>> >> to test it with whatever options the person running the testsuite
> >chose.
> >>> >> Instead we could duplicate the test in gcc.target/aarch64/sve
> >with
> >>> >> appropriate options.
> >>> >>
> >>> >> > diff --git a/gcc/tree-vect-data-refs.c
> >b/gcc/tree-vect-data-refs.c
> >>> >> > index eb8288e7a85..b30a7d8a3bb 100644
> >>> >> > --- a/gcc/tree-vect-data-refs.c
> >>> >> > +++ b/gcc/tree-vect-data-refs.c
> >>> >> > @@ -1823,8 +1823,11 @@ vect_enhance_data_refs_alignment
> >(loop_vec_info loop_vinfo)
> >>> >> >               {
> >>> >> >                 poly_uint64 nscalars = (STMT_SLP_TYPE
> >(stmt_info)
> >>> >> >                                         ? vf * DR_GROUP_SIZE
> >(stmt_info) : vf);
> >>> >> > -               possible_npeel_number
> >>> >> > -                 = vect_get_num_vectors (nscalars, vectype);
> >>> >> > +               if (maybe_lt (nscalars, TYPE_VECTOR_SUBPARTS
> >(vectype)))
> >>> >> > +                 possible_npeel_number = 0;
> >>> >> > +               else
> >>> >> > +                 possible_npeel_number
> >>> >> > +                   = vect_get_num_vectors (nscalars, vectype);
> >>> >> >
> >>> >> >                 /* NPEEL_TMP is 0 when there is no
> >misalignment, but also
> >>> >> >                    allow peeling NELEMENTS.  */
> >>> >>
> >>> >> OK, so this is coming from:
> >>> >>
> >>> >>   int s[16][2];
> >>> >>   …
> >>> >>   … =s[j][1];
> >>> >>
> >>> >> and an SLP node containing 16 instances of “s[j][1]”.  The
> >DR_GROUP_SIZE
> >>> >> is 2 because that's the inner dimension of “s”.
> >>> >>
> >>> >> I don't think maybe_lt is right here though.  The same problem
> >could in
> >>> >> principle happen for cases in which NSCALARS >
> >TYPE_VECTOR_SUBPARTS,
> >>> >> e.g. for different inner dimensions of “s”.
> >>> >>
> >>> >> I think the testcase shows that using DR_GROUP_SIZE in this
> >calculation
> >>> >> is flawed.  I'm not sure whether we can really do better given
> >the current
> >>> >> representation though.  This is one case where having a separate
> >dr_vec_info
> >>> >> per SLP node would help.
> >>> >
> >>> > I guess what the code likes to know is what we now have in
> >SLP_TREE_LANES
> >>> > (or formerly group_size) but that's not necessarily connected to
> >DR_GROUP_SIZE.
> >>> > Given we only see a stmt_info here and there's no 1:1 mapping of
> >SLP node
> >>> > to stmt_info (and the reverse mapping doesn't even exist) I do not
> >have
> >>> > any good idea either.
> >>> >
> >>> > Honestly I don't really see what this code tries to do ... doesn't
> >it
> >>> > simply want to set possible_npeel_number to
> TYPE_VECTOR_SUBPARTS
> >(vectype)?!
> >>>
> >>> I think it's trying to set possible_npeel_number to the number of
> >>> vector stmts per iteration (i.e. ncopies for non-SLP stuff):
> >>>
> >>>               /* For multiple types, it is possible that the bigger
> >type access
> >>>                  will have more than one peeling option.  E.g., a
> >loop with two
> >>>                  types: one of size (vector size / 4), and the other
> >one of
> >>>                  size (vector size / 8).  Vectorization factor will
> >8.  If both
> >>>                  accesses are misaligned by 3, the first one needs
> >one scalar
> >>>                  iteration to be aligned, and the second one needs
> >5.  But the
> >>>                  first one will be aligned also by peeling 5 scalar
> >>>                  iterations, and in that case both accesses will be
> >aligned.
> >>>                  Hence, except for the immediate peeling amount, we
> >also want
> >>>                  to try to add full vector size, while we don't
> >exceed
> >>>                  vectorization factor.
> >>>                  We do this automatically for cost model, since we
> >calculate
> >>>                  cost for every peeling option.  */
> >>>
> >>> So for this example, possible_npeel_number==2 for the first access
> >>> (letting us try two peeling values for it) and
> >possible_npeel_number==1
> >>> for the second.
> >>
> >> Yeah, but npeel is _scalar_ lanes, since we always peel scalar
> >iterations.
> >> So it seems odd to somehow put in the number of vectors...  so to me
> >> it would have made sense if it did
> >>
> >>   possible_npeel_number = lower_bound (nscalars);
> >>
> >> or whateveris necessary to make the polys happy.  Thus simply elide
> >> the vect_get_num_vectors call?  But it's been very long since I've
> >> dived into the alignment peeling code...
> >
> >Ah, I see what you mean.  So rather than:
> >
> >           /* Save info about DR in the hash table.  Also include peeling
> >              amounts according to the explanation above.  */
> >              for (j = 0; j < possible_npeel_number; j++)
> >                {
> >                  vect_peeling_hash_insert (&peeling_htab, loop_vinfo,
> >                                         dr_info, npeel_tmp);
> >               npeel_tmp += target_align / dr_size;
> >                }
> >
> >just have something like:
> >
> >           while (known_le (npeel_tmp, nscalars))
> >             {
> >               …
> >             }
> >
> >?
> 
> Yeah.

Not sure if I understand correctly.  I am supposing the following check in the 
original code is not necessary if we go like that.

1822               if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))

Is that correct?

Thanks,
Felix

Reply via email to