https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101842

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #3)
> (In reply to Richard Biener from comment #2)
> > OK, so with a hack like the following we vectorize the BB as
> > 
> >   vect__1.10_62 = MEM <vector(4) float> [(float *)p_34];
> >   vect_powmult_9.11_61 = vect__1.10_62 * vect__1.10_62;
> >   _60 = .REDUC_PLUS (vect_powmult_9.11_61);
> >   d_25 = d_35 - _60;
> >   p_26 = p_34 + 16;
> >   i_27 = i_37 + 4;
> >   _10 = len_20(D) > i_27;
> >   _11 = lim_21(D) <= d_25;
> >   _12 = _10 & _11;
> >   if (_12 != 0)
> > 
> 
> Ah awesome!
> 
> > 
> > the hack simply re-starts reduction discovery at the "previous" stmt
> > (this breaks down after skipping the first stmt eventually).  As said,
> > it's a hack.  But is that the kind of vectorization you expect?
> 
> Yeah that looks perfect, the patch seems to be based on a different code
> than upstream so couldn't apply it to test the full loop, but this looks
> perfect! (We already vectorize a similar loop without the `&& d >= lim`
> condition).

The patch is against my working tree, so that's possible.  Note it doesn't
vectorize the loop itself, only the loop body, via basic-block vectorization.
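
To make the dump above easier to read, here is a rough C sketch of what it
appears to compute.  The scalar function is a guess at the source shape,
reconstructed from the GIMPLE (four squared elements subtracted from d per
iteration, p advanced by 16 bytes, i by 4); the actual testcase in the PR may
differ.  The names scalar_body and bb_vectorized_body are made up for
illustration.  The second function models the basic-block vectorized body: one
4-wide load, an elementwise multiply, a .REDUC_PLUS, and a single subtract.
Summing first and subtracting once reassociates the floating-point operations,
so this presumably needs -ffast-math or similar.

```c
#include <stddef.h>

/* Hypothetical scalar source, reconstructed from the GIMPLE dump.
   Assumes len is a multiple of 4.  */
float
scalar_body (const float *p, int len, float d, float lim)
{
  for (int i = 0; i < len && d >= lim; i += 4, p += 4)
    {
      d -= p[0] * p[0];
      d -= p[1] * p[1];
      d -= p[2] * p[2];
      d -= p[3] * p[3];
    }
  return d;
}

/* Models the basic-block vectorized loop body.  The loop control
   (i, p, and the two-part exit test) stays scalar; only the
   straight-line body is vectorized.  */
float
bb_vectorized_body (const float *p, int len, float d, float lim)
{
  for (int i = 0; i < len && d >= lim; i += 4, p += 4)
    {
      float sq[4];
      /* vect__1 = MEM <vector(4) float> [p]; vect_powmult = vect__1 * vect__1 */
      for (size_t k = 0; k < 4; k++)
        sq[k] = p[k] * p[k];
      /* _60 = .REDUC_PLUS (vect_powmult); the summation order here is
         one possible choice, not necessarily what the target uses.  */
      float sum = (sq[0] + sq[1]) + (sq[2] + sq[3]);
      /* d_25 = d_35 - _60 */
      d -= sum;
    }
  return d;
}
```

With inputs whose squares and partial sums are exactly representable, both
functions return the same value; with arbitrary floats the results can differ
in the last bits because of the reassociation.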
