Dorit Nuzman/Haifa/IBM wrote on 05/02/2007 21:13:40:

> Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:59:00:
>
> > On Mon, 5 Feb 2007, Paolo Bonzini wrote:
> >
> > >
> > > > As we also only vectorize innermost loops I believe doing a
> > > > complete unrolling pass early will help in general (I pushed
> > > > for this some time ago).
> > > >
> > > > Thoughts?
> > >
> > > It might also hurt, though, since we don't have a basic block
> > > vectorizer.
> > > IIUC the vectorizer is able to turn
> > >
> > >   for (i = 0; i < 4; i++)
> > >     v[i] = 0.0;
> > >
> > > into
> > >
> > >   *(vector double *)v = (vector double){0.0, 0.0, 0.0, 0.0};
> >
> > That's true.
>
> That's going to change once this project goes in: "(3.2) Straight-line
> code vectorization" from
> http://gcc.gnu.org/wiki/AutovectBranchOptimizations. In fact, I think
> in autovect-branch, if you unroll the above loop it should get
> vectorized already. Ira - is that really the case?

The completely unrolled loop will not get vectorized because the code will
not be inside any loop (and our SLP implementation will focus, at least as
a first step, on loops).
The following will get vectorized (without permutation on autovect branch,
and with redundant permutation on mainline):

for (i = 0; i < n; i++)
  {
    v[4*i] = 0.0;
    v[4*i + 1] = 0.0;
    v[4*i + 2] = 0.0;
    v[4*i + 3] = 0.0;
  }
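
For anyone who wants to try this, the loop above can be wrapped in a
function and compiled on its own. This is only a minimal sketch: the names
zero_groups_of_four and all_zero are made up for illustration, and whether
the loop actually gets vectorized depends on the compiler version, the
flags used, and the target.

```c
#include <stddef.h>

/* Hypothetical wrapper around the loop above: four adjacent stores
   per iteration, all inside a loop.  */
void
zero_groups_of_four (double *v, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      v[4*i] = 0.0;
      v[4*i + 1] = 0.0;
      v[4*i + 2] = 0.0;
      v[4*i + 3] = 0.0;
    }
}

/* Helper for checking the result: returns 1 if the first COUNT
   elements of V are all zero, 0 otherwise.  */
int
all_zero (const double *v, size_t count)
{
  size_t k;

  for (k = 0; k < count; k++)
    if (v[k] != 0.0)
      return 0;
  return 1;
}
```

Calling zero_groups_of_four (buf, 3) writes buf[0] through buf[11] and
leaves everything after that untouched.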

The original loop, once completely unrolled, will get vectorized if it is
enclosed in an outer loop, like so:

for (j = 0; j < n; j++)
  {
    for (i = 0; i < 4; i++)
      v[i] = 0.0;
    v += 4;
  }
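
To make the "completely unrolled inside an outer loop" case concrete, here
is a rough sketch of both forms as standalone functions. The function names
zero_nested and zero_nested_unrolled are hypothetical, and the second
function is the inner loop unrolled by hand, which is an assumption about
what the unroller would produce rather than actual compiler output.

```c
#include <stddef.h>

/* Hypothetical wrapper: the original 4-iteration loop inside an
   outer loop that advances V by 4 on each iteration.  */
void
zero_nested (double *v, size_t n)
{
  size_t i, j;

  for (j = 0; j < n; j++)
    {
      for (i = 0; i < 4; i++)
        v[i] = 0.0;
      v += 4;
    }
}

/* The same code with the inner loop completely unrolled by hand:
   the four stores are straight-line code, but still inside a loop.  */
void
zero_nested_unrolled (double *v, size_t n)
{
  size_t j;

  for (j = 0; j < n; j++)
    {
      v[0] = 0.0;
      v[1] = 0.0;
      v[2] = 0.0;
      v[3] = 0.0;
      v += 4;
    }
}
```

Both functions store to the same 4*n elements, so the unrolled body is the
same group of four adjacent stores as in the v[4*i] example, only written
with a moving pointer instead of a scaled index.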

Ira

