Dorit Nuzman/Haifa/IBM wrote on 05/02/2007 21:13:40:
> Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:59:00:
>
> > On Mon, 5 Feb 2007, Paolo Bonzini wrote:
> >
> > >
> > > > As we also only vectorize innermost loops I believe doing a
> > > > complete unrolling pass early will help in general (I pushed
> > > > for this some time ago).
> > > >
> > > > Thoughts?
> > >
> > > It might also hurt, though, since we don't have a basic block vectorizer.
> > > IIUC the vectorizer is able to turn
> > >
> > >   for (i = 0; i < 4; i++)
> > >     v[i] = 0.0;
> > >
> > > into
> > >
> > >   *(vector double *)v = (vector double){0.0, 0.0, 0.0, 0.0};
> >
> > That's true.
>
> That's going to change once this project goes in: "(3.2) Straight-line
> code vectorization" from
> http://gcc.gnu.org/wiki/AutovectBranchOptimizations. In fact, I think in
> autovect-branch, if you unroll the above loop it should get vectorized
> already. Ira - is that really the case?

The completely unrolled loop will not get vectorized because the code will
not be inside any loop (and our SLP implementation will focus, at least as
a first step, on loops).

The following will get vectorized (without permutation on autovect branch,
and with redundant permutation on mainline):

  for (i = 0; i < n; i++)
    {
      v[4*i] = 0.0;
      v[4*i + 1] = 0.0;
      v[4*i + 2] = 0.0;
      v[4*i + 3] = 0.0;
    }

The original completely unrolled loop will get vectorized if it is
encapsulated in an outer-loop, like so:

  for (j=0; j<n; j++)
    {
      for (i = 0; i < 4; i++)
        v[i] = 0.0;
      v += 4;
    }

Ira
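
For anyone who wants to try the three loop shapes discussed in this thread, here is a minimal self-contained sketch. The function names (fill_unrolled, fill_grouped, fill_nested, all_zero) are hypothetical and chosen only for illustration. The comments repeat what the message above says about the autovect branch; whether a given compiler actually vectorizes each case depends on its version and flags, so treat this as a test input rather than a statement of behaviour.

```c
#include <stddef.h>

/* Case 1: the completely unrolled loop. The four stores are straight-line
   code with no enclosing loop, which per the message above is not handled
   by a loop-focused SLP implementation. */
void fill_unrolled(double *v)
{
    v[0] = 0.0;
    v[1] = 0.0;
    v[2] = 0.0;
    v[3] = 0.0;
}

/* Case 2: four adjacent stores per iteration, inside a loop. This is the
   shape the message says will get vectorized. */
void fill_grouped(double *v, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        v[4 * i]     = 0.0;
        v[4 * i + 1] = 0.0;
        v[4 * i + 2] = 0.0;
        v[4 * i + 3] = 0.0;
    }
}

/* Case 3: the original 4-iteration loop wrapped in an outer loop. Once the
   inner loop is completely unrolled, the stores still sit inside a loop. */
void fill_nested(double *v, size_t n)
{
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < 4; i++)
            v[i] = 0.0;
        v += 4;
    }
}

/* Helper (assumption, not from the thread): returns 1 if the first n
   elements are all zero, 0 otherwise. */
int all_zero(const double *v, size_t n)
{
    for (size_t k = 0; k < n; k++)
        if (v[k] != 0.0)
            return 0;
    return 1;
}
```

Cases 2 and 3 write the same 4*n elements; they differ only in how the stores are expressed. Compiling with -O2 -ftree-vectorize and a vectorizer report option (for example -fopt-info-vec on recent GCC releases) should show which of the three a particular compiler handles.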