> Hi,
>
> while playing with gcc-4.3 rev. 121994, i encountered a problem with
> autovectorisation.
>
> In the following simple code, the inner loop of c1() becomes vectorized as
> expected, but the inner loop of c2() not because of
>
> test2.c:15: note: ===== analyze_loop_nest =====
> test2.c:15: note: === vect_analyze_loop_form ===
> test2.c:15: note: === get_loop_niters ===
> test2.c:15: note: ==> get_loop_niters:(unsigned int) n_6(D)
> test2.c:15: note: Symbolic number of iterations is (unsigned int) n_6(D)
> test2.c:15: note: === vect_analyze_data_refs ===
>
> test2.c:15: note: get vectype with 4 units of type float
> test2.c:15: note: vectype: vector float
> test2.c:15: note: not vectorized: unhandled data-ref
> test2.c:15: note: bad data references.
>
> (even with -ftree-vectorizer-verbose=99 there is no more info than that)
>
> The only difference between the two functions is that in c1() static
> arrays are used and in c2() pointer to arrays.. Is this a problem with
> aliasing/alignment of pointer parameters or a vectorizer bug? And is there
> a work-around?
>

The first problem is that a[i] is invariant in the inner loop, and the
vectorizer only works with data references that have a nice evolution in
the loop (i.e. that advance between iterations of the loop). In other
words, it assumes that invariant accesses have been moved out of the loop
before vectorization:

"
ptr is loop invariant.
create_data_ref: failed to create a dr for *pretmp.27_46
"

The workaround for that is to manually move the invariant a[i] out of the
inner loop, put it into a temporary, and use that temporary in the inner
loop.

The second problem is aliasing: the vectorizer can't tell that the write
through pointer o doesn't overlap with the read through pointer b. The
workaround for that is to add the "__restrict" qualifier to the
declaration of the pointers.

To fix the first problem in the compiler, we can teach the vectorizer to
work with invariant datarefs. This is easy to do, but I think the right
solution is to enhance the loop-invariant-motion pass to use an aliasing
oracle that would tell it that the invariant load can be safely moved out
of the loop (given that the pointers are __restrict qualified). I think
such a solution is in the works? Do people think it's worthwhile to work
around this invariant-motion issue in the vectorizer?

The second problem should be fixed in the near future: a patch that adds
support for run-time aliasing checks is in the works (should be ready
within a week or so, I think).

dorit

> Best regards,
> Thomas
>
> --
>
> float a[256],b[16],o[271];
>
> void c1()
> {
>     for(int i=0;i<256;i++) {
>         for(int j=0;j<16;j++) {
>             o[i+j]+=a[i]*b[j];
>         }
>     }
> }
>
> void c2(int m, int n, float *a, float *b, float *o)
> {
>     for(int i=0;i<m;i++) {
>         for(int j=0;j<n;j++) {
>             o[i+j]+=a[i]*b[j];
>         }
>     }
> }
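
For reference, here is a minimal sketch of c2() with both workarounds from the reply applied by hand. The function name c2_restrict and the temporary ai are made up for illustration, and the const qualifiers are my addition. It assumes the caller really does pass an o that overlaps neither a nor b, since that is what "__restrict" promises to the compiler. Whether the inner loop then gets vectorized depends on the compiler revision, so it is worth re-checking the -ftree-vectorizer-verbose output rather than taking it for granted.

```c
/* Sketch only: c2() from the original report with both manual
 * workarounds applied. The name c2_restrict and the temporary ai
 * are hypothetical, not part of the original code. */
void c2_restrict(int m, int n,
                 const float *__restrict a,
                 const float *__restrict b,
                 float *__restrict o)
{
    for (int i = 0; i < m; i++) {
        /* Workaround 1: load the inner-loop-invariant a[i] once,
         * outside the inner loop, and use the temporary inside. */
        const float ai = a[i];

        for (int j = 0; j < n; j++) {
            /* Workaround 2: with __restrict on the parameters, the
             * store to o[i+j] is declared not to alias the read b[j]. */
            o[i + j] += ai * b[j];
        }
    }
}
```

The result is the same as for the original c2() as long as the arrays do not overlap; if they do overlap, the "__restrict" promise is broken and the behaviour is undefined.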