On 2/15/07, Dorit Nuzman <[EMAIL PROTECTED]> wrote:
> Hi,
>
> while playing with gcc-4.3 rev. 121994, I encountered a problem with
> autovectorisation.
>
> In the following simple code, the inner loop of c1() becomes vectorized as
> expected, but the inner loop of c2() not because of
>
>    test2.c:15: note: ===== analyze_loop_nest =====
>    test2.c:15: note: === vect_analyze_loop_form ===
>    test2.c:15: note: === get_loop_niters ===
>    test2.c:15: note: ==> get_loop_niters:(unsigned int) n_6(D)
>    test2.c:15: note: Symbolic number of iterations is (unsigned int) n_6(D)
>    test2.c:15: note: === vect_analyze_data_refs ===
>
>    test2.c:15: note: get vectype with 4 units of type float
>    test2.c:15: note: vectype: vector float
>    test2.c:15: note: not vectorized: unhandled data-ref
>    test2.c:15: note: bad data references.
>
> (even with -ftree-vectorizer-verbose=99 there is no more info than that)
>
> The only difference between the two functions is that in c1() static
> arrays are used and in c2() pointers to arrays. Is this a problem with
> aliasing/alignment of pointer parameters or a vectorizer bug? And is there
> a work-around?
>

The first problem is that a[i] is invariant in the inner loop, and the
vectorizer only handles data references that have a nice evolution in the
loop (i.e. that advance between iterations of the loop). In other words, it
assumes that invariant accesses have been moved out of the loop before
vectorization:

"
ptr is loop invariant.

create_data_ref: failed to create a dr for *pretmp.27_46
"

The workaround for that is to manually move the invariant a[i] out of the
inner loop, put it into a temporary, and use that temporary in the
inner loop.
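
A minimal sketch of that workaround, for illustration only. The original
test2.c is not shown in this thread, so scale_row() and its parameters are
hypothetical stand-ins for the inner loop of c2(); only the names a, b, o
and the invariant a[i] come from the discussion above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the inner loop of c2(): multiply each
   element of b by a[i] and store the result through o. */
void scale_row(float *o, const float *a, const float *b, size_t i, size_t n)
{
    size_t j;
    /* a[i] does not depend on j, so load it once into a temporary
       instead of re-reading it on every inner-loop iteration. */
    const float ai = a[i];

    for (j = 0; j < n; j++)
        o[j] = ai * b[j];
}
```

With the load hoisted by hand, the only memory accesses left in the loop
are o[j] and b[j], both of which advance with j.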

The second problem is aliasing - the vectorizer can't tell that the write
through pointer o doesn't overlap with the read through pointer b.

The workaround for that is to add the "__restrict" qualifier to the
declaration of the pointers.
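
As a rough sketch of what that looks like (again with a hypothetical
function, since test2.c is not shown here; scale_row_restrict() and its
signature are assumptions), the pointer parameters are declared with
__restrict, which is a promise by the caller that the objects they point to
do not overlap:

```c
#include <stddef.h>

/* Hypothetical variant of the c2() inner loop. __restrict tells the
   compiler that o, a and b never refer to overlapping memory, so a
   store through o cannot change what is read through a or b. */
void scale_row_restrict(float *__restrict o,
                        const float *__restrict a,
                        const float *__restrict b,
                        size_t i, size_t n)
{
    size_t j;
    const float ai = a[i];   /* invariant load, kept out of the loop */

    for (j = 0; j < n; j++)
        o[j] = ai * b[j];
}
```

Calling such a function with overlapping buffers is undefined behaviour, so
the qualifier is only appropriate when the caller can guarantee that. Whether
the loop is then actually vectorized still depends on the compiler revision
and flags.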

To fix the first problem in the compiler, we can teach the vectorizer to
work with invariant datarefs. This is easy to do, but I think the right
solution is to enhance the loop-invariant-motion pass to use an aliasing
oracle that would tell it that the invariant load can be safely moved out of
the loop (given that the pointers are __restrict qualified). I think such a
solution is in the works?

It is.

Do people think it's worthwhile to work around this invariant-motion issue
in the vectorizer?

Probably not; it's just going to make your code more complex for no real gain.
