http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51693

--- Comment #4 from Michael Zolotukhin <michael.v.zolotukhin at gmail dot com> 
2011-12-28 13:01:51 UTC ---
(In reply to comment #2)
> > I though that if {vect_aligned_arrays} isn't true, than arrays could
> > be aligned even after peeling - that's why I added such check.
> 
> Sorry, I don't understand this sentence. What do you mean by aligned after
> peeling? Could you please explain what exactly happens on AVX (a dump file
> with -fdump-tree-vect-details would be the best thing).
Sorry, I mistyped. I meant "then arrays couldn't be aligned", at least
without some runtime checks. I.e. we can't peel a compile-time-known number
of iterations and be sure that the array becomes aligned.

E.g., if we have an array IA of ints aligned to 16 bytes, and we have the access
IA[i+3], then peeling one iteration will guarantee alignment to 16 bytes. But
we don't know how many iterations need to be peeled to reach alignment to
32 bytes (as needed for AVX operations).
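
To illustrate the IA[i+3] case, here is a minimal sketch of the arithmetic. It
is not GCC code: peel_count is a hypothetical helper, and it assumes 4-byte
ints and a power-of-two vector size. For a 16-byte target the answer is the
same for every 16-byte-aligned base address, while for a 32-byte target it
depends on whether the base happens to be 32-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC): number of scalar iterations to
   peel so that the access base[i + offset] starts on a vec_bytes boundary.
   base_addr is the byte address of the array, elem_size the element size. */
unsigned peel_count(uintptr_t base_addr, size_t elem_size,
                    size_t offset, size_t vec_bytes)
{
    /* Assumptions: power-of-two vector size, elements tile the vector,
       and the array itself is at least element-aligned. */
    assert(vec_bytes != 0 && (vec_bytes & (vec_bytes - 1)) == 0);
    assert(elem_size != 0 && vec_bytes % elem_size == 0);
    assert(base_addr % elem_size == 0);

    /* Address touched by the first iteration (i == 0). */
    uintptr_t first = base_addr + offset * elem_size;
    size_t misalign = (size_t)(first & (vec_bytes - 1));

    /* Bytes left to the next boundary, converted to iterations. */
    return (unsigned)(((vec_bytes - misalign) & (vec_bytes - 1)) / elem_size);
}
```

With a base of 0x1000 or 0x1010 (both 16-byte aligned), peel_count(base, 4, 3,
16) gives the same value, so the compiler can fix it at compile time. With
vec_bytes set to 32 the two bases give different values, which is why a runtime
check would be needed.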

> > Unfortunately, I can't reproduce these fails, as I have no PowerPC. By
> > the way, if arrays aren't aligned on Power, why does GCC produce such
> > messages - does it really try to peel something? 
> 
> The arrays in the tests are aligned. I said that I think that we can't promise
> that all the arrays are vector aligned on power. BTW, we can peel for unknown
> misalignment as well.

In this case we shouldn't add Power to vect_aligned_arrays, I guess.

> > Maybe we should just
> > refine the check?
> > Anyway, if everything is ok with the tests (in original version) and
> > with gcc itself - we could check not for vect_aligned_arrays, but for
> > AVX. Please check
> > http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01600.html and the
> > attached to that letter patch.
> 
> I think that everything was ok, but I don't think that using
> vect_sizes_32B_16B is a good idea. I would really like to see an AVX vect
> dump for eg. vect-peel-3.c.

In vect-peel-3.c we actually assume that the vector length is 16 bytes. Here is
the loop body:
      suma += ia[i];
      sumb += ib[i+5];
      sumc += ic[i+1];
When the vector size is 16 bytes, peeling can make two of the three accesses
aligned (peeling three iterations aligns ib[i+5] and ic[i+1]), but when the
vector size is 32 bytes that's impossible. That's why using
vect_sizes_32B_16B might be correct here.
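
The claim about vect-peel-3.c can be checked with a small brute-force sketch.
This is only an illustration, not what the vectorizer does internally:
max_aligned_accesses is a hypothetical helper, and it assumes 4-byte ints and
that ia, ib and ic themselves start on a vector boundary, so only the index
offsets 0, 5 and 1 matter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of GCC): try every peel count from 0 to
   elems_per_vector - 1 and return the largest number of accesses that a
   single peel count makes aligned. offsets[k] is the constant added to
   the loop index in access k, measured in elements. */
unsigned max_aligned_accesses(const unsigned *offsets, size_t n,
                              unsigned elems_per_vector)
{
    assert(elems_per_vector != 0);

    unsigned best = 0;
    for (unsigned peel = 0; peel < elems_per_vector; peel++) {
        unsigned aligned = 0;
        for (size_t k = 0; k < n; k++) {
            /* After peeling, the first vector iteration touches
               element offsets[k] + peel of array k. */
            if ((offsets[k] + peel) % elems_per_vector == 0)
                aligned++;
        }
        if (aligned > best)
            best = aligned;
    }
    return best;
}
```

Calling it with the offsets {0, 5, 1} and 4 elements per vector (16 bytes)
reports that two accesses can be aligned at once; with 8 elements per vector
(32 bytes) no peel count aligns more than one.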

Also, I uploaded the dump you asked for.

Michael

> Thanks,
> Ira
>
