https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101579

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Jakub Jelinek from comment #2)
> As
> typedef unsigned char V __attribute__((vector_size (32)));
> 
> V
> foo (V x)
> {
>   return __builtin_shufflevector (x, x, 0, 1, 2, 0, 5, 1, 0, 1, 3, 2, 3, 0, 4, 3, 1, 2,
>                                   2, 0, 4, 2, 3, 1, 1, 2, 3, 4, 1, 1, 0, 0, 5, 2);
> }
> 
> V
> bar (V x)
> {
>   return __builtin_shufflevector (x, x, 0, 3, 3, 3, 3, 4, 5, 0, 1, 5, 2, 1, 0, 1, 1, 2,
>                                   3, 2, 0, 5, 4, 5, 1, 0, 1, 4, 4, 3, 4, 5, 2, 0);
> }
> with -O2 -mavx2 is handled, I'd say this is veclower task to determine that
> the particular permutation could be cheaply implemented with two
> permutations of half-sized vectors and ask the backend if it supports those.
> Of course there can be other permutations that can't be implemented that way
> easily and might e.g. need more half-sized permutations...
It looks to me that the middle end should be able to transform a 64-byte vector
shuffle into a 32-byte vector shuffle when data flow analysis shows that the
upper part of the vector is never used.
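
To make the idea concrete, here is a rough scalar sketch of the check such a
transform would need. It is not GCC code: narrow_shuffle_indices and
shuffle_ref are hypothetical names, and the sketch assumes that only the low
half of the result is live and that indices 0..N-1 select from the first
operand and N..2N-1 from the second, as with __builtin_shufflevector.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code.  IDX describes a two-operand shuffle
   of N-element vectors.  Assuming only the low N/2 result elements are
   live, succeed if each of those elements selects from the low half of an
   operand, and store the equivalent indices for a shuffle of N/2-element
   vectors in NARROW.  */
static bool
narrow_shuffle_indices (const int *idx, int n, int *narrow)
{
  assert (n > 0 && n % 2 == 0);
  int half = n / 2;
  for (int i = 0; i < half; i++)
    {
      int sel = idx[i];
      if (sel >= 0 && sel < half)
        narrow[i] = sel;              /* Low half of operand 0.  */
      else if (sel >= n && sel < n + half)
        narrow[i] = sel - n + half;   /* Low half of operand 1.  */
      else
        return false;                 /* Needs upper-half data.  */
    }
  return true;
}

/* Scalar reference model of a two-operand shuffle, used only to compare
   the wide and the narrowed forms.  */
static void
shuffle_ref (const unsigned char *a, const unsigned char *b,
             const int *idx, int n, unsigned char *out)
{
  for (int i = 0; i < n; i++)
    out[i] = idx[i] < n ? a[idx[i]] : b[idx[i] - n];
}
```

When the check succeeds, the low half of the wide shuffle and the narrowed
shuffle on the low halves of the operands give the same elements, so the
backend would only need to support the half-sized permutation.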
