https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101579

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
As
typedef unsigned char V __attribute__((vector_size (32)));

V
foo (V x)
{
  return __builtin_shufflevector (x, x, 0, 1, 2, 0, 5, 1, 0, 1, 3, 2, 3, 0, 4,
                                  3, 1, 2, 2, 0, 4, 2, 3, 1, 1, 2, 3, 4, 1, 1,
                                  0, 0, 5, 2);
}

V
bar (V x)
{
  return __builtin_shufflevector (x, x, 0, 3, 3, 3, 3, 4, 5, 0, 1, 5, 2, 1, 0,
                                  1, 1, 2, 3, 2, 0, 5, 4, 5, 1, 0, 1, 4, 4, 3,
                                  4, 5, 2, 0);
}
with -O2 -mavx2 is handled, I'd say this is a veclower task to determine that
the particular permutation could be cheaply implemented with two permutations
of half-sized vectors and ask the backend if it supports those.
Of course there can be other permutations that can't be implemented that way
easily and might e.g. need more half-sized permutations...
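
The check suggested above might look roughly like the following. This is only a sketch in plain C, not GCC code: the name can_split_into_half_perms and the plain selector array are hypothetical, and it assumes that "cheap" means each half of the result reads from at most two half-sized source vectors, so that each half is one two-input half-sized permutation. Whether the backend actually supports those permutations would still have to be queried separately.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC.  SEL holds NELTS selector
   indices in the range [0, 2 * NELTS), as passed to
   __builtin_shufflevector (a, b, ...).  The two inputs are viewed as
   four half-sized vectors: 0 = a.lo, 1 = a.hi, 2 = b.lo, 3 = b.hi.
   Return true if each half of the result reads from at most two of
   those source halves (assumption: that is what makes the split
   into two half-sized permutations cheap).  */
bool
can_split_into_half_perms (const unsigned *sel, size_t nelts)
{
  if (nelts == 0 || nelts % 2 != 0)
    return false;

  size_t half = nelts / 2;
  for (size_t h = 0; h < 2; h++)
    {
      unsigned used = 0;	/* Bitmask of source halves referenced.  */
      for (size_t i = 0; i < half; i++)
	{
	  unsigned idx = sel[h * half + i];
	  if (idx >= 2 * nelts)
	    return false;	/* Selector index out of range.  */
	  used |= 1u << (idx / half);
	}

      /* Count how many distinct source halves this result half uses.  */
      int count = 0;
      for (unsigned m = used; m != 0; m &= m - 1)
	count++;
      if (count > 2)
	return false;
    }
  return true;
}
```

For the selectors of foo and bar above every index is below 16, so both result halves read only from the low half of x and the check succeeds. A selector whose low half mixes elements from all four source halves would be rejected and would need more half-sized permutations, as the comment notes.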