https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92243
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Even the lowering of VEC_PERM for this case:

#define vector __attribute__((vector_size(16)))

vector char f(vector char a)
{
  a = __builtin_shufflevector (a, a, 15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0);
  return a;
}
---- CUT ---
could also be done as just 2 (or 4) extractions followed by a bswap if the
target does not support the VEC_PERM.
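
For illustration only, here is a rough scalar sketch of what the 2-extraction variant amounts to for the 16-byte reversal above. The helper name reverse16_via_bswap is hypothetical and this is hand-written C, not what the vector lowering pass emits. It assumes a compiler that provides __builtin_bswap64 (GCC or Clang); the memcpy calls stand in for the element extractions and insertions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not GCC code): reverse 16 bytes by extracting
   two 64-bit halves, byte-swapping each, and storing them in swapped
   positions.  */
static void reverse16_via_bswap(const unsigned char in[16],
                                unsigned char out[16])
{
    uint64_t lo, hi;

    /* "Extraction": view the vector as two 64-bit elements.  */
    memcpy(&lo, in, 8);
    memcpy(&hi, in + 8, 8);

    /* bswap reverses the byte order inside each element.  */
    lo = __builtin_bswap64(lo);
    hi = __builtin_bswap64(hi);

    /* Swapping the two elements completes the full reversal.  */
    memcpy(out, &hi, 8);
    memcpy(out + 8, &lo, 8);
}
```

Because bswap reverses the in-memory byte order of each half, the result does not depend on the target's endianness. The 4-extraction variant would presumably do the same with four 32-bit elements and __builtin_bswap32.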