https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93720
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note the __builtin_shuffle expansion can be generalized to handle the case
where there are more than two elements but only one element is inserted:

#define vector __attribute__((vector_size(16) ))
vector float f(vector float a, vector float b)
{
  return __builtin_shuffle (a, b, (vector int){0, 4, 2, 3});
}

This function should generate:
        ins     v0.s[1], v1.s[0]
        ret

#define vector __attribute__((vector_size(16) ))
vector float f1(vector float a, vector float b)
{
  return __builtin_shuffle (b, a, (vector int){4, 0, 6, 7});
}

This function should also generate:
        ins     v0.s[1], v1.s[0]
        ret

Even this:

#define vector __attribute__((vector_size(16) ))
vector float f(vector float a, vector float b)
{
  return __builtin_shuffle (a, a, (vector int){0, 0, 2, 3});
}

Should generate:
        ins     v0.s[1], v0.s[0]
        ret
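
For illustration only, the pattern common to the three selectors above (every lane stays in place from one operand, except a single lane) could be recognized roughly as follows. This is a minimal sketch and not GCC's implementation: struct lane_insert and single_lane_insert_p are hypothetical names, and the selector is assumed to be already reduced to the range 0 .. 2*nelt-1, where indices below nelt pick from the first operand and the rest pick from the second.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a single-lane insert; not a GCC type. */
struct lane_insert {
    int base_op;   /* operand (0 or 1) that supplies all the other lanes */
    int dst_lane;  /* result lane that gets overwritten */
    int src_op;    /* operand (0 or 1) that supplies the inserted element */
    int src_lane;  /* lane within src_op */
};

/* Return true if the nelt-element selector sel keeps every lane of one
   operand in place except for exactly one lane, and describe that lane
   in *out.  Assumes 0 <= sel[i] < 2 * nelt. */
static bool single_lane_insert_p(const int *sel, size_t nelt,
                                 struct lane_insert *out)
{
    for (int base = 0; base < 2; base++) {
        size_t mismatches = 0, where = 0;

        /* Count lanes that differ from the identity selector of operand
           'base' (i for operand 0, i + nelt for operand 1). */
        for (size_t i = 0; i < nelt; i++) {
            if ((size_t) sel[i] != i + (size_t) base * nelt) {
                mismatches++;
                where = i;
            }
        }

        if (mismatches == 1) {
            out->base_op = base;
            out->dst_lane = (int) where;
            out->src_op = sel[where] >= (int) nelt;
            out->src_lane = sel[where] % (int) nelt;
            return true;
        }
    }
    return false;
}
```

Under these assumptions, the selector {4, 0, 6, 7} from f1 is reported with base_op 1 (the second argument, a), dst_lane 1, and the element taken from lane 0 of operand 0 (b), which corresponds to the same ins v0.s[1], v1.s[0] as in the first case.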