https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105354
--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
Created attachment 52893
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52893&action=edit
Patch pending for GCC13

Now, we can also generate

foo3:
.LFB3:
        .cfi_startproc
        movdqa  .LC1(%rip), %xmm2
        pslldq  $9, %xmm1
        psrldq  $5, %xmm0
        pand    %xmm2, %xmm0
        pandn   %xmm1, %xmm2
        por     %xmm2, %xmm0
        ret

.LC1:
        .byte   -1
        .byte   -1
        .byte   -1
        .byte   -1
        .byte   -1
        .byte   -1
        .byte   -1
        .byte   -1
        .byte   -1
        .byte   -1
        .byte   0
        .byte   0
        .byte   0
        .byte   0
        .byte   0
        .byte   0
        .align 16

for

v16qi
foo3 (v16qi a, v16qi b)
{
  return __builtin_shufflevector (a, b, 5, 6, 7, 8, 9, 10, 11, 12,
                                  13, 14, 17, 18, 19, 20, 21, 22);
}
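
To see why the shift/mask/or sequence above matches the shuffle, here is a small portable C model. It is only an illustrative sketch, not code from the patch: it uses plain byte arrays instead of the v16qi type, and the helper names (byte_shift_right, byte_shift_left, foo3_shift_blend, foo3_reference, foo3_models_agree) are made up for this example. byte_shift_right and byte_shift_left model the byte-wise behaviour of psrldq and pslldq, and the 0xFF/0x00 mask mirrors the ten -1 bytes and six 0 bytes of .LC1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 16 };

/* Model of psrldq: byte i of dst is byte i+n of src, zero-filled at the top. */
void byte_shift_right(uint8_t dst[VEC_BYTES], const uint8_t src[VEC_BYTES], int n)
{
    assert(n >= 0 && n <= VEC_BYTES);
    for (int i = 0; i < VEC_BYTES; i++)
        dst[i] = (i + n < VEC_BYTES) ? src[i + n] : 0;
}

/* Model of pslldq: byte i of dst is byte i-n of src, zero-filled at the bottom. */
void byte_shift_left(uint8_t dst[VEC_BYTES], const uint8_t src[VEC_BYTES], int n)
{
    assert(n >= 0 && n <= VEC_BYTES);
    for (int i = 0; i < VEC_BYTES; i++)
        dst[i] = (i >= n) ? src[i - n] : 0;
}

/* The generated sequence: psrldq $5 on a, pslldq $9 on b, then pand/pandn/por. */
void foo3_shift_blend(uint8_t out[VEC_BYTES],
                      const uint8_t a[VEC_BYTES], const uint8_t b[VEC_BYTES])
{
    uint8_t lo[VEC_BYTES], hi[VEC_BYTES];

    byte_shift_right(lo, a, 5);   /* lo[0..9]   = a[5..14] */
    byte_shift_left(hi, b, 9);    /* hi[10..15] = b[1..6]  */

    for (int i = 0; i < VEC_BYTES; i++) {
        uint8_t mask = (i < 10) ? 0xFF : 0x00;   /* same pattern as .LC1 */
        out[i] = (uint8_t)((lo[i] & mask) | (hi[i] & (uint8_t)~mask));
    }
}

/* Reference: apply the shuffle indices directly (0..15 pick a, 16..31 pick b). */
void foo3_reference(uint8_t out[VEC_BYTES],
                    const uint8_t a[VEC_BYTES], const uint8_t b[VEC_BYTES])
{
    static const int idx[VEC_BYTES] = { 5, 6, 7, 8, 9, 10, 11, 12,
                                        13, 14, 17, 18, 19, 20, 21, 22 };
    for (int i = 0; i < VEC_BYTES; i++)
        out[i] = (idx[i] < VEC_BYTES) ? a[idx[i]] : b[idx[i] - VEC_BYTES];
}

/* Returns 1 if both models agree on one sample input, 0 otherwise. */
int foo3_models_agree(void)
{
    uint8_t a[VEC_BYTES], b[VEC_BYTES], x[VEC_BYTES], y[VEC_BYTES];

    for (int i = 0; i < VEC_BYTES; i++) {
        a[i] = (uint8_t)i;
        b[i] = (uint8_t)(100 + i);
    }
    foo3_shift_blend(x, a, b);
    foo3_reference(y, a, b);
    return memcmp(x, y, VEC_BYTES) == 0;
}
```

Calling foo3_models_agree() from a test program checks the two models against each other for one input. The shift counts follow from the indices: the first selected element is a[5], hence the right shift by 5, and b[1] (index 17) has to land in byte 10, hence the left shift by 9.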