https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Looking at what other compilers emit for this, ICC seems to be completely
broken, it emits logical right shifts instead of arithmetic right shift, and
LLVM trunk emits for >> 63 what this patch emits, for >> 17 it emits
        vpsrad  $17, %xmm0, %xmm1
        vpsrlq  $17, %xmm0, %xmm0
        vpblendd        $10, %xmm1, %xmm0, %xmm0
instead of
        vpxor   %xmm1, %xmm1, %xmm1
        vpcmpgtq        %xmm0, %xmm1, %xmm1
        vpsrlq  $17, %xmm0, %xmm0
        vpsllq  $47, %xmm1, %xmm1
        vpor    %xmm1, %xmm0, %xmm0
the patch emits.  For >> 47 it emits:
        vpsrad  $31, %xmm0, %xmm1
        vpsrad  $15, %xmm0, %xmm0
        vpshufd $245, %xmm0, %xmm0
        vpblendd        $10, %xmm1, %xmm0, %xmm0
etc.

So, in summary, for >> 63 with SSE4.2 I think what the patch does looks best,
for >> 63 and SSE2 we can emit psrad $31 instead and permute the odd elements
into even ones (i.e. __builtin_shuffle ((v4si) x >> 31, { 1, 1, 3, 3 })).
For >> cst where cst < 32, do a psrad and psrlq by that cst and permute such
that we get the even SI elts from the psrlq result and odd from psrad result.
For >> 32, do a psrad $31 and permute to get the even SI elts from odd elts of
the source and odd SI elts from odd results of psrad $31.
For >> cst where cst > 32, do psrad $31 and psrad $(cst-32) and permute such
that even SI elts come from odd elts of the latter and odd elts come from odd
elts of the former.
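
The lane selection summarized above can be checked with a small scalar model. The following is only a sketch in portable C, not the patch itself. It treats one 64-bit element as an even (low) and an odd (high) SI element and rebuilds the arithmetic shift from 32-bit arithmetic shifts, a 64-bit logical shift, and a choice of lanes. The helper names (sra32, sra64_via_lanes, sra64_reference, sra64_selfcheck) are hypothetical and exist only for this illustration. Little-endian lane numbering, as on x86, is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Models psrad on one 32-bit lane: an arithmetic right shift written with
   unsigned arithmetic, so it does not depend on how signed >> behaves.
   Counts above 31 fill the lane with copies of the sign bit. */
static uint32_t sra32(uint32_t v, unsigned n)
{
    uint32_t sign = (v & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (n == 0)
        return v;
    if (n > 31)
        return sign;
    return (v >> n) | (sign << (32 - n));
}

/* Hypothetical helper: arithmetic >> of one 64-bit element, following the
   three constant-shift cases from the summary (cst < 32, == 32, > 32). */
static uint64_t sra64_via_lanes(uint64_t x, unsigned cst)
{
    uint32_t hi = (uint32_t)(x >> 32);  /* odd SI element of the source */
    uint32_t even, odd;

    assert(cst >= 1 && cst <= 63);
    if (cst < 32) {
        even = (uint32_t)(x >> cst);    /* even elt of the psrlq result */
        odd = sra32(hi, cst);           /* odd elt of the psrad result */
    } else if (cst == 32) {
        even = hi;                      /* odd elt of the source */
        odd = sra32(hi, 31);            /* odd elt of psrad $31 */
    } else {
        even = sra32(hi, cst - 32);     /* odd elt of psrad $(cst-32) */
        odd = sra32(hi, 31);            /* odd elt of psrad $31 */
    }
    return ((uint64_t)odd << 32) | even;
}

/* Reference 64-bit arithmetic shift, again using only unsigned operations:
   for a negative value, complement, shift logically, complement back. */
static uint64_t sra64_reference(uint64_t x, unsigned cst)
{
    return (x >> 63) ? ~(~x >> cst) : (x >> cst);
}

/* Compares both versions for a few sample values and every count 1..63. */
static void sra64_selfcheck(void)
{
    static const uint64_t samples[] = {
        0x0000000000000000ull, 0x0000000000000001ull,
        0x7FFFFFFFFFFFFFFFull, 0x8000000000000000ull,
        0xFFFFFFFFFFFFFFFFull, 0x123456789ABCDEF0ull,
        0xFEDCBA9876543210ull, 0x00000000FFFFFFFFull,
        0xFFFFFFFF00000000ull
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned cst = 1; cst <= 63; cst++)
            assert(sra64_via_lanes(samples[i], cst)
                   == sra64_reference(samples[i], cst));
}
```

With cst == 63 the last branch yields the sign mask in both lanes, which corresponds to the SSE2 variant above (psrad $31 followed by the { 1, 1, 3, 3 } shuffle). The SSE4.2 sequence with a compare against zero is not modelled here.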