https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95524
--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #0)
> icc has
> ---
> ashift(char __vector(16)):
>         vpsllw    xmm1, xmm0, 5                                 #9.16
>         vpand     xmm0, xmm1, XMMWORD PTR .L_2il0floatpacket.0[rip] #9.16
>         ret                                                     #9.16
> ashift2(char __vector(32), char __vector(32)):
>         vpsllw    ymm2, ymm0, 5                                 #15.16
>         vpand     ymm0, ymm2, YMMWORD PTR .L_2il0floatpacket.1[rip] #15.16
>         ret                                                     #15.16
> ashiftrt(char __vector(16)):
>         vpsrlw    xmm1, xmm0, 5                                 #21.16
>         vpand     xmm0, xmm1, XMMWORD PTR .L_2il0floatpacket.2[rip] #21.16
>         ret                                                     #21.16
> arshiftrt2(char __vector(32)):
>         vpsrlw    ymm1, ymm0, 5                                 #27.16
>         vpand     ymm0, ymm1, YMMWORD PTR .L_2il0floatpacket.3[rip] #27.16
>         ret                                                     #27.16
>         .long
>

ICC seems to generate incorrect instructions for ashiftrt, but clang gets it
right and is still better than gcc; refer to https://godbolt.org/z/ttV5xY
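
For illustration only, here is a minimal scalar sketch in C of why a word shift plus a mask works for the left and logical right byte shifts but is not sufficient for an arithmetic right shift. Each function models a single 16-bit lane holding two packed bytes. The function names are hypothetical, and the xor/subtract sign fixup is one well-known technique, not necessarily what any of the compilers above emit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: one 16-bit lane = two packed bytes.
   The shift count n is assumed to be in the range 0..7. */

/* Per-byte shift left: shift the whole word, then mask off the bits
   that leaked from the low byte into the high byte. */
static uint16_t ashift_bytes(uint16_t lane, unsigned n)
{
    uint8_t keep = (uint8_t)(0xFFu << n);          /* per-byte mask */
    uint16_t mask = (uint16_t)(keep * 0x0101u);    /* replicate to both bytes */
    return (uint16_t)((uint16_t)(lane << n) & mask);
}

/* Per-byte logical shift right: shift the whole word, then mask off the
   bits that leaked from the high byte into the low byte. This models a
   word shift right followed by an AND with a constant. */
static uint16_t lshiftrt_bytes(uint16_t lane, unsigned n)
{
    uint8_t keep = (uint8_t)(0xFFu >> n);
    uint16_t mask = (uint16_t)(keep * 0x0101u);
    return (uint16_t)((lane >> n) & mask);
}

/* Per-byte arithmetic shift right: start from the logical result, then
   sign-extend each byte with (x ^ s) - s, where s is the shifted
   position of the original sign bit. The subtraction is done per byte. */
static uint16_t ashiftrt_bytes(uint16_t lane, unsigned n)
{
    unsigned s = 0x80u >> n;
    uint16_t t = lshiftrt_bytes(lane, n);
    uint8_t lo = (uint8_t)(((t & 0xFFu) ^ s) - s);
    uint8_t hi = (uint8_t)((((unsigned)t >> 8) ^ s) - s);
    return (uint16_t)((hi << 8) | lo);
}
```

With the bytes 0x80 (-128) and 0x7F (127) packed into one lane and a shift count of 5, lshiftrt_bytes(0x807F, 5) gives 0x0403, whereas ashiftrt_bytes(0x807F, 5) gives 0xFC03, i.e. -4 and 3. The logical form alone loses the sign of the negative byte.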