On Mon, Apr 24, 2017 at 09:33:09AM +0200, Allan Sandfeld Jensen wrote:
> --- a/gcc/config/i386/avx2intrin.h
> +++ b/gcc/config/i386/avx2intrin.h
> @@ -667,7 +667,7 @@ extern __inline __m256i
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm256_slli_epi16 (__m256i __A, int __B)
>  {
> -  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
> +  return ((__B & 0xff) < 16) ? (__m256i)((__v16hi)__A << (__B & 0xff)) :
> _mm256_setzero_si256();
>  }

What is the advantage of doing that when you replace one operation
with several (&, <, ?:, <<)?

I'd say instead we should fold the builtins if in the gimple fold
target hook we see the shift count constant and can decide based on that.
Or we could use __builtin_constant_p (__B) to decide whether to use
the generic vector shifts or builtin, but that means larger IL.

	Jakub
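
For illustration, the __builtin_constant_p alternative could look roughly
like the sketch here. It is not the header code: v16hu, psllw_model and
slli_epi16_sketch are hypothetical names, and psllw_model is only a scalar
stand-in for __builtin_ia32_psllwi256 (assumed to yield zero for counts
above 15, as the quoted patch does) so that the example builds without
-mavx2. Vectors are passed by pointer purely to avoid ABI warnings when
AVX is not enabled.

```c
/* Sketch only; all names here are hypothetical, not from avx2intrin.h.  */
typedef unsigned short v16hu __attribute__ ((vector_size (32)));

/* Scalar stand-in for the opaque builtin: counts above 15 give zero.  */
static void
psllw_model (v16hu *dst, const v16hu *src, int count)
{
  int c = count & 0xff;
  for (int i = 0; i < 16; i++)
    (*dst)[i] = c < 16 ? (unsigned short) ((*src)[i] << c) : 0;
}

static inline void
slli_epi16_sketch (v16hu *dst, const v16hu *src, int count)
{
  if (__builtin_constant_p (count))
    {
      /* Count known at compile time: the range check can fold away,
         leaving either a generic vector shift or a zero vector.  */
      if ((count & 0xff) < 16)
        *dst = *src << (count & 0xff);
      else
        *dst = (v16hu) { 0 };
    }
  else
    /* Variable count: keep the single builtin-like operation.  */
    psllw_model (dst, src, count);
}
```

Both branches have to stay in the IL until __builtin_constant_p is
resolved, which is the "larger IL" cost; folding the builtin in the
target hook would avoid that.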