Hi!

On Thu, Nov 17, 2016 at 02:18:57PM -0800, H.J. Lu wrote:
> > Hi HJ, could you please commit it?
>
> Done.

I'm seeing lots of ICEs with this.  E.g. reduced:

typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
typedef unsigned char __mmask8;
typedef float __v4sf __attribute__ ((__vector_size__ (16)));
static inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_setzero_ps (void)
{
  return __extension__ (__m128){ 0.0f, 0.0f, 0.0f, 0.0f };
}
__m128
foo (__mmask8 __U, __m128 __A, __m128 __B, __m128 __C, __m128 __D, __m128 __E, __m128 *__F)
{
  return (__m128) __builtin_ia32_4fmaddss_mask ((__v4sf) __B,
						 (__v4sf) __C,
						 (__v4sf) __D,
						 (__v4sf) __E,
						 (__v4sf) __A,
						 (const __v4sf *) __F,
						 (__v4sf) _mm_setzero_ps (),
						 (__mmask8) __U);
}

ICEs with -mavx5124fmaps -O0, but succeeds with
-mavx512vl -mavx5124fmaps -O0 or -mavx5124fmaps -O2.

	fcn_mask = gen_avx5124fmaddps_4fmaddss_mask;
	fcn_maskz = gen_avx5124fmaddps_4fmaddss_maskz;
	msk_mov = gen_avx512vl_loadv4sf_mask;

looks wrong, while -mavx5124fmaps implies -mavx512f, it doesn't imply
-mavx512vl, so using -mavx512vl insns unconditionally is just wrong.
You need some fallback if avx512vl isn't available, perhaps use
avx512f 512-bit masked insns with bits in the mask forced to pick only the
ones you want?

Also, seems there are various formatting issues in the change,
e.g. shortly after s4fma_expand: there is indentation by 3 chars
relative to above { instead of 2,
	gen_rtx_SUBREG (V16SFmode, tmp, 0));
has extra 1 char indentation, some lines too long.

	Jakub
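
For illustration only, here is a rough scalar model of the fallback idea mentioned above: do the masked move at 512-bit width and force the upper mask bits to zero, so that only the four lanes of the 128-bit vector can ever be written.  This is not GCC expander code and not the eventual fix; the helper names masked_move_512 and masked_move_128_via_512 are made up for this sketch, and the 512-bit operation is modelled with a plain loop rather than real AVX-512F insns.

```c
#include <stdint.h>

#define LANES_512 16	/* floats in a 512-bit vector */
#define LANES_128 4	/* floats in a 128-bit vector */

/* Scalar model of a 512-bit merge-masked move: lane i of DST is
   overwritten with lane i of SRC only if bit i of MASK is set.
   (Hypothetical helper, stands in for an AVX-512F masked insn.)  */
static void
masked_move_512 (float dst[LANES_512], const float src[LANES_512],
		 uint16_t mask)
{
  for (int i = 0; i < LANES_512; i++)
    if (mask & (1u << i))
      dst[i] = src[i];
}

/* Hypothetical fallback when no 128-bit masked move is available:
   reuse the 512-bit operation, but clear every mask bit above the
   low LANES_128 ones so lanes 4..15 are never touched.  */
static void
masked_move_128_via_512 (float dst[LANES_512], const float src[LANES_512],
			 uint8_t mask)
{
  uint16_t low_lanes = (uint16_t) ((1u << LANES_128) - 1);	/* 0x000f */
  masked_move_512 (dst, src, (uint16_t) (mask & low_lanes));
}
```

With a zeroed dst and a mask of 0xff, only lanes 0..3 are copied; the stray upper mask bits have no effect, which is the property a real avx512f-only fallback would need.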