Re: [PATCH] Enable Intel AVX512_4FMAPS and AVX512_4VNNIW instructions

Jakub Jelinek Fri, 18 Nov 2016 11:41:33 -0800

Hi!

On Thu, Nov 17, 2016 at 02:18:57PM -0800, H.J. Lu wrote:
> > Hi HJ, could you please commit it?
> 
> Done.


I'm seeing lots of ICEs with this.

E.g. reduced:

typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
typedef unsigned char __mmask8;
typedef float __v4sf __attribute__ ((__vector_size__ (16)));

static inline __m128
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm_setzero_ps (void)
{
  return __extension__ (__m128){ 0.0f, 0.0f, 0.0f, 0.0f };
}

__m128
foo (__mmask8 __U, __m128 __A, __m128 __B, __m128 __C, __m128 __D,
     __m128 __E, __m128 *__F)
{
  return (__m128) __builtin_ia32_4fmaddss_mask ((__v4sf) __B,
      (__v4sf) __C,
      (__v4sf) __D,
      (__v4sf) __E,
      (__v4sf) __A,
      (const __v4sf *) __F,
      (__v4sf) _mm_setzero_ps (),
      (__mmask8) __U);
}

The above ICEs with -mavx5124fmaps -O0, but compiles fine with
-mavx512vl -mavx5124fmaps -O0 or with -mavx5124fmaps -O2.

            fcn_mask = gen_avx5124fmaddps_4fmaddss_mask;
            fcn_maskz = gen_avx5124fmaddps_4fmaddss_maskz;
            msk_mov   = gen_avx512vl_loadv4sf_mask;

looks wrong.  While -mavx5124fmaps implies -mavx512f, it doesn't
imply -mavx512vl, so using -mavx512vl insns unconditionally is incorrect.
You need some fallback for when avx512vl isn't available; perhaps use the
avx512f 512-bit masked insns with the mask bits forced so that only the
elements you want are picked?
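
To illustrate the kind of fallback I mean, here is a rough scalar C model.
It is only a sketch of the lane semantics, not GCC code and not a proposed
patch; the names model_mask_load and model_v4sf_load_via_v16sf are made up
for this example.  It assumes the usual masked-load behaviour where
masked-out lanes keep the pass-through value and are not read from memory,
and shows that a 16-lane masked load with the mask ANDed with 0xf yields the
same low 4 lanes as a 4-lane masked load.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a merge-masked vector load: lane I of DST comes from
   MEM when bit I of MASK is set, otherwise from SRC.  Masked-out lanes
   are never read from MEM (assumption: this mirrors the fault
   suppression of masked loads).  */
static void
model_mask_load (float *dst, const float *src, const float *mem,
                 uint16_t mask, size_t nlanes)
{
  for (size_t i = 0; i < nlanes; i++)
    dst[i] = ((mask >> i) & 1) ? mem[i] : src[i];
}

/* Hypothetical fallback: emulate the 4-lane masked load with a 16-lane
   one by clearing every mask bit above lane 3.  Only the low 4 lanes of
   the wide result are copied out; MEM needs to provide just 4 floats.  */
static void
model_v4sf_load_via_v16sf (float dst4[4], const float src4[4],
                           const float *mem, uint8_t mask)
{
  float wide_src[16] = { 0 };
  float wide_dst[16];

  for (size_t i = 0; i < 4; i++)
    wide_src[i] = src4[i];

  /* Force the mask so that only lanes 0..3 can be selected.  */
  model_mask_load (wide_dst, wide_src, mem, (uint16_t) (mask & 0xf), 16);

  for (size_t i = 0; i < 4; i++)
    dst4[i] = wide_dst[i];
}
```

For the scalar (ss) variants the same idea would presumably apply with the
mask restricted to bit 0 only; I haven't checked that against the patch.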

Also, there seem to be various formatting issues in the change,
e.g. shortly after s4fma_expand: the code is indented by 3 chars relative
to the { above instead of 2, gen_rtx_SUBREG (V16SFmode, tmp, 0)); is
indented by 1 extra char, and some lines are too long.

        Jakub
