On Thu, Oct 18, 2018 at 1:54 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On 10/18/18, Uros Bizjak <ubiz...@gmail.com> wrote:
> > On Thu, Oct 18, 2018 at 1:19 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> >> >> +(define_insn
> >> >> "*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_1"
> >> >> +  [(set (match_operand:VF_AVX512 0 "register_operand" "=v,v")
> >> >> +       (fma:VF_AVX512
> >> >> +         (match_operand:VF_AVX512 1 "nonimmediate_operand" "0,v")
> >> >> +         (match_operand:VF_AVX512 2 "nonimmediate_operand" "v,0")
> >> >> +         (vec_duplicate:VF_AVX512
> >> >> +           (match_operand:<ssescalarmode> 3 "nonimmediate_operand"
> >> >> "m,m"))))]
> >> >
> >> > Please note that having "nonimmediate_operand" predicate with "m"
> >> > constraint will force scalar value that lives in any register to
> >> > memory. So, scalar value will be pushed from either integer or SSE
> >> > register to memory, and will be broadcast to SSE register from here. I
> >> > guess this is not the optimal way, and we still want (eventual movq
> >> > from integer reg) + broadcast insn in this case.
> >> >
> >> > If this predicate is changed to "memory_operand", then only scalars
> >> > that live in memory will be considered.
> >>
> >> Using "memory_operand" causes:
> >>
> >> FAIL: gcc.target/i386/avx512f-fmadd-sf-zmm-7.c scan-assembler-times
> >> vfmadd...ps[ \\t]+[^\n\r]+\\{1to[1-8]+\\}, %zmm[0-9]+, %zmm0 1
> >> FAIL: gcc.target/i386/avx512f-fmadd-sf-zmm-7.c scan-assembler-not
> >> vbroadcastss[^\n]*%zmm[0-9]+
> >>
> >> __m512
> >> foo (__m512 x, __m512 y)
> >> {
> >>   return _mm512_fmadd_ps (x, y, _mm512_set1_ps (2.f));
> >> }
> >>
> >> Combiner:
> >>
> >> Failed to match this instruction:
> >> (set (reg:V16SF 91)
> >>     (fma:V16SF (reg/v:V16SF 85 [ x ])
> >>         (reg:V16SF 21 xmm1 [ y ])
> >>         (vec_duplicate:V16SF (reg:SF 88))))
> >
> > This is expected, there is no memory operand there. Can you check what
> > prevents combiner from propagating memory into the insn?
>
> Failed to match this instruction:
> (set (reg:V16SF 91)
>     (fma:V16SF (reg/v:V16SF 85 [ x ])
>         (reg:V16SF 21 xmm1 [ y ])
>         (const_vector:V16SF [
>                 (const_double:SF 2.0e+0 [0x0.8p+2]) repeated x16
>             ])))
>
> I don't know if we want to fix it.
>
> Here is the updated patch with "memory_operand".  OK for trunk?

LGTM.

Thanks,
Uros.

Reply via email to