On Thu, Sep 13, 2012 at 11:25:42AM -0700, Richard Henderson wrote:
> (2) It's not the best match if we were to extend these builtins to FMA4.
>     There we really do have 4 inputs.  Thus

How could you extend these builtins to FMA4, BTW?  Doesn't FMA4 zero the
high elements of the destination?  In that case you'd need to expand it as
a copy of the X operand register to DEST, a vfmadd{ss,sd} into a temporary
register, and then a vmovss/vmovsd to merge the low element into DEST.
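
To make the three-step expansion concrete, here is a rough scalar model in
plain C.  It is only a sketch: the v4sf struct and the model_* helpers are
hypothetical names invented for illustration, it assumes the preserved upper
elements come from the first multiplicand (as with the FMA3 scalar
intrinsics), and it uses an ordinary multiply-add, so the single rounding of
a real fused operation is not modelled.

```c
#include <assert.h>

/* Hypothetical stand-in for a 128-bit SSE register holding four floats. */
typedef struct { float e[4]; } v4sf;

/* Model of FMA4 vfmaddss: element 0 is a*b+c, the upper elements are zeroed.
   (Plain multiply-add; the fused single rounding is not modelled.)  */
static v4sf model_fma4_vfmaddss(v4sf a, v4sf b, v4sf c)
{
    v4sf r = { { 0.0f, 0.0f, 0.0f, 0.0f } };
    r.e[0] = a.e[0] * b.e[0] + c.e[0];
    return r;
}

/* Model of the register form of vmovss: element 0 comes from LOW,
   the upper elements come from UPPER.  */
static v4sf model_vmovss(v4sf upper, v4sf low)
{
    v4sf r = upper;
    r.e[0] = low.e[0];
    return r;
}

/* The expansion described above: copy the operand that supplies the upper
   elements to DEST, do the FMA4 operation into a temporary, then merge.
   Assumption: the upper elements are taken from A.  */
static v4sf model_builtin_via_fma4(v4sf a, v4sf b, v4sf c)
{
    v4sf dest = a;                               /* copy to DEST      */
    v4sf tmp = model_fma4_vfmaddss(a, b, c);     /* upper part zeroed */
    return model_vmovss(dest, tmp);              /* merge low element */
}

/* Small self-check: the low element is the multiply-add result and the
   upper elements survive from A instead of being zeroed.  */
static void model_self_check(void)
{
    v4sf a = { { 2.0f, 10.0f, 20.0f, 30.0f } };
    v4sf b = { { 3.0f, 0.0f, 0.0f, 0.0f } };
    v4sf c = { { 4.0f, 0.0f, 0.0f, 0.0f } };
    v4sf r = model_builtin_via_fma4(a, b, c);

    assert(r.e[0] == 2.0f * 3.0f + 4.0f);
    assert(r.e[1] == 10.0f && r.e[2] == 20.0f && r.e[3] == 30.0f);
}
```

The point of the model is only that a single FMA4 instruction cannot give the
vec_merge result by itself, because model_fma4_vfmaddss throws away the upper
elements that the builtin has to keep.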

> (define_insn "*fmai_fmadd_<mode>_4"
>   [(set (match_operand:VF_128 0 "register_operand" "=x,x")
>         (vec_merge:VF_128
>           (fma:VF_128
>             (match_operand:VF_128 1 "nonimmediate_operand" "%x,x")
>             (match_operand:VF_128 2 "nonimmediate_operand" " x,m")
>             (match_operand:VF_128 3 "nonimmediate_operand" "xm,x"))
>           (match_operand:VF_128 4 "register_operand" "0,0")
>           (const_int 1)))]
>   "TARGET_FMA4"
>   "vfmadd<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
>   [(set_attr "type" "ssemuladd")
>    (set_attr "mode" "<MODE>")])

        Jakub
