On Thu, Sep 13, 2012 at 11:25:42AM -0700, Richard Henderson wrote:
> (2) It's not the best match if we were to extend these builtins to FMA4.
> There we really do have 4 inputs. Thus

How could you extend these builtins to FMA4 BTW?  Doesn't FMA4 zero the
high elements?  In that case you'd need to expand it as a copy of the X
operand register to DEST, then a vfmadd{ss,sd} into a temp register,
followed by a vmovss/vmovsd instruction.

> (define_insn "*fmai_fmadd_<mode>_4"
>   [(set (match_operand:VF_128 0 "register_operand" "=x,x")
>         (vec_merge:VF_128
>           (fma:VF_128
>             (match_operand:VF_128 1 "nonimmediate_operand" "%x,x")
>             (match_operand:VF_128 2 "nonimmediate_operand" " x,m")
>             (match_operand:VF_128 3 "nonimmediate_operand" "xm,x"))
>           (match_operand:VF_128 4 "register_operand" "0,0")
>           (const_int 1)))]
>   "TARGET_FMA4"
>   "vfmadd<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
>   [(set_attr "type" "ssemuladd")
>    (set_attr "mode" "<MODE>")])

	Jakub
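
To make the three-step sequence concrete, here is a plain C model of it. It is
only a sketch: it assumes, as the question above suggests, that the FMA4 scalar
form zeroes the high elements, and it models register contents with a 4-float
struct rather than touching real hardware or GCC's expander. All names
(v4sf_model, model_fma4_vfmaddss, model_vmovss, model_expand_fmadd_ss) are
hypothetical, and the plain multiply-add does not reproduce the single rounding
of a real FMA.

```c
/* Stand-in for a 128-bit XMM register holding four floats (hypothetical). */
typedef struct { float e[4]; } v4sf_model;

/* Assumed FMA4 scalar behaviour: low element = a*b+c, high elements zeroed.
   The rounding of a true fused operation is not modelled here. */
static v4sf_model model_fma4_vfmaddss(v4sf_model a, v4sf_model b, v4sf_model c)
{
    v4sf_model r = { { a.e[0] * b.e[0] + c.e[0], 0.0f, 0.0f, 0.0f } };
    return r;
}

/* Three-operand vmovss: low element from `lo`, high elements from `hi`. */
static v4sf_model model_vmovss(v4sf_model hi, v4sf_model lo)
{
    v4sf_model r = hi;
    r.e[0] = lo.e[0];
    return r;
}

/* The expansion described above: the result keeps the high elements of x. */
static v4sf_model model_expand_fmadd_ss(v4sf_model x, v4sf_model a,
                                        v4sf_model b, v4sf_model c)
{
    v4sf_model dest = x;                           /* 1. copy X to DEST       */
    v4sf_model tmp = model_fma4_vfmaddss(a, b, c); /* 2. scalar FMA into temp */
    return model_vmovss(dest, tmp);                /* 3. merge low element    */
}
```

Without step 3 the caller would see the zeroed high elements of the temp
instead of the preserved ones from X, which is why a single FMA4 instruction
is not enough under this assumption.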