On 02/23/15 10:42, Steve Ellcey wrote:
No, I am thinking about the case where there are only non-fused multiply
add instructions available. To make sure I am using the right
terminology, I am using a non-fused multiply-add to mean a single fma
instruction that does '(a + (b * c))' but which rounds the result of '(b
* c)' before adding it to 'a' so that there is no difference in the
results between using this instruction and using individual add and mult
instructions. My understanding is that this is how the mips32r2 madd
instruction works.
Ahhh, never mind, nothing I said was relevant then. I misunderstood
completely :-)
In this case there seem to be two ways to have GCC generate the fma
instruction. One is the current method using combine_instructions with
an instruction defined as:
(define_insn "*madd" (set (0) (plus (mult (1) (2)) (3)))
  "madd.<fmt>\t%0,%3,%1,%2")
The other way would be to extend the convert_mult_to_fma so that instead
of:
  if (FLOAT_TYPE_P (type)
      && flag_fp_contract_mode == FP_CONTRACT_OFF)
    return false;
it has something like:
  if (FLOAT_TYPE_P (type)
      && flag_fp_contract_mode == FP_CONTRACT_OFF
      && !targetm.fma_does_rounding)
    return false;
And then define an instruction like:
(define_insn "fma" (set (0) (fma (1) (2) (3)))
  "madd.<fmt>\t%0,%3,%1,%2")
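
The effect of the extended condition proposed above can be modelled as a small standalone predicate. This is only a sketch under stated assumptions: fma_does_rounding stands in for the proposed target hook, which does not exist in GCC, and the enum and function name are local stand-ins invented here rather than GCC's own declarations.

```c
#include <stdbool.h>

/* Local stand-in for the contraction setting; only the values needed
   for this illustration.  */
enum contract_mode { CONTRACT_OFF, CONTRACT_FAST };

/* Hypothetical model of the proposed early-exit test.
   fma_does_rounding models the proposed (not existing) target hook:
   true when the target's madd rounds the product before the add.  */
bool may_convert_mult_to_fma(bool is_float_type,
                             enum contract_mode mode,
                             bool fma_does_rounding)
{
    /* Give up only when contraction is disabled AND using the
       instruction would change the result.  */
    if (is_float_type && mode == CONTRACT_OFF && !fma_does_rounding)
        return false;

    /* The real function would continue with further checks,
       which are omitted in this sketch.  */
    return true;
}
```

Under this model, a target whose madd rounds the intermediate product may still form the instruction with contraction off, since the result is the same as a separate mult and add.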
The question I have is whether one or the other of these two approaches
would be better at creating fma instructions (vs leaving mult/add
combinations) or might be preferable for some other reason.
The combiner pattern is useful in cases where we can't see the FMA at
gimple->rtl expansion time. But there may be cases where exposing the
FMA earlier is helpful as well.
So I think an argument could be easily made that we want to support both.
Jeff