Richard Sandiford wrote:

As far as madd goes, I think it would be better to either
(a) get combine to handle this situation or (b) get expand
to generate a fused multiply-add from the outset.

(b) sounds like it might be useful in its own right.  At the moment we
treat the generation of floating-point multiply-adds as an optimisation,
but in some applications it's critical not to round the intermediate
result.  (I don't know if there's a bugzilla entry about this.)
If we treated fused multiply-add as a primitive operation, we could
extend it to integer types too.  In this case we'd also need to
handle widening multiplications, but we already need to do that
for stand-alone multiplications.

Richard

While I agree with you philosophically, it feels like (b) might be quite a major task. A number of optimisation passes which currently recognise MUL and PLUS separately (e.g. loop strength reduction) would now need to be extended to handle the fused MULPLUS and MULSUB operators.

And although the reduction in instruction count due to your previous change is good, what is it as a percentage of the total? After all, it only helps code which uses 64-bit integer types with a 32-bit ABI, which is probably quite a small proportion of most real-life applications. For some algorithms, on the other hand, the ability to use MADD is absolutely critical to performance, and for them losing the ability to generate MADD is a significant backward step for the compiler.
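
To make the MADD case concrete, here is a minimal C sketch of the kind of loop in question: a dot product of 32-bit values with a 64-bit accumulator. The function name dot_product is made up for illustration. On MIPS32 the MADD instruction multiplies two 32-bit registers and adds the 64-bit product into HI/LO, so the loop body can ideally become a single instruction; whether a particular compiler version actually emits it depends on the patterns under discussion.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: 32x32->64 multiply-accumulate over two arrays. */
int64_t dot_product(const int32_t *a, const int32_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* Widen one operand before multiplying so the product is
           computed in 64 bits instead of being truncated to 32. */
        acc += (int64_t)a[i] * b[i];
    }
    return acc;
}
```

If the 64-bit addition is lowered to separate 32-bit operations before the multiply and the add can be matched together, each iteration needs a multiply, two moves out of HI/LO, and an add-with-carry sequence instead.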

How about, as a workaround until (b) sees the light of day, we reimplement adddi3 and subdi3 only (not the other DImode patterns), qualified by ISA_HAS_MADD_MSUB? Perhaps they could also be implemented more cleanly nowadays, using define_insn_and_split and/or a "#" template, to avoid generating multi-instruction assembler sequences.
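
For reference, this is a C model of what a split 64-bit add or subtract has to compute on a 32-bit target. It is only a sketch of the arithmetic, not the machine description pattern itself; the u64_pair type and the add_di/sub_di names are invented for illustration, and the instruction names in the comments are the MIPS instructions that roughly correspond to each step.

```c
#include <stdint.h>

/* Hypothetical representation of a DImode value as two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* 64-bit add built from 32-bit operations. */
u64_pair add_di(u64_pair x, u64_pair y)
{
    u64_pair r;
    r.lo = x.lo + y.lo;                /* addu: low halves, may wrap  */
    uint32_t carry = r.lo < x.lo;      /* sltu: wrapped means carry   */
    r.hi = x.hi + y.hi + carry;        /* addu, addu: high halves     */
    return r;
}

/* 64-bit subtract built from 32-bit operations. */
u64_pair sub_di(u64_pair x, u64_pair y)
{
    u64_pair r;
    uint32_t borrow = x.lo < y.lo;     /* sltu: low half will borrow  */
    r.lo = x.lo - y.lo;                /* subu: low halves            */
    r.hi = x.hi - y.hi - borrow;       /* subu, subu: high halves     */
    return r;
}
```

The idea behind the workaround, as I read it, is to keep the add as a single DImode operation long enough for it to be matched together with a preceding multiply, and to split it into a sequence like the above only afterwards.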

Nigel
