Andrew Pinski wrote:
> Seems like you should do something similar to the integer madd/msub
> instructions too (aarch64_mla is already correct but aarch64_mla_elt
> needs this too).

Integer madd/msub may benefit too. However, it wouldn't make a difference for a 3-operand mla, since the register allocator already forces the destination and accumulator into the same register.

Wilco
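
For anyone following along, here is a minimal C sketch of the two shapes being compared. The function names are hypothetical and not taken from the patch, and the instructions named in the comments are only what a compiler may choose on AArch64, depending on version and flags.

```c
#include <stdint.h>

/* Scalar multiply-add: acc + a * b.
   On AArch64 this may become "madd Xd, Xn, Xm, Xa", a 4-operand form in
   which the destination Xd and the accumulator Xa are separate registers. */
int64_t scalar_madd(int64_t acc, int64_t a, int64_t b)
{
    return acc + a * b;
}

/* Lane-wise multiply-accumulate over four 32-bit elements.
   A vectorizer may turn this into "mla Vd.4S, Vn.4S, Vm.4S", a 3-operand
   form in which Vd is both the accumulator input and the destination. */
void vector_mla(int32_t acc[4], const int32_t a[4], const int32_t b[4])
{
    for (int i = 0; i < 4; i++)
        acc[i] += a[i] * b[i];
}
```

In the scalar case the compiler is free to pick a destination different from the accumulator, which is where a preference for tying them could matter. In the vector case the encoding leaves no such choice.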