Ajit Agarwal <aagar...@linux.ibm.com> writes:
>>> Thanks a lot. Can I know what should we be doing with neg (fma)
>>> correctness failures with load fusion.
>>
>> I think it would involve:
>>
>> - describing lxvp and stxvp as unspec patterns, as I mentioned
>>   in the previous reply
>>
>> - making plain movoo split loads and stores into individual
>>   lxv and stxvs.  (Or, alternative, it could use lxvp and stxvp,
>>   but internally swap the registers after load and before store.)
>>   That is, movoo should load the lower-numbered register from the
>>   lower address and the higher-numbered register from the higher
>>   address, and likewise for stores.
>>
>
> Would you mind elaborating the above.
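
To make the ordering concrete, here is a toy C model of the two layouts.
It is only a sketch and not GCC or hardware code: the register file
`regs`, the helper names and the 16-byte register size are all
assumptions made for illustration, and the "swapped" variant simply
assumes the little-endian behaviour in which the last register gets the
first memory location.

```c
#include <assert.h>
#include <string.h>

#define VEC_BYTES 16   /* assumed size of one vector register */
#define NUM_REGS  64   /* assumed size of the toy register file */

typedef struct { unsigned char b[VEC_BYTES]; } vec_t;

/* Hypothetical register file, for illustration only.  */
static vec_t regs[NUM_REGS];

/* Ordering proposed for plain movoo: the lower-numbered register comes
   from the lower address, the higher-numbered one from the higher
   address.  */
static void load_pair_in_order(int r, const unsigned char *mem)
{
    assert(r >= 0 && r + 1 < NUM_REGS);
    memcpy(&regs[r],     mem,             VEC_BYTES);
    memcpy(&regs[r + 1], mem + VEC_BYTES, VEC_BYTES);
}

/* Assumed little-endian paired-load ordering: the last register gets
   the first memory location.  */
static void load_pair_swapped(int r, const unsigned char *mem)
{
    assert(r >= 0 && r + 1 < NUM_REGS);
    memcpy(&regs[r + 1], mem,             VEC_BYTES);
    memcpy(&regs[r],     mem + VEC_BYTES, VEC_BYTES);
}

/* The "swap the registers after load" alternative: exchange the two
   halves of the pair so that the result matches load_pair_in_order.  */
static void swap_pair(int r)
{
    assert(r >= 0 && r + 1 < NUM_REGS);
    vec_t tmp = regs[r];
    regs[r] = regs[r + 1];
    regs[r + 1] = tmp;
}
```

In this model, load_pair_swapped followed by swap_pair leaves the pair
in the same state as load_pair_in_order, which is the sense in which the
two alternatives are meant to be equivalent.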

I think movoo should use rs6000_split_multireg_move for all alternatives,
like movxo does.  movoo should split into 2 V1TI loads/stores and movxo
should split into 4 V1TI loads/stores.  lxvp and stxvp would be
independent patterns of the form:

  (set ...
       (unspec [...] UNSPEC_FOO))

---

rs6000_split_multireg_move has:

  /* The __vector_pair and __vector_quad modes are multi-register
     modes, so if we have to load or store the registers, we have to be
     careful to properly swap them if we're in little endian mode
     below.  This means the last register gets the first memory
     location.  We also need to be careful of using the right register
     numbers if we are splitting XO to OO.  */

But I don't see how this can work reliably if we allow the kind of
subregs that you want to create here.  The register order is the
opposite from the one that GCC expects.

This is more a question for the PowerPC maintainers though.

And this is one of the (admittedly many) times when I wish GCC's
subreg model was more like LLVM's. :)

Thanks,
Richard