I'm doing some experiments to get to know GCC better, and something is puzzling me.
I have defined an md file with a DFA and costs describing the fact that loads take a while (as do stores). Also, there is no memory-to-memory move, only memory to/from register.

The test program is basically a=b; c=d; e=f; g=h;

Sched1, as expected, turns this into four loads followed by four stores, exploiting the pipeline.

Then IRA kicks in. It shuffles the insns back into load/store, load/store pairs, essentially the source code order. It looks like it's doing that to reduce the number of registers used. Fair enough, but this makes the code less efficient. I don't see a way to tell IRA not to do this.

As it happens, there's a secondary reload involved: the loads are into one set of registers but the stores are from another, so reload adds a register-to-register move. Does that explain the behavior?

I tried changing the cover_classes, but that doesn't make a difference.

paul
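
P.S. For anyone who wants to reproduce this, here is a minimal sketch of the test case. The variable names are the ones from the description; the int type and the function name copy_pairs are just assumptions for illustration.

```c
/* Eight globals so that every assignment is a memory load followed by
   a memory store (there is no memory-to-memory move on this target). */
int a, b, c, d, e, f, g, h;

/* Hypothetical wrapper function; the name is only for illustration. */
void copy_pairs(void)
{
    a = b;   /* load b, store a */
    c = d;   /* load d, store c */
    e = f;   /* load f, store e */
    g = h;   /* load h, store g */
}
```

Compiling with something like -O2 -S -fdump-rtl-sched1 -fdump-rtl-ira should let the insn order after each pass be compared in the dump files.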