On Thu, May 21, 2015 at 07:39:16AM -0500, Segher Boessenkool wrote:
> On Thu, May 21, 2015 at 08:06:04PM +0930, Alan Modra wrote:
> > FAIL: gcc.target/powerpc/ti_math1.c scan-assembler-times adde 1
>
> It doesn't trigger on big-endian; what is different?

Register dependencies.  One of the arguments is in r4,r5, the return
value in r3,r4.  We calculate the low 64 bits first, which goes to r4
on big-endian, overlapping the argument.

> > Trying 18, 9 -> 24:
> > Failed to match this instruction:
> > (set (reg:DI 4 4 [+8 ])
> >     (plus:DI (plus:DI (reg:DI 5 5 [ val+8 ])
> >             (reg:DI 76 ca))
> >         (reg:DI 169 [+8 ])))
>
> For some reason it has the CA reg not last.

simplify-rtx.c:simplify_plus_minus_op_data_cmp

> I think we should add to
> the canonicalisation rules so that fixed regs sort after other regs.
> That requires a lot of testing.

What if you have two hard regs as above?  Which of reg 5 and reg 76
sorts first?  If they are sorted by register number, then ca appears
in the wrong place.  Reverse sorting hard regs might work for this
pattern on powerpc, but that seems an odd choice.  And if you say hard
regs ought to keep their original order in rtl like the above, then it
is no more difficult to keep all regs in their original order.

> > original costs 4 + 8 + 4 = 16
> > replacement costs 4 + 4 = 8
>
> Still need to fix the costs as well (but they work as-is; well enough
> that is).

Yes, I noticed that too.

> Are these copies guaranteed to (still) be in this basic block,
> after the passes before combine?  Did those passes do anything to
> prevent moving it?  I'm asking because it would be good to use the
> same conditions in that case.

Something I need to investigate.  As I said, the patch was just a
quick hack.

-- 
Alan Modra
Australia Development Lab, IBM