Hello,

Today's powerpc64-linux gcc has 2 extra failures with -mlra vs. reload (i.e. svn unpatched).
(I'm excluding guality failure differences here because there are too many of them that seem to fail at random after minimal changes anywhere in the compiler...).

Test results are posted here:
reload: http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg00128.html
lra: http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg00129.html

The new failures and total score are as follows (+=lra, -=reload):

+FAIL: gcc.target/powerpc/pr53199.c scan-assembler-times stwbrx 6
+FAIL: gcc.target/powerpc/pr58330.c scan-assembler-not stwbrx

                === gcc Summary ===

-# of expected passes           97887
-# of unexpected failures       536
+# of expected passes           97903
+# of unexpected failures       538
 # of unexpected successes      38
 # of expected failures         244
-# of unsupported tests         1910
+# of unsupported tests         1892

The failure of pr53199.c is because of different instruction selection for bswap. Test case is reduced to just one function:

/* { dg-options "-O2 -mcpu=power6 -mavoid-indexed-addresses" } */
long long
reg_reverse (long long x)
{
  return __builtin_bswap64 (x);
}

Reload left vs. LRA right:

reg_reverse:                          reg_reverse:
        srdi 8,3,32                 |         addi 8,1,-16
        rlwinm 7,3,8,0xffffffff     |         srdi 10,3,32
        rlwinm 9,8,8,0xffffffff     |         addi 9,8,4
        rlwimi 7,3,24,0,7           |         stwbrx 3,0,8
        rlwimi 7,3,24,16,23         |         stwbrx 10,0,9
        rlwimi 9,8,24,0,7           |         ld 3,-16(1)
        rlwimi 9,8,24,16,23         <
        sldi 7,7,32                 <
        or 7,7,9                    <
        mr 3,7                      <
        blr                                   blr

This same difference is responsible for the failure of pr58330.c which also uses __builtin_bswap64().

The difference in RTL for the test case is this (after reload vs. after LRA):

-   11: {%7:DI=bswap(%3:DI);clobber %8:DI;clobber %9:DI;clobber %10:DI;}
-   20: %3:DI=%7:DI
+   20: %8:DI=%1:DI-0x10
+   21: %8:DI=%8:DI  // stupid no-op move
+   11: {[%8:DI]=bswap(%3:DI);clobber %9:DI;clobber %10:DI;clobber scratch;}
+   19: %3:DI=[%1:DI-0x10]

So LRA believes going through memory is better than using a register, even though there are obviously plenty of registers available.
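For readers who want to convince themselves that the two sequences compute the same value, here is a rough sketch in portable C that models each of them. It is not taken from the GCC sources or the testsuite. All function names are made up for illustration, and the byte array only stands in for the big-endian stack slot at -16(1).

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t w, unsigned n)
{
    return (w << n) | (w >> (32 - n));
}

/* Register path (reload column): one rlwinm plus two rlwimi per word. */
static uint32_t bswap32_rotate(uint32_t w)
{
    uint32_t r = rotl32(w, 8);                    /* rlwinm r,w,8,0xffffffff */
    uint32_t t = rotl32(w, 24);                   /* rotated source for rlwimi */
    r = (r & ~0xff000000u) | (t & 0xff000000u);   /* rlwimi r,w,24,0,7 */
    r = (r & ~0x0000ff00u) | (t & 0x0000ff00u);   /* rlwimi r,w,24,16,23 */
    return r;
}

static uint64_t bswap64_registers(uint64_t x)
{
    uint32_t lo = bswap32_rotate((uint32_t)x);          /* r7 */
    uint32_t hi = bswap32_rotate((uint32_t)(x >> 32));  /* r9, from srdi 8,3,32 */
    return ((uint64_t)lo << 32) | hi;                   /* sldi 7,7,32; or 7,7,9 */
}

/* Memory path (LRA column): model of stwbrx, least significant byte first. */
static void store_word_byte_reversed(uint8_t *p, uint32_t w)
{
    p[0] = (uint8_t)w;
    p[1] = (uint8_t)(w >> 8);
    p[2] = (uint8_t)(w >> 16);
    p[3] = (uint8_t)(w >> 24);
}

/* Model of a big-endian ld: the byte at the lowest address is the MSB. */
static uint64_t load_doubleword_big_endian(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

static uint64_t bswap64_memory(uint64_t x)
{
    uint8_t slot[8];                                          /* stands in for -16(1) */
    store_word_byte_reversed(slot, (uint32_t)x);              /* stwbrx 3,0,8 */
    store_word_byte_reversed(slot + 4, (uint32_t)(x >> 32));  /* stwbrx 10,0,9 */
    return load_doubleword_big_endian(slot);                  /* ld 3,-16(1) */
}
```

Both functions map 0x0102030405060708 to 0x0807060504030201. The sketch says nothing about which sequence is faster on real hardware; it only shows that the store/reload variant is a valid bswap64.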
What LRA does:

      Creating newreg=129
Removing SCRATCH in insn #11 (nop 2)
      Creating newreg=130
Removing SCRATCH in insn #11 (nop 3)
      Creating newreg=131
Removing SCRATCH in insn #11 (nop 4)
// at this point the insn would be a bswapdi2_64bit:
// 11: {%3:DI=bswap(%3:DI);clobber r129;clobber r130;clobber r131;}
// cost calculation for the insn alternatives:
            0 Early clobber: reject++
            1 Non-pseudo reload: reject+=2
            1 Spill pseudo in memory: reject+=3
            2 Scratch win: reject+=2
            3 Scratch win: reject+=2
            4 Scratch win: reject+=2
          alt=0,overall=18,losers=1,rld_nregs=0
            0 Non-pseudo reload: reject+=2
            0 Spill pseudo in memory: reject+=3
            0 Non input pseudo reload: reject++
            2 Scratch win: reject+=2
            3 Scratch win: reject+=2
          alt=1,overall=16,losers=1,rld_nregs=0
            Staticly defined alt reject+=12
            0 Early clobber: reject++
            2 Scratch win: reject+=2
            3 Scratch win: reject+=2
            4 Scratch win: reject+=2
            0 Conflict early clobber reload: reject--
          alt=2,overall=24,losers=1,rld_nregs=0
         Choosing alt 1 in insn 11:  (0) Z  (1) r  (2) &b  (3) &r  (4) X {*bswapdi2_64bit}
          Change to class BASE_REGS for r129
          Change to class GENERAL_REGS for r130
      Creating newreg=132 from oldreg=3, assigning class NO_REGS to r132
          Change to class NO_REGS for r131
   11: {r132:DI=bswap(%3:DI);clobber r129:DI;clobber r130:DI;clobber r131:DI;}
      REG_UNUSED r131:DI
      REG_UNUSED r130:DI
      REG_UNUSED r129:DI

LRA selects alternative 1 (Z,r,&b,&r,X) which seems to be the right choice, from looking at the constraints. Reload selects alternative 2 which is slightly^2 discouraged: (??&r,r,&r,&r,&r).

Is this an improvement or a regression? If it's an improvement then these two test cases should be adjusted :-)

Ciao!
Steven
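A note on the numbers in the dump above: the three "overall" values can be reproduced by summing the reject adjustments listed for each alternative and adding a fixed cost per loser. The sketch below assumes that cost is 6. That value is inferred from the dump (it makes all three totals come out right), not looked up in the LRA sources, and the names are made up for illustration.

```c
#include <stddef.h>

/* Assumption: each "loser" adds a fixed cost of 6 to the overall score. */
enum { ASSUMED_LOSER_COST = 6 };

/* Sum the per-operand reject adjustments and add the assumed loser cost. */
static int overall_cost(const int *rejects, size_t n, int losers)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += rejects[i];
    return sum + losers * ASSUMED_LOSER_COST;
}

/* Reject adjustments copied from the dump, in the order they are printed. */
static const int alt0_rejects[] = { 1, 2, 3, 2, 2, 2 };       /* overall=18 */
static const int alt1_rejects[] = { 2, 3, 1, 2, 2 };          /* overall=16 */
static const int alt2_rejects[] = { 12, 1, 2, 2, 2, -1 };     /* overall=24 */
```

Under that assumption, the statically defined penalty of 12 for the "??" in alternative 2 is what decides the outcome: without it, alternative 2 would total 12 and beat alternative 1 at 16.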