http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54133

--- Comment #3 from Steven Bosscher <steven at gcc dot gnu.org> 2012-08-01 
10:13:32 UTC ---
With "GCC: (GNU) 4.8.0 20120731 (experimental) [trunk revision 190015]" the
dumps look slightly different. I'm using the -fdump-rtl-all-slim dumps (with a
local patch to dump SEQUENCEs also) and have this dump, showing the same
problem:

  BEFORE REGRENAME (.207r.ce3)   ----> AFTER REGRENAME (.208r.rnreg)
  154 r4:SI=r0:SI                  =   154 r4:SI=r0:SI
      REG_DEAD: r0:SI              =       REG_DEAD: r0:SI
  155 r5:SI=r1:SI                  =   155 r5:SI=r1:SI
      REG_DEAD: r1:SI              =       REG_DEAD: r1:SI
  144 r2:DF=[sp:SI+0x28]           |   144 r0:DF=[sp:SI+0x28]
   80 [sp:SI]=r2:DF                |    80 [sp:SI]=r0:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  145 r2:DF=[sp:SI+0x38]           =   145 r2:DF=[sp:SI+0x38]
   81 [sp:SI+0x8]=r2:DF            =    81 [sp:SI+0x8]=r2:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  146 r2:DF=[sp:SI+0x48]           |   146 r1:DF=[sp:SI+0x48]
   82 [sp:SI+0x10]=r2:DF           |    82 [sp:SI+0x10]=r1:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  156 r0:SI=r4:SI                  =   156 r0:SI=r4:SI
  157 r1:SI=r5:SI                  =   157 r1:SI=r5:SI
   84 r2:DF=[sp:SI+0x18]           =    84 r2:DF=[sp:SI+0x18]
   85 r0:DF=call [`bar'] argc:0x18 =    85 r0:DF=call [`bar'] argc:0x18
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
      REG_UNUSED: r0:DF            =       REG_UNUSED: r0:DF
  147 r2:DF=[sp:SI+0x30]           |   147 r0:DF=[sp:SI+0x30]
   86 [sp:SI]=r2:DF                |    86 [sp:SI]=r0:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  148 r2:DF=[sp:SI+0x40]           =   148 r2:DF=[sp:SI+0x40]
   87 [sp:SI+0x8]=r2:DF            =    87 [sp:SI+0x8]=r2:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  149 r2:DF=[sp:SI+0x50]           |   149 r1:DF=[sp:SI+0x50]
   88 [sp:SI+0x10]=r2:DF           |    88 [sp:SI+0x10]=r1:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  158 r0:SI=r4:SI                  =   158 r0:SI=r4:SI
      REG_DEAD: r4:SI              =       REG_DEAD: r4:SI
  159 r1:SI=r5:SI                  =   159 r1:SI=r5:SI
      REG_DEAD: r5:SI              =       REG_DEAD: r5:SI
   90 r2:DF=[sp:SI+0x20]           =    90 r2:DF=[sp:SI+0x20]
   91 r0:DF=call [`bar'] argc:0x18 =    91 r0:DF=call [`bar'] argc:0x18
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
      REG_UNUSED: r0:DF            =       REG_UNUSED: r0:DF

The sets of r4 and r5 are actually not used for anything other than
preserving/reloading the values of r0 and r1 for the second call to bar. To
understand how the sets of r4 and r5 come into existence to begin with, we need
to look at the pre-regalloc dumps, e.g. the .191r.asmcons dump:

   77 r0:DF=call [`__aeabi_ddiv'] argc:0
      REG_DEAD: r2:DF
      REG_EH_REGION: 0xffffffff80000000
   78 r177:DF=r0:DF
      REG_DEAD: r0:DF
   80 [sp:SI]=r166:DF
   81 [sp:SI+0x8]=r168:DF
   82 [sp:SI+0x10]=r170:DF
   83 r0:DF=r177:DF
   84 r2:DF=r164:DF
   85 r0:DF=call [`bar'] argc:0x18
      REG_DEAD: r2:DF
      REG_UNUSED: r0:DF
   86 [sp:SI]=r167:DF
   87 [sp:SI+0x8]=r169:DF
   88 [sp:SI+0x10]=r171:DF
   89 r0:DF=r177:DF
      REG_DEAD: r177:DF

In insn 78, r177 is used to save the call result from __aeabi_ddiv (which is
the variable "t" in the source code). IRA goes to work on this and finds:

Reg 177: local to bb 4 def dominates all uses has unique first use
Found def insn 78 for 177 to be not moveable
(insn 78 is not moveable because a hard register is involved in the SET_SRC)
   Insn 78(l0): point = 37
   Insn 89(l0): point = 17
   Insn 88(l0): point = 19

;; a5(r177,l0) conflicts:
;;   subobject 0: a1(r159,l0) a2(r172,l0) a3(r163,l0) a4(r165,w0,l0)
a4(r165,w1,l0) a6(r171,w0,l0) a6(r171,w1,l0) a7(r169,w0,l0) a7(r169,w1,l0)
a8(r167,w0,l0) a8(r167,w1,l0) a9(r164,w0,l0) a9(r164,w1,l0) a10(r170,w0,l0)
a10(r170,w1,l0) a11(r168,w0,l0) a11(r168,w1,l0) a12(r166,w0,l0) a12(r166,w1,l0)
a0(r175,l0)
;;     total conflict hard regs: 0-3 12
;;     conflict hard regs: 0-3 12

So r177 conflicts with r0 and can't be coalesced, and r177 ends up allocated to
the first available hard register, which is r4. In .198r.split2, the DFmode set
in insn 78 is split to set r4 and r5.
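
To make the "first available hard register" step concrete, here is a minimal
sketch in C. It is a toy model and not IRA's actual code: the names
first_free_hard_reg and conflicts_from_dump are hypothetical, hard registers
are modelled as bits in a mask, and a DFmode value is assumed to need two
consecutive core registers.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not GCC code: 16 core registers, one bit each.  */
#define NUM_HARD_REGS 16

/* Return the lowest register R such that R .. R+NREGS-1 are all absent
   from CONFLICTS, or -1 if there is no such run of registers.  */
static int
first_free_hard_reg (uint32_t conflicts, int nregs)
{
  assert (nregs > 0 && nregs <= NUM_HARD_REGS);
  for (int r = 0; r + nregs <= NUM_HARD_REGS; r++)
    {
      /* Mask with NREGS consecutive bits starting at bit R.  */
      uint32_t need = ((UINT32_C (1) << nregs) - 1) << r;
      if ((conflicts & need) == 0)
        return r;
    }
  return -1;
}

/* Mask for the line "conflict hard regs: 0-3 12" in the IRA dump.  */
static uint32_t
conflicts_from_dump (void)
{
  return UINT32_C (0xF) | (UINT32_C (1) << 12);
}
```

With the conflict set from the dump and nregs = 2, the search skips r0-r3 and
returns 4, matching the r4/r5 pair seen after .198r.split2. The real allocator
also weighs costs and register classes, which this sketch ignores.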

This issue can in theory be fixed before reload: you'd have to copy-propagate
the hard register that is the source of the set of r177 in insn 78 (r0) to the
use of r177 in insn 83. There is a risk that this won't work in general,
because you can't know before reload whether r0 will be needed for reloads in
the intervening insns, and you may end up increasing register pressure and
spoiling the code. Therefore you'd want to propagate as late as possible. That
would be the regmove pass.
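
The legality check behind such a propagation can be sketched as follows. This
is a toy model in C, not GCC's RTL API: struct toy_insn, insn_sets_reg and
can_propagate_copy are hypothetical names, register numbers below 16 stand
for hard registers, and a call is assumed to clobber r0. It deliberately does
not model the reload risk described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy insn: DEST = SRC, a store (DEST == -1), or a call.  */
struct toy_insn
{
  int uid;
  int dest;      /* register set by the insn, or -1 for none */
  int src;       /* register read by the insn, or -1 for none */
  bool is_call;  /* assumption: a call clobbers hard register 0 */
};

/* Simplified copy of the .191r.asmcons excerpt (insns 78 to 89).  */
static const struct toy_insn asmcons_insns[] = {
  { 78, 177, 0, false },   /* r177 = r0 */
  { 80, -1, 166, false },  /* [sp] = r166 */
  { 81, -1, 168, false },  /* [sp+0x8] = r168 */
  { 82, -1, 170, false },  /* [sp+0x10] = r170 */
  { 83, 0, 177, false },   /* r0 = r177 */
  { 84, 2, 164, false },   /* r2 = r164 */
  { 85, 0, -1, true },     /* r0 = call bar */
  { 86, -1, 167, false },  /* [sp] = r167 */
  { 87, -1, 169, false },  /* [sp+0x8] = r169 */
  { 88, -1, 171, false },  /* [sp+0x10] = r171 */
  { 89, 0, 177, false },   /* r0 = r177 */
};

/* True if INSN may change the value of register REG.  */
static bool
insn_sets_reg (const struct toy_insn *insn, int reg)
{
  return insn->dest == reg || (insn->is_call && reg == 0);
}

/* INSNS[DEF] is a copy "pseudo = hard reg".  Return true if the insn at
   index USE could read the hard register directly, i.e. no insn strictly
   between DEF and USE changes that hard register.  */
static bool
can_propagate_copy (const struct toy_insn *insns, size_t def, size_t use)
{
  assert (def < use);
  int hard = insns[def].src;
  for (size_t i = def + 1; i < use; i++)
    if (insn_sets_reg (&insns[i], hard))
      return false;
  return true;
}
```

In this model the use in insn 83 (index 4) can take r0 directly, because only
stores sit between it and insn 78. The use in insn 89 (index 10) cannot,
because insn 83 and the call in insn 85 redefine r0 in between.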

I'm trying something, will post later today...
