On Wed, Aug 26, 2009 at 10:47 PM, Peter Bergner<berg...@vnet.ibm.com> wrote:
> On Mon, 2009-08-24 at 23:56 +0000, Charles J. Tabony wrote:
>> I am seeing a performance regression on the port I maintain, and I would
>> appreciate some pointers.
>>
>> When I compile the following code
>>
>> void f(int *x, int *y){
>>   *x = 7;
>>   *y = 4;
>> }
>>
>> with GCC 4.3.2, I get the desired sequence of instructions.  I'll call it
>> sequence A:
>>
>> r0 = 7
>> r1 = 4
>> [x] = r0
>> [y] = r1
>>
>> When I compile the same code with GCC 4.4.0, I get a sequence that is lower
>> performance for my target machine.  I'll call it sequence B:
>>
>> r0 = 7
>> [x] = r0
>> r0 = 4
>> [y] = r0
>
> This is caused by update_equiv_regs() which IRA inherited from local-alloc.c.
> Although with gcc 4.3 and earlier, you don't see the problem, it is still
> there, because if you look at the 4.3 dumps, you will see that
> update_equiv_regs() unordered them for us.  What is saving us is that sched2
> reschedules them again for us in the order we want.  With 4.4, IRA happens
> to reuse the same register for both pseudos, so sched2 is hand tied and
> cannot schedule them back again for us.
>
> Looking at update_equiv_regs(), if I disable the replacement for regs
> that are local to one basic block (patch below) like it existed before
> John Wehle's patch way back in Oct 2000:
>
>   http://gcc.gnu.org/ml/gcc-patches/2000-09/msg00782.html
>
> then we get the ordering we want.  Does anyone know why John removed
> that part of the test in his patch?  Thoughts anyone?

Hmm.  I suppose if you conditionalize it on flag_schedule_insns it might
be an overall win.  Care to SPEC test that change?

Thanks,
Richard.

> Peter
>
>
> Index: ira.c
> ===================================================================
> --- ira.c	(revision 151111)
> +++ ira.c	(working copy)
> @@ -2510,6 +2510,7 @@ update_equiv_regs (void)
>  	     calls.  */
>
>  	  if (REG_N_REFS (regno) == 2
> +	      && REG_BASIC_BLOCK (regno) < NUM_FIXED_BLOCKS
>  	      && (rtx_equal_p (x, src)
>  		  || ! equiv_init_varies_p (src))
>  	      && NONJUMP_INSN_P (insn)