Peter Bergner wrote:
On Mon, 2009-08-24 at 23:56 +0000, Charles J. Tabony wrote:
I am seeing a performance regression on the port I maintain, and I would
appreciate some pointers.
When I compile the following code
void f(int *x, int *y){
    *x = 7;
    *y = 4;
}
with GCC 4.3.2, I get the desired sequence of instructions. I'll call it
sequence A:
r0 = 7
r1 = 4
[x] = r0
[y] = r1
When I compile the same code with GCC 4.4.0, I get a sequence that performs
worse on my target machine. I'll call it sequence B:
r0 = 7
[x] = r0
r0 = 4
[y] = r0
This is caused by update_equiv_regs(), which IRA inherited from local-alloc.c.
Although you don't see the problem with GCC 4.3 and earlier, it is still there:
if you look at the 4.3 dumps, you will see that update_equiv_regs()
reordered the instructions there too. What saves us is that sched2 schedules
them back into the order we want. With 4.4, IRA happens to reuse the same
register for both pseudos, so sched2's hands are tied: the second constant
load now overwrites the register the first store still reads, and sched2
cannot move it back up.
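
To make the sched2 point concrete, here is a minimal sketch in C of the
register dependence involved. It assumes a toy instruction model: struct insn,
reg_dependence() and can_swap() are hypothetical names for illustration only,
not GCC internals, and memory dependences are deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy instruction, not GCC's RTL: just enough to model
   "rN = constant" and "[addr] = rN" from sequences A and B. */
enum kind { LOAD_CONST, STORE };

struct insn {
    enum kind kind;
    int reg;   /* register written (LOAD_CONST) or read (STORE) */
};

/* True if `later` must stay after `earlier` because of a register
   dependence (read-after-write, write-after-read, write-after-write).
   Memory dependences are not modelled in this sketch. */
static bool reg_dependence(const struct insn *earlier,
                           const struct insn *later)
{
    if (earlier->reg != later->reg)
        return false;
    /* Two stores only read the register, so they do not conflict. */
    return !(earlier->kind == STORE && later->kind == STORE);
}

/* Could a scheduler hoist insns[i] above insns[i - 1]? */
static bool can_swap(const struct insn *insns, size_t i)
{
    assert(i > 0);
    return !reg_dependence(&insns[i - 1], &insns[i]);
}
```

With distinct registers (r0 = 7; [x] = r0; r1 = 4; [y] = r1), can_swap() on
the third instruction reports that "r1 = 4" may move above "[x] = r0", which
is the reordering sched2 performs in 4.3. With the register reused as in
sequence B, the same query reports a write-after-read dependence, so the
instruction has to stay where it is.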
Peter, thanks for the investigation.
We could run update_equiv_regs in a separate pass before the first insn
scheduling pass, as was done before IRA.
I'll try this and see how it works for mainstream targets (x86, ppc).
Looking at update_equiv_regs(), if I disable the replacement for regs
that are local to one basic block (patch below), as was the case before
John Wehle's patch way back in Oct 2000:
http://gcc.gnu.org/ml/gcc-patches/2000-09/msg00782.html
then we get the ordering we want. Does anyone know why John removed
that part of the test in his patch? Thoughts anyone?
I have no idea. But if it works well, we could use it.