Vladimir Makarov <[EMAIL PROTECTED]> writes:
> Richard Sandiford wrote:
> > As I mentioned in:
> >
> >   http://gcc.gnu.org/ml/gcc/2008-08/msg00476.html
> >
> > I'd been working on a MIPS IRA port, but got side-tracked by a wrong-code
> > regression.
> >
> > The regression was caused by incorrect EH liveness information.  I tried
> > to "fix" it by replacing the current note_stores-based forward scan with
> > a DF-based backwards scan, which gave us the corrected information for free.
> > However, even after Vlad fixed a pessimisation wrt calls and
> > DF_REF_MAY_CLOBBER, he saw a significant decrease in SPECFP performance.
> > It seemed to me (and I think to Vlad) that the differences were caused
> > by the "inverted" ALLOCNO_NUM ordering.  Before the patch, allocno
> > indexes typically increased as you went from the beginning of a region
> > to the end, whereas it was the other way around after the patch.
>
> I checked SPEC2000 again with your patches and there is no performance
> regression anymore, although I checked this with two patches which I
> submitted but which have not been approved yet.
Thanks for testing; it's good news that the combined patches show no
regressions.  I'm a bit reluctant to actually apply the first patch
though.  I only wrote it to demonstrate the point I was trying to make.
My understanding -- correct me if I'm wrong -- is that the ALLOCNO_NUM
comparisons are simply tie-breakers to stabilise the sort.  If the
choice between "index2 - index1" and "index1 - index2" makes a
significant difference, moving from one to the other just feels like
it's papering over a deeper problem.  At least until we know why the
difference matters (at which point we might decide there's no easy,
acceptable fix).

I agree that the lack of an xmm0 tie was the fundamental problem in the
particular example I gave.  Perhaps I didn't put enough emphasis on that
in the original message.  I was banging on about the copy heuristics
because, in the examples I looked at before, it was the copy heuristics
that were leading us astray for certain ALLOCNO_NUM orderings.  And a
similar problem did occur here, even if it wasn't the primary cause of
the extra copy.  I'll look through my logs for an example that doesn't
rely on hard register ties.  (As I said, it takes quite a while to go
through the detailed differences for a particular test, so I'm not sure
how quick I'll be.)

> > More fundamentally, C3 ends up having more influence than C2,
> > even though the copies ostensibly have the same weight.  In other words,
> > we have the following copy graph:
> >
> >     A58 <-> A63 <-> A64 <-> A60
> >
> > And the allocation of A63 has more influence over A64 than A60 does,
> > because of the transitive influence of A58.  But shouldn't this
> > transitive influence only kick in when A64 is allocated _between_
> > A58 and A63?  Why should the allocation of A58 matter to A64 if
> > A63 has already been allocated?
>
> I'll think about it and try to improve the heuristics.
> Indirect influence of an allocno's assignment on allocnos not connected
> directly to the given allocno was added to improve code generation.

I understand that, and I can see why it's important when the allocation
order is something like A58, A64, A60, A63.  All other things being
equal, it's better to allocate H58 to A64 in that case.  The case I'm
worried about is when the indirect costs via Ax apply _after_ Ax has
been allocated.

> From my point of view, the major problem in this case is that the tying
> of xmm0 with r64 in insn #16 is ignored.  By the way, such insns are not
> typical for x86 (I did not find any in SPEC2000).  First of all, cost
> calculation in regclass (used by the old register allocator) and in
> ira-costs.c (used by IRA) does not take such insns into account.  They
> take into account only move insns involving a hard register of a small
> class and a pseudo-register.  Ideally the cost for class SSE_FIRST_REG
> should be minimal, and the allocno for r64 should have SSE_FIRST_REG as
> the best reg class and SSE_REGS as an alternative register class.  It is
> not done.
>
> But the patch
>
>   http://gcc.gnu.org/ml/gcc-patches/2008-08/msg02279.html
>
> should solve the problem.  I missed only one thing for this test.
> Instead of `only_regs_p' in ira-conflicts.c::process_regs_for_copy,
> there should be `only_regs_p && insn != NULL_RTX' to distinguish
> move insns which were already taken into account in ira-costs.c.
> With the patch and the small change, r64 gets xmm0 and the problem
> disappears.  IRA will generate better code than the old RA.

Thanks!

I originally sent the patches as attachments, but they seemed to get
inlined.  I'll try again with gzipped files, just in case.

Richard
compare-order.diff.gz
Description: Binary data
use-df.diff.gz
Description: Binary data