Vladimir Makarov <[EMAIL PROTECTED]> writes:
> Richard Sandiford wrote:
> > As I mentioned in:
> >
> >   http://gcc.gnu.org/ml/gcc/2008-08/msg00476.html
> >
> > I'd been working on a MIPS IRA port, but got side-tracked by a wrong-code
> > regression.
> >
> > The regression was caused by incorrect EH liveness information.  I tried
> > to "fix" it by replacing the current note_stores-based forward scan with
> > a DF-based backwards scan, which gave us the corrected information for free.
> > However, even after Vlad fixed a pessimisation wrt calls and
> > DF_REF_MAY_CLOBBER, he saw a significant decrease in SPECFP performance.
> > It seemed to me (and I think to Vlad) that the differences were caused
> > by the "inverted" ALLOCNO_NUM ordering.  Before the patch, allocno
> > indexes typically increased as you went from the beginning of a region
> > to the end, whereas it was the other way around after the patch.
>
> I checked SPEC2000 again with your patches and there is no performance
> regression anymore, although I checked this with two patches which I
> submitted but which have not been approved yet.
Thanks for testing; it's good news that the combined patches show no
regressions.  I'm a bit reluctant to actually apply the first patch
though.  I only wrote it to demonstrate the point I was trying to make.
My understanding -- correct me if I'm wrong -- is that the ALLOCNO_NUM
comparisons are simply tie-breakers to stabilise the sort.  If the
choice between "index2 - index1" and "index1 - index2" makes a
significant difference, moving from one to the other just feels like
it's papering over a deeper problem.  At least until we know why the
difference matters (at which point we might decide there's no easy,
acceptable fix).

I agree that the lack of an xmm0 tie was the fundamental problem in the
particular example I gave.  Perhaps I didn't put enough emphasis on that
in the original message.  I was banging on about the copy heuristics
because, in the examples I looked at before, it was the copy heuristics
that were leading us astray for certain ALLOCNO_NUM orderings.  And a
similar problem did occur here, even if it wasn't the primary cause of
the extra copy.  I'll look through my logs for an example that doesn't
rely on hard register ties.  (As I said, it takes quite a while to go
through the detailed differences for a particular test, so I'm not sure
how quick I'll be.)

> > More fundamentally, C3 ends up having more influence than C2,
> > even though the copies ostensibly have the same weight.  In other words,
> > we have the following copy graph:
> >
> >     A58 <-> A63 <-> A64 <-> A60
> >
> > And the allocation of A63 has more influence over A64 than A60 does,
> > because of the transitive influence of A58.  But shouldn't this
> > transitive influence only kick in when A64 is allocated _between_
> > A58 and A63?  Why should the allocation of A58 matter to A64 if
> > A63 has already been allocated?
>
> I'll think about it and try to improve the heuristics.
> Indirect influence of an allocno's assignment on allocnos not connected
> directly to the given allocno was added to improve code generation.

I understand that, and I can see why it's important when the allocation
order is something like A58, A64, A60, A63.  All other things being
equal, it's better to allocate H58 to A64 in that case.  The case I'm
worried about is when the indirect costs via Ax apply _after_ Ax has
been allocated.

> From my point of view, the major problem in this case is that the tying
> of xmm0 with r64 in insn #16 is ignored.  By the way, such insns are not
> typical for x86 (I did not find any in SPEC2000).  First of all, cost
> calculation in regclass (used by the old register allocator) and in
> ira-costs.c (used by IRA) does not take such insns into account.  They
> take into account only move insns involving a hard register of a small
> class and a pseudo-register.  Ideally the cost for class SSE_FIRST_REG
> should be minimal, and the allocno for r64 should have SSE_FIRST_REG as
> the best reg class and SSE_REGS as an alternative register class.  It is
> not done.
>
> But the patch
>
>   http://gcc.gnu.org/ml/gcc-patches/2008-08/msg02279.html
>
> should solve the problem.  I missed only one thing for this test.
> Instead of `only_regs_p' in ira-conflicts.c::process_regs_for_copy,
> there should be `only_regs_p && insn != NULL_RTX' to distinguish
> move insns which were already taken into account in ira-costs.c.
> With the patch and the small change, r64 gets xmm0 and the problem
> disappears.  IRA will generate better code than the old RA.

Thanks!

I originally sent the patches as attachments, but they seemed to get
inlined.  I'll try again with gzipped files, just in case.

Richard
compare-order.diff.gz
Description: Binary data
use-df.diff.gz
Description: Binary data