Ian Bolton wrote: > Vladimir Makarov wrote: > > Ian Bolton wrote: > > > Yesterday, I wrote: > > > > > >> BTW, today I have achieved some good results with existing IRA by > > doing > > >> the following: > > >> > > >> 1) Changed the REG_ALLOC_ORDER so that TOP_CREGS are given out > > before > > >> BOTTOM_REGS. My previous hacked version worked by increasing the > > >> priority of those that wanted BOTTOM_REGS, so they got first pick; > > this > > >> new version makes them wait their turn, but ensures those with > > higher > > >> priority take TOP_CREGS before BOTTOM_REGS. > > >> > > >> The analogy, I think, is of giving out meals on an airplane. Most > > >> people will eat meat or vegetarian meals, whereas vegetarians only > > >> want vegetarian meals. My hacked version was effectively allowing > > >> the vegetarians to push to the front of the queue and get their > > meals; > > >> this new version works by encouraging the meat eaters to eat the > > meaty > > >> meals first, leaving more suitable meals for the vegetarians > further > > >> down the plane. > > >> > > >> 2) I have forced propagation of allocnos to parent regions with > the > > >> following hack in find_allocno_class_costs(): > > >> > > >> { > > >> /* Propagate costs to upper levels in the region > > >> tree. */ > > >> parent_a_num = ALLOCNO_NUM (parent_a); > > >> for (k = 0; k < cost_classes_num; k++) > > >> COSTS_OF_ALLOCNO (total_costs, parent_a_num)->cost[k] > > >> += COSTS_OF_ALLOCNO (total_costs, a_num)->cost[k]; > > >> COSTS_OF_ALLOCNO (total_costs, parent_a_num)->mem_cost > > >> += COSTS_OF_ALLOCNO (total_costs, a_num)->mem_cost; > > >> /* BEGIN IGB-IRA CHANGE 2 */ > > >> /* Force total_costs to propagate upwards, by setting > > >> allocno_costs to be total_costs */ > > >> for (k = 0; k < cost_classes_num; k++) > > >> COSTS_OF_ALLOCNO (allocno_costs, parent_a_num)->cost[k] > > >> = COSTS_OF_ALLOCNO (total_costs, parent_a_num)->cost[k]; > > >> COSTS_OF_ALLOCNO (allocno_costs, parent_a_num)->mem_cost > > >> = COSTS_OF_ALLOCNO (total_costs, parent_a_num)->mem_cost; > > >> /* END IGB-IRA CHANGE 2 */ > > >> } > > >> > > >> I don't know why propagation isn't happening normally, but > > >> this total_costs hack achieves the same thing for me at the > > >> moment. Is there any information I could provide to help > > >> someone tell me why propagation isn't happening? > > >> > > > > > > Good news! I have been able to remove my "total_costs" hack > > > above by instead commenting out the following line from ira_build() > > > in ira-build.c: > > > > > > remove_unnecessary_regions (false); > > > > > > For my situation at least, this function is preventing the > > > propagation of useful allocno information from region 1 to > > > region 0. Without my change, the allocation for a pseudo in > > > region 0 is not aware of how that pseudo will be used inside > > > a loop in region 1; the real impact of this is that we need > > > to then do a register move *inside the loop* into a valid > > > register class for the instruction in region 1. > > > > > > I do not believe this will impact anyone with a regular > > > register set, but for any architecture where some instructions > > > can only use a subset of the register bank, I believe that > > > all regions are always necessary, since cost information > > > for each allocno is relevant and important. > > > > > > I still need to do some more testing in regards this "fix", > > > but I wanted to put my findings out there as soon as possible > > > for comment from the experts. > > > > > The function (remove_unnecessary_regions) is used to increase RA > speed > > by decreasing number of regions. The region is removed if the > register > > pressure is not high in the region in other words if the probability > to > > spill pseudos on the region border to improve RA in the region is > > small. > > > > But when the region is removed and correspondingly allocnos (see > > function remove_unecessary_allocnos) in the region the info including > > hard register costs is propagated to the corresponding allocnos in > the > > parent region (see function propagate_some_info_from_allocno). > > > > I've just checked the code it looks ok to me. Ian, it would be nice > if > > you figure out why the propagation does not happen. As Jeff wrote, > > fixing that would be important practically for any target. > > (I'm in the middle of a hefty compile at the moment, so I can't do any > more debugging yet, but I figured I'd think out loud on this list in > the > mean time.) > > At first, I thought the problem was that > propagate_some_info_from_allocno() > appears to only pass up information about the cover class, as opposed > to every cost_class (which was what my total_costs hack did). However, > I see > that the propagate_allocno_info() function, which is now being called > when > I comment out remove_unnecessary_regions(), also only passes up > information > about the cover class, so I don't think it's propagation that's the > issue. > > Looking at ira_build(), I see that there is a call to create_caps() > just > after the call to propagate_allocno_info() - both of which were guarded > by > the condition more_one_region_p(), which is only true when I comment > out > the call to remove_unnecessary_regions(). Looking at create_caps(), I > see > that a Cap will be created for each of my region 1 allocnos. Looking > further > down in the ira_build() function, there is a call to > ira_build_conflicts(), > which makes a lot of references to Caps. Perhaps it's the existence of > Caps > that matters? > > I see in my benchmark that there are some unfortunate allocnos that > require > BOTTOM_REGS, but they only live in Region 1. Am I right in thinking > that, > without the caps, there is no way for them to get their voice heard > during > the allocation for Region 0? > > Maybe, when an irregular register set exists, whether there is high > pressure > in the region or not, the Caps should be made to ensure that you don't > end > up with move instructions happening within the loop?
I've just changed my code to turn back on the remove_unnecessary_regions function and ... I fear I have taken us all on a wild goose chase! It seems that my alternate REG_ALLOC_ORDER, where TOP_CREGS are given out before BOTTOM_REGS, is enough on its own to get the allocation right! I didn't notice this because I added the REG_ALLOC_ORDER change as the final piece of the puzzle after the total_costs hack. So I guess the key for any irregular architectures is to hold back the special registers where possible, so the instructions that need them can get them by showing they have a higher cost for the non-special registers. Otherwise, IRA cannot do a good job because the preferences of lower priority instructions that want BOTTOM_REGS will be irrelevant if higher priority instructions (such as those working on memory addresses) that just want any register have ignorantly taken those BOTTOM_REGS already. I must admit that I have not been looking at the CB algorithm recently though, so further work may still be needed, although some tests I just ran confirmed that CB and Priority algorithm got the same performance. I may also need to tweak our REGISTER_MOVE_COST and MEMORY_MOVE_COST. Thanks for all your help so far.