Ian Bolton wrote:
> Vladimir Makarov wrote:
> > Ian Bolton wrote:
> > > Yesterday, I wrote:
> > >
> > >> BTW, today I have achieved some good results with existing IRA by
> > doing
> > >> the following:
> > >>
> > >> 1) Changed the REG_ALLOC_ORDER so that TOP_CREGS are given out
> > before
> > >> BOTTOM_REGS.  My previous hacked version worked by increasing the
> > >> priority of those that wanted BOTTOM_REGS, so they got first pick;
> > this
> > >> new version makes them wait their turn, but ensures those with
> > higher
> > >> priority take TOP_CREGS before BOTTOM_REGS.
> > >>
> > >> The analogy, I think, is of giving out meals on an airplane.  Most
> > >> people will eat meat or vegetarian meals, whereas vegetarians only
> > >> want vegetarian meals.  My hacked version was effectively allowing
> > >> the vegetarians to push to the front of the queue and get their
> > meals;
> > >> this new version works by encouraging the meat eaters to eat the
> > meaty
> > >> meals first, leaving more suitable meals for the vegetarians
> further
> > >> down the plane.
> > >>
> > >> 2) I have forced propagation of allocnos to parent regions with
> the
> > >> following hack in find_allocno_class_costs():
> > >>
> > >> {
> > >>   /* Propagate costs to upper levels in the region
> > >>      tree.  */
> > >>   parent_a_num = ALLOCNO_NUM (parent_a);
> > >>   for (k = 0; k < cost_classes_num; k++)
> > >>     COSTS_OF_ALLOCNO (total_costs, parent_a_num)->cost[k]
> > >>       += COSTS_OF_ALLOCNO (total_costs, a_num)->cost[k];
> > >>   COSTS_OF_ALLOCNO (total_costs, parent_a_num)->mem_cost
> > >>     += COSTS_OF_ALLOCNO (total_costs, a_num)->mem_cost;
> > >>   /* BEGIN IGB-IRA CHANGE 2 */
> > >>   /* Force total_costs to propagate upwards, by setting
> > >>      allocno_costs to be total_costs */
> > >>   for (k = 0; k < cost_classes_num; k++)
> > >>     COSTS_OF_ALLOCNO (allocno_costs, parent_a_num)->cost[k]
> > >>       = COSTS_OF_ALLOCNO (total_costs, parent_a_num)->cost[k];
> > >>     COSTS_OF_ALLOCNO (allocno_costs, parent_a_num)->mem_cost
> > >>       = COSTS_OF_ALLOCNO (total_costs, parent_a_num)->mem_cost;
> > >>   /* END IGB-IRA CHANGE 2 */
> > >> }
> > >>
> > >> I don't know why propagation isn't happening normally, but
> > >> this total_costs hack achieves the same thing for me at the
> > >> moment.  Is there any information I could provide to help
> > >> someone tell me why propagation isn't happening?
> > >>
> > >
> > > Good news!  I have been able to remove my "total_costs" hack
> > > above by instead commenting out the following line from ira_build()
> > > in ira-build.c:
> > >
> > >   remove_unnecessary_regions (false);
> > >
> > > For my situation at least, this function is preventing the
> > > propagation of useful allocno information from region 1 to
> > > region 0.  Without my change, the allocation for a pseudo in
> > > region 0 is not aware of how that pseudo will be used inside
> > > a loop in region 1; the real impact of this is that we need
> > > to then do a register move *inside the loop* into a valid
> > > register class for the instruction in region 1.
> > >
> > > I do not believe this will impact anyone with a regular
> > > register set, but for any architecture where some instructions
> > > can only use a subset of the register bank, I believe that
> > > all regions are always necessary, since cost information
> > > for each allocno is relevant and important.
> > >
> > > I still need to do some more testing in regards this "fix",
> > > but I wanted to put my findings out there as soon as possible
> > > for comment from the experts.
> > >
> > The function (remove_unnecessary_regions) is used to increase RA
> speed
> > by decreasing number of regions.  The region is removed if the
> register
> > pressure is not high in the region in other words if the probability
> to
> > spill pseudos on the region border to improve RA in the region is
> > small.
> >
> > But when the region is removed and correspondingly allocnos (see
> > function remove_unecessary_allocnos) in the region the info including
> > hard register costs is propagated to the  corresponding allocnos in
> the
> > parent region (see function propagate_some_info_from_allocno).
> >
> > I've just checked the code it looks ok to me.  Ian, it would be nice
> if
> > you figure out why the propagation does not happen.   As Jeff wrote,
> > fixing that would be important practically for any target.
> 
> (I'm in the middle of a hefty compile at the moment, so I can't do any
> more debugging yet, but I figured I'd think out loud on this list in
> the
> mean time.)
> 
> At first, I thought the problem was that
> propagate_some_info_from_allocno()
> appears to only pass up information about the cover class, as opposed
> to every cost_class (which was what my total_costs hack did).  However,
> I see
> that the propagate_allocno_info() function, which is now being called
> when
> I comment out remove_unnecessary_regions(), also only passes up
> information
> about the cover class, so I don't think it's propagation that's the
> issue.
> 
> Looking at ira_build(), I see that there is a call to create_caps()
> just
> after the call to propagate_allocno_info() - both of which were guarded
> by
> the condition more_one_region_p(), which is only true when I comment
> out
> the call to remove_unnecessary_regions().  Looking at create_caps(), I
> see
> that a Cap will be created for each of my region 1 allocnos.  Looking
> further
> down in the ira_build() function, there is a call to
> ira_build_conflicts(),
> which makes a lot of references to Caps.  Perhaps it's the existence of
> Caps
> that matters?
> 
> I see in my benchmark that there are some unfortunate allocnos that
> require
> BOTTOM_REGS, but they only live in Region 1.  Am I right in thinking
> that,
> without the caps, there is no way for them to get their voice heard
> during
> the allocation for Region 0?
> 
> Maybe, when an irregular register set exists, whether there is high
> pressure
> in the region or not, the Caps should be made to ensure that you don't
> end
> up with move instructions happening within the loop?

I've just changed my code to turn back on the remove_unnecessary_regions
function and ... I fear I have taken us all on a wild goose chase!

It seems that my alternate REG_ALLOC_ORDER, where TOP_CREGS are given out
before BOTTOM_REGS, is enough on its own to get the allocation right!  I
didn't notice this because I added the REG_ALLOC_ORDER change as the final
piece of the puzzle after the total_costs hack.

So I guess the key for any irregular architectures is to hold back the
special registers where possible, so the instructions that need them can
get them by showing they have a higher cost for the non-special registers.

Otherwise, IRA cannot do a good job because the preferences of lower
priority instructions that want BOTTOM_REGS will be irrelevant if higher
priority instructions (such as those working on memory addresses) that
just want any register have ignorantly taken those BOTTOM_REGS already.

I must admit that I have not been looking at the CB algorithm recently
though, so further work may still be needed, although some tests I just
ran confirmed that CB and Priority algorithm got the same performance.

I may also need to tweak our REGISTER_MOVE_COST and MEMORY_MOVE_COST.

Thanks for all your help so far.

Reply via email to