Let's assume I have two sub-classes of ALL_REGS: BOTTOM_REGS (c0-c15)
and TOP_CREGS (c16-c31).

Let's also assume I have two main types of instruction: A-type
Instructions, which can use ALL 32 registers, and B-type Instructions,
which can only use the 16 BOTTOM_REGS.

IRA will correctly calculate costs (in ira-costs.c) for allocnos
appearing in B-type instructions, such that TOP_CREGS has a higher
cost than BOTTOM_REGS.  It will also calculate costs for the A-type
instructions such that TOP_CREGS and BOTTOM_REGS have the same cost.

The order of coloring will be determined by the algorithm chosen:
Priority or Chaitin-Briggs.  As each allocno is colored, the costs
will be inspected and the best available hard register will be chosen,
mainly based on the register class costs mentioned above, so allocnos
in B-type Instructions will usually be assigned a BOTTOM_REG if one is
free.  If two or more hard registers share the same cost then
whichever one appears first in the REG_ALLOC_ORDER will be assigned.
(If no hard register can be found, the allocno is assigned memory and
will require a "reload" in a later pass to get a hard register.)

I do not wish to alter the coloring algorithms or the coloring order.
I believe they are good at determing the order to color allocnos,
which dictates the likelihood of being assigned a hard register.  What
I wish to change is the hard register that is assigned, given that the
coloring order has determined that this allocno should get one next.

Why do I need to do this?  Since the BOTTOM_REGS can be used by all
instructions, it makes sense to put them first in the REG_ALLOC_ORDER,
so we minimise the number of registers consumed by a low-pressure
function.  But it also makes sense, in high-pressure functions, to
steer A-type Instructions away from using BOTTOM_REGS so that they are
free for B-type Instructions to use.

To achieve this, I tried altering the costs calculated in ira-costs.c,
either explicitly with various hacks or by altering operand
constraints.  The problem with this approach was that it is static and
independent, occurring before any coloring order has been determined
and without any awareness of the needs of other allocnos.  I believe I
require a dynamic way to alter the costs, based on which allocnos
conflict with the allocno currently being colored and which hard
registers are still available at this point.

The patch I have attached here is my first reasonable successful
attempt at this dynamic approach, which has led to performance
improvements on some of our benchmarks and no significant
regressions.

I am hoping it will be useful to others, but I post it more as a
talking point or perhaps to inspire others to come up with better
solutions and share them with me :-)

Attachment: bolton_ira_subset_experiments.diff
Description: bolton_ira_subset_experiments.diff

Reply via email to