Let's assume I have two sub-classes of ALL_REGS: BOTTOM_REGS (c0-c15) and TOP_CREGS (c16-c31).
Let's also assume I have two main types of instruction: A-type Instructions, which can use ALL 32 registers, and B-type Instructions, which can only use the 16 BOTTOM_REGS. IRA will correctly calculate costs (in ira-costs.c) for allocnos appearing in B-type instructions, such that TOP_CREGS has a higher cost than BOTTOM_REGS. It will also calculate costs for the A-type instructions such that TOP_CREGS and BOTTOM_REGS have the same cost. The order of coloring will be determined by the algorithm chosen: Priority or Chaitin-Briggs. As each allocno is colored, the costs will be inspected and the best available hard register will be chosen, mainly based on the register class costs mentioned above, so allocnos in B-type Instructions will usually be assigned a BOTTOM_REG if one is free. If two or more hard registers share the same cost then whichever one appears first in the REG_ALLOC_ORDER will be assigned. (If no hard register can be found, the allocno is assigned memory and will require a "reload" in a later pass to get a hard register.) I do not wish to alter the coloring algorithms or the coloring order. I believe they are good at determing the order to color allocnos, which dictates the likelihood of being assigned a hard register. What I wish to change is the hard register that is assigned, given that the coloring order has determined that this allocno should get one next. Why do I need to do this? Since the BOTTOM_REGS can be used by all instructions, it makes sense to put them first in the REG_ALLOC_ORDER, so we minimise the number of registers consumed by a low-pressure function. But it also makes sense, in high-pressure functions, to steer A-type Instructions away from using BOTTOM_REGS so that they are free for B-type Instructions to use. To achieve this, I tried altering the costs calculated in ira-costs.c, either explicitly with various hacks or by altering operand constraints. The problem with this approach was that it is static and independent, occurring before any coloring order has been determined and without any awareness of the needs of other allocnos. I believe I require a dynamic way to alter the costs, based on which allocnos conflict with the allocno currently being colored and which hard registers are still available at this point. The patch I have attached here is my first reasonable successful attempt at this dynamic approach, which has led to performance improvements on some of our benchmarks and no significant regressions. I am hoping it will be useful to others, but I post it more as a talking point or perhaps to inspire others to come up with better solutions and share them with me :-)
bolton_ira_subset_experiments.diff
Description: bolton_ira_subset_experiments.diff