Jeff Law wrote: > On 11/16/09 10:33, Ian Bolton wrote: > > The question is: how to fix this? I have initially put the > REG_ALLOC_ORDER > > back to how it was and changed the operand constraints in our MD > file, > > so each of the "apathetic" instructions, will either take a 't' > (TOP_CREG) > > or '?b' (BOTTOM_REG). The '?' shows that this alternative is > slightly more > > costly than using 't'. On the benchmark that benefitted the most > from > > the new REG_ALLOC_ORDER, these constraints are almost achieving the > same > > thing. It is only "almost" there because I am struggling with how to > show > > two alternatives for loads and stores, which currently have an 'm' > > constraint. > > > I'm not aware of any way to describe this to IRA. In theory I guess > IRA > could be twiddled to use TARGET_ADDRESS_COST to derive some kind of > cost difference based on the registers used in the MEM, but it seems > rather hackish.
I found somewhere in record_address_reg to achieve what I needed: for (k = 0; k < cost_classes_num; k++) { i = cost_classes[k]; pp->cost[k] += (ira_get_may_move_cost (Pmode, i, rclass, true) * scale) / 2; /* Slightly nudge memory addresses away from using BOTTOM_REGS and C_REGS, so they take TOP_CREGS instead - should this pseudo later need BOTTOM_REGS, there will be a higher cost to use TOP_CREGS and it will still get BOTTOM_REGS. This is equivalent to adding a ?b on each instruction that currently has a 'm' constraint. Writing this generically might look something like: pp->cost[k] += TARGET_ADDRESS_EXTRA_COST_P(cost_classes[k]) ? (scale/2) : 0; */ if (cost_classes[k] == BOTTOM_REGS || cost_classes[k] == C_REGS) pp->cost[k] += (scale) / 2; } I was then able to alter all our register-agnostic instructions in our .md file to take either a 't' for TOP_CREGS for a '?b' for BOTTOM_REGS. Initial results showed that IRA was moving input arguments out of their BOTTOM_REGS (e.g. $c1) into TOP_CREGS to do work on them, since it thought TOP_CREGS were less costly to use, despite the cost of the move instruction to get the input argument into a TOP_CREG. I wondered if this was because we add 2 to the alt_cost for each '?' and our REG_MOVE_COST is also 2, but this becomes irrelevant anyway if you do a lot of work with the input argument, since each potential use in a BOTTOM_REG incurs some kind of penalty (one per '?' seen) which will eventually persuade IRA that leaving the input argument in the BOTTOM_REG it arrived in is more costly than moving it to a TOP_CREG. I addressed this problem by splitting my register bank a little differently: instead of making a distinction between BOTTOM_REGS and TOP_CREGS, I made it so there was only a penalty if you used one of the non-argument BOTTOM_REGS (i.e. a callee-save BOTTOM_REG). This meant that IRA was happy to leave input arguments in their BOTTOM_REGS but erred towards using TOP_CREGS once the caller-save BOTTOM_REGS had run out. This was an improvement, but there was still a case where these '?' penalties were not aligned with reality: T1 = A + B; // can use any register, TOP_CREGS appears cheaper T2 = A - C; // can use any register, TOP_CREGS appears cheaper T3 = A & D; // must use BOTTOM_REGS The constraints for the first two instructions show that TOP_CREGS is cheaper, but then you have to plant a move to get A into a BOTTOM_REG to do the AND; in reality, we know it cheaper to have A in a BOTTOM_REG all along, but the '?' constraint suggests there is a cost in doing this for the ADD and SUB and so IRA will put A in a TOP_CREG at first and incur the cost of the move because it is still cheaper than the costs I have defined in with my constraints. I don't believe there is a way to communicate a conditional cost, so I'm thinking that constraints are not the solution for me at this time. What are your thoughts? > You might try something like this: > > 1. Crank up the callee-saved register cost adjustment in > assign_hard_reg so that it's scaled based on REG_FREQ. That will > probably lead to some regressions based on my experiments. > > 2. Then add a check that completely avoids the cost adjustment in > cases where we pushed a MAY_SPILL_P allocno. This was on my todo list, > but I haven't got to it yet. > > If you wanted to get fancy, you could track the maximum number of > neighbors in each class as allocnos are pushed and use that to adjust > how many registers are cost adjusted in assign_hard_reg. The idea > being > the more neighbors the allocno has, the more callee-saved regsiters > we're likely to need. > > You could also try to account for the fact that once allocated, the > callee saved regsiter is available "for free" to non-conflicting > allocnos. So you could iterate over those and decrease the penalty > for > using a callee saved register on the current allocno. Given the > interfaces provided by IRA, this could be compile-time expensive. > Your "small improvement to IRA allocations" patch posted yesterday achieves something very similar to this, right? In the case of our BOTTOM_REGS and TOP_CREGS split, whether this pays off depends on which allocno is colored first. If we color the one that needs BOTTOM_REGS first then we will later reuse that BOTTOM_REG to do work that can be done in any register. But if we color the one that can be done in any register first, my REG_ALLOC_ORDER is giving out TOP_CREGS first, which then can't be reused later to do the BOTTOM_REGS work. I guess when we are coloring an agnostic allocno that will take any reg, we could check if any of the non-conflicting allocnos want a BOTTOM_REG and therefore give out a BOTTOM_REG since it can be reused later. Alternatively, we could walk over all conflicting allocnos when coloring this one and check for how many conflicting allocnos require BOTTOM_REGS and make sure we don't give one out now if we will otherwise run out. Obviously, this will make things even more expensive for compilation, but I'm not sure I can get what I require without some kind of "look-ahead" mechanism like in the two ideas above. Unless I simply change the coloring algorithm so that we always color those that need BOTTOM_REGS first? I think this is effectively what I was achieving with my early hacks involving the cost adjustments of 6 and 20. Thanks for reading, Ian