------- Comment #9 from sandra at codesourcery dot com 2010-06-12 07:42 -------
I now have a specific theory of what is going on here. There are two problems:

(1) estimate_reg_pressure_cost is not accounting for the function call in the loop body. In this case it ought to use call_used_regs instead of fixed_regs to determine how many registers are available for loop invariants. Here the target is Thumb-1 and there are only 4 non-call-clobbered registers available rather than 9, so we are much more constrained than ivopts thinks we are. This is pretty straightforward to fix.

(2) For the test case filed with the issue, there are 4 registers needed for the two candidates and two invariants ivopts is selecting, so even with the fix for (1) ivopts thinks it has enough registers available. But, there are two uses of the form (src + offset) in the ivopts output, although they appear differently in the gimple code. RTL optimizations are combining these and allocating a temporary. Since the two uses span the function call in the loop body, the temporary needs to be assigned to a non-call-clobbered register. This is why there is a spill of the other loop invariant.

Perhaps we could make the RA smarter about recomputing the src + offset value rather than resort to spilling something, but since I am dumb about the RA ;-) I'm planning to keep poking at the ivopts cost model instead.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42505
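
To make the counting in (1) and (2) concrete, here is a minimal sketch in C. It is a toy model, not GCC's code: the 16-entry register tables are made up so that the counts come out to the 9 and 4 mentioned above, they are not the real Thumb-1 tables, and available_regs and spills_needed are hypothetical helper names. The only idea borrowed from the comment is that a loop containing a call should count registers against call_used_regs rather than fixed_regs.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_REGS 16

/* Toy tables (assumption: invented layout, not the ARM backend's data).
   Registers 9..15 are fixed and never available to the allocator. */
static const bool fixed_regs[NUM_REGS] = {
  0, 0, 0, 0,  0, 0, 0, 0,  0, 1, 1, 1,  1, 1, 1, 1
};

/* Call-clobbered registers. As in GCC, every fixed register is also
   marked call-used; registers 4..7 are the only ones preserved. */
static const bool call_used_regs[NUM_REGS] = {
  1, 1, 1, 1,  0, 0, 0, 0,  1, 1, 1, 1,  1, 1, 1, 1
};

/* Count registers that can hold a value live across the whole loop body.
   If the loop contains a call, anything call-clobbered is unusable. */
static int available_regs(bool loop_has_call)
{
  const bool *unavailable = loop_has_call ? call_used_regs : fixed_regs;
  int n = 0;

  for (int r = 0; r < NUM_REGS; r++)
    if (!unavailable[r])
      n++;
  return n;
}

/* Number of values that do not fit and would have to be spilled. */
static int spills_needed(int regs_needed, bool loop_has_call)
{
  assert(regs_needed >= 0);

  int avail = available_regs(loop_has_call);
  return regs_needed > avail ? regs_needed - avail : 0;
}
```

With these toy tables, available_regs(false) is 9 and available_regs(true) is 4. The two candidates plus two invariants from (2) give spills_needed(4, true) == 0, which is why the fix for (1) alone still looks fine to the cost model. Adding the temporary for the combined src + offset gives spills_needed(5, true) == 1, matching the single spill described above.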