------- Comment #4 from rahul at icerasemi dot com 2009-08-13 15:46 -------
Confirmed. Enabling loop header copying at -Os resolves the problem. On our port, this not only allows the invariant load to be moved out of the loop, but also lets the access use an auto-increment addressing mode via the AutoInc patches we use. Other examples also confirm that header copying allows more induction variables to be identified, and hence exposes more post-increment opportunities.
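To illustrate the effect, here is a minimal hand-written sketch in C of what header copying does to a simple loop. It is not the test case from this bug and not actual GCC output; the function names `sum_scaled` and `sum_scaled_rotated` are hypothetical, and the second function only approximates by hand the shape the optimizers can work with once the header test has been duplicated in front of the loop.

```c
#include <stddef.h>

/* Original shape: the exit test sits in the loop header, so the body
   (and with it the load of *scale) may execute zero times. */
int sum_scaled(const int *a, size_t n, const int *scale)
{
    int sum = 0;
    size_t i = 0;
    while (i < n) {
        sum += a[i] * *scale;   /* *scale is loop-invariant */
        i++;
    }
    return sum;
}

/* Header copied by hand: the test is duplicated in front of the loop
   and the loop itself becomes a do-while. */
int sum_scaled_rotated(const int *a, size_t n, const int *scale)
{
    int sum = 0;
    size_t i = 0;
    if (i < n) {                /* copied header test */
        int s = *scale;         /* invariant load, hoisted past the guard */
        const int *p = a;
        do {
            sum += *p++ * s;    /* candidate for a post-increment access */
            i++;
        } while (i < n);
    }
    return sum;
}
```

In the rotated form the body is known to run at least once after the guard, so the load of `*scale` can be placed ahead of the loop without being executed on the zero-trip path, and the pointer walk over `a` becomes an obvious induction variable.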
Do the better loop analysis and the resulting potential for further optimizations outweigh the cost of copying the loop header? Ideally, the loop header copy predicate would be relaxed for -Os with an appropriate threshold. The threshold is currently set at 20 insns; a lower value would perhaps be a good starting point for -Os.
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41026