http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55160
Bug #: 55160
Summary: [4.8 Regression] Counterproductive loop induction variable optimization
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: olege...@gcc.gnu.org
CC: amyl...@gcc.gnu.org
Target: sh*-*-* arm*-*-*

Starting with rev 192505 the following

int test_04 (int* x, int c)
{
  int s = 0;
  for (int i = 0; i < c; ++i)
    s += *--x;
  return s;
}

gets compiled to (SH, -O2 -m4 -ml):

        cmp/pl  r5
        bf/s    .L12
        mov     #0,r1
        mov     #0,r0
.L11:
        add     #-4,r4
        mov.l   @r4,r2
        add     #1,r1
        cmp/eq  r5,r1
        bf/s    .L11
        add     r2,r0
        rts
        nop
.L12:
        rts
        mov     #0,r0

whereas before (also on 4.7.3) it was:

        cmp/pl  r5
        bf/s    .L11
        mov     #0,r0
.L10:
        add     #-4,r4
        mov.l   @r4,r1
        dt      r5
        bf/s    .L10
        add     r1,r0
        rts
        nop
.L11:
        rts
        nop

In this case the inner loop code size effectively does not increase, but
there is overhead in setting up the loop. Similar code is also generated
on ARM.

Another similar case:

int test_03 (int* x, int c)
{
  int s = 0;
  for (int i = 0; i < c; ++i)
    s += x[i];
  return s;
}

rev 192505:

        cmp/pl  r5
        bf/s    .L4
        shll2   r5
        add     r4,r5
        mov     #0,r0
.L3:
        mov.l   @r4+,r1
        cmp/eq  r5,r4
        bf/s    .L3
        add     r1,r0
        rts
        nop
.L4:
        rts
        mov     #0,r0

before it was:

        cmp/pl  r5
        bf/s    .L6
        mov     #0,r0
        shll2   r5
        add     #-4,r5
        shlr2   r5
        add     #1,r5
.L3:
        mov.l   @r4+,r1
        dt      r5
        bf/s    .L3
        add     r1,r0
.L6:
        rts
        nop

In this case, the old code contained useless loop setup code (the
shll2 / add / shlr2 / add sequence before the loop). Ideally this should
be something like:

        cmp/pl  r5
        bf/s    .L6
        mov     #0,r0
.L3:
        mov.l   @r4+,r1
        dt      r5
        bf/s    .L3
        add     r1,r0
.L6:
        rts
        nop

Jörn, I've added you in CC because your commit (rev 192505) seems to have
triggered something there. I'm not sure whether it is actually the cause
of this counterproductive transformation.
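
For reference, the loop shape that the dt-based listings above correspond to can be written out by hand in C. This is only a rough sketch: the *_countdown functions are hypothetical names made up for illustration, and it has not been checked whether any particular GCC revision actually emits dt for them. The point is that the trip count c can be counted down to zero directly, which matches what dt does (decrement the register and set T when the result is zero), so no separate counter or end pointer needs to be set up.

```c
/* Original test cases from this report, kept for comparison.  */
int test_03 (int* x, int c)
{
  int s = 0;
  for (int i = 0; i < c; ++i)
    s += x[i];
  return s;
}

int test_04 (int* x, int c)
{
  int s = 0;
  for (int i = 0; i < c; ++i)
    s += *--x;
  return s;
}

/* Hypothetical hand-written variants (not part of the original test case).
   The guard mirrors the cmp/pl + bf/s pair: skip the loop unless c > 0.
   The do/while counts c itself down to zero, mirroring dt + bf/s.  */
int test_03_countdown (int* x, int c)
{
  int s = 0;
  if (c <= 0)
    return s;
  do
    s += *x++;          /* post-increment load, like mov.l @r4+,r1 */
  while (--c != 0);
  return s;
}

int test_04_countdown (int* x, int c)
{
  int s = 0;
  if (c <= 0)
    return s;
  do
    s += *--x;          /* pre-decrement, like add #-4,r4 ; mov.l @r4,r1 */
  while (--c != 0);
  return s;
}
```

Both variants return the same results as test_03 and test_04 for any c, including c <= 0, where the guard returns 0 without entering the loop.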