http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55026
--- Comment #3 from Mans Rullgard <mans at mansr dot com> 2012-10-22 19:34:25 UTC ---

It has actually got worse over time. With 4.3 I get this:

f:
        sub     sp, sp, #8
        mov     r2, r0
        stmia   sp, {r0, r1}
        add     r0, r1, r2
        add     sp, sp, #8
        bx      lr

Here it's only doing the stores, not the loads, and it does them using sp directly.

With 4.4 and 4.5 it gets slightly worse:

f:
        sub     sp, sp, #8
        mov     r3, sp
        stmia   r3, {r0, r1}
        mov     r3, r0
        add     r0, r1, r3
        add     sp, sp, #8
        bx      lr

Now it's copying sp to another register used as base address in the store. The load is still absent.

4.6 and later produce the code I quoted originally. I can't say I like the trend.