http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58622

            Bug ID: 58622
           Summary: With -fomit-frame-pointer, A64 does not generate
                    post-decrement stores
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: b.grayson at samsung dot com
            Target: AArch64
             Build: 4.9.0 20130602

In A64, if one compiles a simple program under -O3, one gets code like this:

    int bar(int i);
    int foo() { return bar(5)+4; }

A64 -O3 assembly:

    foo:
            stp     x29, x30, [sp, -16]!
            add     x29, sp, 0
            mov     w0, 5
            bl      bar
            add     w0, w0, 4
            ldp     x29, x30, [sp], 16
            ret

Note the use of update-form loads and stores for the SP. But if one uses
-O3 -fomit-frame-pointer, the following is obtained:

    foo:
            sub     sp, sp, #16
            mov     w0, 5
            str     x30, [sp]
            bl      bar
            add     w0, w0, 4
            ldr     x30, [sp]
            add     sp, sp, 16
            ret

The sub and str could be merged into str x30, [sp, #-16]!, and the ldr/add
could be merged into ldr x30, [sp], #16 (if I have my assembly correct), as
they were in the with-frame-pointer case.

On some ARM implementations, the updates are "for free", so one would get
better performance with the merged load/store instructions, not to mention
better instruction-cache density.

Note that under A32, identical code (using update/post-decrement stores) is
generated regardless of omit-frame-pointer settings.
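
For anyone trying to reproduce this, a self-contained variant of the test case might look like the sketch below. The body of bar and the noinline attribute are assumptions added here so that the file builds on its own and the call is not inlined away; the report itself only declares bar.

```c
/* Hypothetical stand-in for bar(); the report only declares it.
   noinline (a GCC function attribute) is an assumption added so the
   call in foo() is kept and the prologue/epilogue around "bl bar"
   can still be inspected. */
__attribute__((noinline)) int bar(int i)
{
    return i;
}

/* Test case from the report. */
int foo(void)
{
    return bar(5) + 4;
}
```

Compiling this with an AArch64 GCC once with -O3 -S and once with -O3 -fomit-frame-pointer -S should allow the two prologue/epilogue sequences to be compared. Keeping bar in a separate translation unit, as the report's declaration-only form implies, is closer to the original setup and avoids any interprocedural optimization of the call.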