http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58622

            Bug ID: 58622
           Summary: With -fomit-frame-pointer, A64 does not generate
                    post-decrement stores
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: b.grayson at samsung dot com
            Target: AArch64
             Build: 4.9.0 20130602

In A64, if one compiles a simple program under -O3, one gets code like this:

int bar(int i);
int foo() { return bar(5)+4; }

A64 -O3 assembly:

foo:
        stp     x29, x30, [sp, -16]!
        add     x29, sp, 0
        mov     w0, 5
        bl      bar
        add     w0, w0, 4
        ldp     x29, x30, [sp], 16
        ret

Note the use of update-form (writeback) accesses for the SP: a pre-indexed
stp in the prologue and a post-indexed ldp in the epilogue.

But if one uses -O3 -fomit-frame-pointer, the following is obtained:

foo:
        sub     sp, sp, #16
        mov     w0, 5
        str     x30, [sp]
        bl      bar
        add     w0, w0, 4
        ldr     x30, [sp]
        add     sp, sp, 16
        ret

The sub and str could be merged into the pre-indexed store
str x30, [sp, #-16]!, and the ldr/add could be merged into the post-indexed
load ldr x30, [sp], #16 (if I have my assembly correct), mirroring the
writeback forms used in the with-frame-pointer case.  On some ARM
implementations, the SP updates are "for free", so one would get better
performance with the merged load/store instructions, not to mention better
instruction-cache density.
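
To illustrate why the merge is safe, here is a minimal sketch in C of a toy
machine model.  Everything in it (the Machine struct, the helper names, the
64-byte stack) is hypothetical and exists only for illustration; it is not
GCC code.  It models the split and merged prologue/epilogue sequences and
checks that both leave the same SP and the same stack contents.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model: a small byte-addressed stack plus an SP offset. */
enum { STACK_BYTES = 64 };

typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint64_t sp;                 /* offset into mem, stands in for SP */
} Machine;

static void store64(Machine *m, uint64_t addr, uint64_t v)
{
    memcpy(&m->mem[addr], &v, sizeof v);
}

static uint64_t load64(const Machine *m, uint64_t addr)
{
    uint64_t v;
    memcpy(&v, &m->mem[addr], sizeof v);
    return v;
}

/* sub sp, sp, #16 ; str x30, [sp] */
static void push_split(Machine *m, uint64_t x30)
{
    m->sp -= 16;
    store64(m, m->sp, x30);
}

/* str x30, [sp, #-16]!   (pre-index: address is SP-16, then SP is updated) */
static void push_merged(Machine *m, uint64_t x30)
{
    uint64_t addr = m->sp - 16;
    store64(m, addr, x30);
    m->sp = addr;
}

/* ldr x30, [sp] ; add sp, sp, 16 */
static uint64_t pop_split(Machine *m)
{
    uint64_t x30 = load64(m, m->sp);
    m->sp += 16;
    return x30;
}

/* ldr x30, [sp], #16     (post-index: address is old SP, then SP is updated) */
static uint64_t pop_merged(Machine *m)
{
    uint64_t addr = m->sp;
    m->sp = addr + 16;
    return load64(m, addr);
}

/* Returns 1 if both sequences end in the same state, 0 otherwise. */
int sequences_match(void)
{
    Machine a, b;
    memset(&a, 0, sizeof a);
    memset(&b, 0, sizeof b);
    a.sp = b.sp = STACK_BYTES;          /* stack grows downward */

    const uint64_t lr = 0x1234abcdu;    /* arbitrary stand-in for x30 */
    push_split(&a, lr);
    push_merged(&b, lr);
    if (a.sp != b.sp || memcmp(a.mem, b.mem, STACK_BYTES) != 0)
        return 0;

    uint64_t ra = pop_split(&a);
    uint64_t rb = pop_merged(&b);
    return ra == lr && rb == lr && a.sp == b.sp && a.sp == STACK_BYTES;
}
```

Calling sequences_match() returns 1 under this model.  The model ignores
timing and any instructions scheduled between the two halves (such as the
mov w0, 5 above), so it only speaks to the architectural result, not to the
performance claim.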

Note that under A32, identical code (using update-form loads and stores) is
generated regardless of the -fomit-frame-pointer setting.
