https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116329

            Bug ID: 116329
           Summary: Arm M0+ doesn't do tail-call optimization
           Product: gcc
           Version: 13.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: terrygreeniaus at gmail dot com
  Target Milestone: ---

Godbolt link: https://godbolt.org/z/9vMTzx4dq

When building with -mcpu=cortex-m0plus, gcc 13.3.1 does not turn the call in
bar() into a tail call:

    #include <stdint.h>

    uint32_t x;

    void __attribute__((noinline)) foo()
    {
        x = 1;
    }

    void bar()
    {
        foo();
    }

The generated assembly is:

    foo():
            movs    r2, #1
            ldr     r3, .L3
            str     r2, [r3]
            bx      lr
    .L3:
            .word   .LANCHOR0
    bar():
            push    {r4, lr}
            bl      foo()
            pop     {r4, pc}
    x:
            .space  4

Compiling with -mcpu=cortex-m4 produces the expected direct branch to foo():

    foo():
            ldr     r3, .L3
            movs    r2, #1
            str     r2, [r3]
            bx      lr
    .L3:
            .word   .LANCHOR0
    bar():
            b       foo()
    x:
            .space  4

I deliberately made the example more than a trivial call to an extern
function, in case the issue was that the M0+ lacks a branch instruction with
enough range to reach an arbitrary address. In this contrived example bar()
only needs to branch back a short distance, so a direct branch should pose no
problem.
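
For comparison, the extern-function variant described above might look roughly
like the sketch below. The names ext_foo and bar_ext are hypothetical and are
not part of the original test case. In a real reproduction the definition of
ext_foo would presumably live in a separate translation unit; it is included
here only so that the sketch links and runs on its own.

```cpp
#include <stdint.h>

uint32_t x;

// Hypothetical callee. Only the declaration is visible where bar_ext() is
// written; moving the definition below into another file would give the
// extern case the report deliberately avoids.
extern void ext_foo();

void bar_ext()
{
    ext_foo();  // call in tail position, nothing left to do afterwards
}

// Assumption: defined in this file only to keep the sketch self-contained.
void ext_foo()
{
    x = 1;
}
```

With the definition in the same file the compiler may simply inline ext_foo,
which is why the original test case marks foo() as noinline instead.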

I observed this with arm-none-eabi-gcc 13.3.1; experimenting in Godbolt shows
that the problem is also present in ARM GCC trunk.
