Hello,

I have some code generating the following assembly:
{OnReset}:
 8000010:       b508            push    {r3, lr}
 8000012:       20ff            movs    r0, #255        ; 0xff
 8000014:       f000 f828       bl      8000068 <{MyFunction}>
 8000018:       e7fe            b.n     8000018 <{OnReset}+0x8>
 800001a:       bf00            nop

08000068
{MyFunction}:
 8000068:       f44f 5380       mov.w   r3, #4096       ; 0x1000
 800006c:       f2c2 0300       movt    r3, #8192       ; 0x2000
 8000070:       7018            strb    r0, [r3, #0]
 8000072:       4770            bx      lr

"MyFunction" and "OnReset" are in different source files and are therefore compiled to different object files. I would like to get "MyFunction" fully inlined into "OnReset" to remove the extra branch instructions (bl and bx).
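
As a rough illustration of the setup, here is a minimal sketch reconstructed from the disassembly above; it is not the actual source. The parameter name `value`, the pointer cast, and the file names in the comments are assumptions, and real code for a memory-mapped location would probably need some form of volatile access.

```d
// Sketch only. In the real project these two functions live in separate
// source files (for example hardware.d and start.d, hypothetical names),
// with the second file importing the first.

// Stores one byte at 0x20001000, the address built by the mov.w/movt pair.
void MyFunction(ubyte value)
{
    *(cast(ubyte*) 0x2000_1000) = value;
}

// Reset handler: calls MyFunction(0xFF), then spins forever (the b.n loop).
void OnReset()
{
    MyFunction(0xFF);
    while (true) {}
}
```

Kept in one file like this, the sketch corresponds to the single-object build, where the call gets inlined; split across two files, it corresponds to the bl/bx version shown above.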

It's my understanding that because the two functions are compiled into separate object files, this must be done using LTO. If I compile them into the same object file, I get the full inlining I'm looking for, but that's not going to scale well for my project.

** Beautiful, isn't it? **
{OnReset}:
 8000010:       f44f 5380       mov.w   r3, #4096       ; 0x1000
 8000014:       f2c2 0300       movt    r3, #8192       ; 0x2000
 8000018:       22ff            movs    r2, #255        ; 0xff
 800001a:       701a            strb    r2, [r3, #0]
 800001c:       e7fe            b.n     800001c <{OnReset}+0xc>
 800001e:       bf00            nop


I've tried adding -flto to my compiler and linker flags, along with a number of other things, without success. The compiler seems to generate extra information in my object files, but the linker doesn't seem to perform the optimization. I don't get any ICEs like those reported in Bugs 61 and 88, however; I just don't get the result I'm after.

Here are my compiler commands:
arm-none-eabi-gdc -mthumb -mcpu=cortex-m4 -fno-emit-moduleinfo -ffunction-sections -fdata-sections -O3 -c -flto ...
arm-none-eabi-ld -T link/link.ld -Map binary/memory.map --gc-sections -flto ...

I'm using my arm-none-eabi cross toolchain built from the GDC 4.8 branch. I tried adding --enable-lto to my toolchain's configure, but that had no effect. It's my understanding that it's enabled by default anyway.

Does anyone know how I can get this level of inlining without compiling all my source into one object file?

Thanks for any help,
Mike
