I am doing embedded development on an ARM Cortex-M processor using
arm-none-eabi-gcc.  I have run into a bug where GDB shows the processor
executing code from a function that is not in the binary.  The function
is never called, so -ffunction-sections -fdata-sections
-Wl,--gc-sections has removed it from the image.

The issue appears to have started sometime around January 2022, with the
introduction of GCC 11: the ELF output changed to include the symbols for
these removed functions, mapped to address zero.

The result is that several function symbols are mapped to address 0, and
GDB seems to pick at random which code it thinks is running.  For
example, when running a loop like:
int i=100;
while (i>0) {
   i--;
}
GDB might decide that the i--; line is some other line in a seemingly random
file/function that is not in the binary image.  As a result, stepping
through the code jumps between unrelated locations.
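
A minimal sketch of the kind of test case I mean is below.  The function
names unused_helper() and countdown() are hypothetical and not from my
project, and the build flags in the comment are only the ones mentioned
above; the target flags and linker script are assumed to be whatever the
board normally needs.

```c
/* Hypothetical reproducer; function names are illustrative only.
 * Build with debug info and section garbage collection, e.g.:
 *   -g -ffunction-sections -fdata-sections -Wl,--gc-sections
 * plus the usual target flags and linker script for the board. */

/* Never called, so the linker should discard its section. */
int unused_helper(int x)
{
    return x * 2;
}

/* Same countdown loop as in the example; volatile keeps the compiler
 * from optimizing the loop away so it can be single-stepped. */
int countdown(int start)
{
    volatile int i = start;
    while (i > 0) {
        i--;
    }
    return i;
}
```

Calling countdown(100) from main() and stepping through it in GDB is the
scenario in question.  Whether a case this small triggers the problem is
untested.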

I have found that the 10.3.1 toolchain does not have this issue, but every
version I have tried from 11.2.1 onward does.  I seem to recall that there
was a change to -flto around GCC 11, and I wonder whether that caused this
issue.

I have noticed that when I build ELF files from the same code with both
toolchain versions, the failing version shows errors in the objdump -dlr
output.  For example:

Disassembly of section .text:

00000000 <exception_table>:
getStackSize():
D:\Projects\SECA\LoRa\firmware/src/CMSIS/wlr089/source/gcc/startup_wlr089.c:244
   0: ff 7f 00 20 95 04 00 00 5d 05 00 00 9d 05 00 00     ... ....].......
...
getStackUsed():
D:\Projects\SECA\LoRa\firmware/src/CMSIS/wlr089/source/gcc/startup_wlr089.c:254
  2c: 5d 05 00 00 00 00 00 00 00 00 00 00 5d 05 00 00     ]...........]...
D:\Projects\SECA\LoRa\firmware/src/CMSIS/wlr089/source/gcc/startup_wlr089.c:257
  3c: c5 08 00 00 5d 05 00 00 f9 0b 00 00 65 06 00 00     ....].......e...
D:\Projects\SECA\LoRa\firmware/src/CMSIS/wlr089/source/gcc/startup_wlr089.c:250
  4c: 5d 05 00 00 5d 05 00 00 5d 05 00 00 5d 05 00 00     ]...]...]...]...
D:\Projects\SECA\LoRa\firmware/src/CMSIS/wlr089/source/gcc/startup_wlr089.c:267
  5c: 5d 05 00 00 99 07 00 00 bd 07 00 00 e1 07 00 00     ]...............
  6c: 05 08 00 00 29 08 00 00 4d 08 00 00 5d 05 00 00     ....)...M...]...
_ZN10I2C_MASTER4syncEv():
D:\Projects\SECA\LoRa\firmware/src/drivers/i2c_master/i2c_master.cpp:90
  7c: 5d 05 00 00 5d 05 00 00 e9 09 00 00 29 0a 00 00     ]...].......)...
_ZN10I2C_MASTER18setCommandBitsWireEh():
D:\Projects\SECA\LoRa\firmware/src/drivers/i2c_master/i2c_master.cpp:90

In the above, objdump is confused and attributes the exception vector table
to the getStackSize() function (which is not in the binary).  As the listing
shows, objdump then changes its guess about which function the data belongs
to, switching to _ZN10I2C_MASTER4syncEv() and then
_ZN10I2C_MASTER18setCommandBitsWireEh().

I found that this continues into real functions and yields weird results
when stepping through code in a debugger (GDB).

I have tried -flto and everything else I can think of to remove these unused
symbols from the ELF file, and nothing seems to work.

Is this a bug in the toolchain?  Is there a possible workaround?

Thanks
Trampas
