https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102534
            Bug ID: 102534
           Summary: RFE epilog is not reliably a statement
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: debug
          Assignee: unassigned at gcc dot gnu.org
          Reporter: woodard at redhat dot com
  Target Milestone: ---

Created attachment 51523
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51523&action=edit
demonstration program

Given a program like this:

     1  #include <stdio.h>
     2
     3  static void do_print(char *s)
     4  {
     5    printf("%s", s);
     6  }
     7
     8  int main(int argc, char *argv[])
     9  {
    10    int i = 0;
    11    for (;;) {
    12      do_print(argv[i]);
    13      i++;
    14      if (argv[i] == NULL) {
    15        do_print("\n");
    16        return 0;
    17      }
    18      do_print(", ");
    19    }
    20  #ifdef BAD_MATT_CODE
    21    //no longer used
    22    return -1;
    23  }
    24  #endif
    25  }
    26
    27  /**
    28   * Just a comment taking a few lines of code
    29   * What do you call cheese that isn't yours?
    30   * Nacho cheese.
    31   **/
    32  int unused_variable;
    33
    34  void unused_function()
    35  {
    36    printf("I'm not called anywhere\n");
    37  }

When you try to set a breakpoint on the closing brace of a function, it skips
to the beginning of the next function in the file:

$ gdb a.out
GNU gdb (GDB) Fedora 10.2-3.fc34
Reading symbols from a.out...
(gdb) break 6
Breakpoint 1 at 0x401060: file line-range.c, line 9.

That is the start of main(), which is not what was intended. I would assert
that “break 6” means “break before you leave the context of do_print()”, in
other words at the epilog of the function.

On the other hand, “b 5” works as expected:

(gdb) break 5
Breakpoint 2 at 0x401070: /home/ben/Shared/test/line-ranges/line-range.c:5.
(3 locations)
(gdb) info break
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000000401060 in main at line-range.c:9
2       breakpoint     keep y   <MULTIPLE>
2.1                         y   0x0000000000401070 in do_print at line-range.c:5
2.2                         y   0x0000000000401081 in do_print at line-range.c:5
2.3                         y   0x000000000040109a in do_print at line-range.c:5

In that particular case the function ends up being inlined, and the "ret"
instruction where the epilog is_stmt would be has been eliminated. We believe
it should still mark the first instruction after the code from the function,
where there would have been a ret.

It isn't just inline functions that are affected:

(gdb) break 25
Breakpoint 1 at 0x4011a0: file line-range.c, line 35.

Once again this is the first instruction of the function defined after the one
where the epilog for main should be. There is even a ret instruction there:

(gdb) disassemble main
Dump of assembler code for function main:
   0x0000000000401060 <+0>:     push   %rbx
   0x0000000000401061 <+1>:     mov    %rsi,%rbx
   0x0000000000401064 <+4>:     jmp    0x401081 <main+33>
   0x0000000000401066 <+6>:     nopw   %cs:0x0(%rax,%rax,1)
   0x0000000000401070 <+16>:    mov    $0x402013,%esi
   0x0000000000401075 <+21>:    mov    $0x402010,%edi
   0x000000000040107a <+26>:    xor    %eax,%eax
   0x000000000040107c <+28>:    call   0x401050 <printf@plt>
   0x0000000000401081 <+33>:    mov    (%rbx),%rsi
   0x0000000000401084 <+36>:    xor    %eax,%eax
   0x0000000000401086 <+38>:    mov    $0x402010,%edi
   0x000000000040108b <+43>:    add    $0x8,%rbx
   0x000000000040108f <+47>:    call   0x401050 <printf@plt>
   0x0000000000401094 <+52>:    cmpq   $0x0,(%rbx)
   0x0000000000401098 <+56>:    jne    0x401070 <main+16>
   0x000000000040109a <+58>:    mov    $0xa,%edi
   0x000000000040109f <+63>:    call   0x401030 <putchar@plt>
   0x00000000004010a4 <+68>:    xor    %eax,%eax
   0x00000000004010a6 <+70>:    pop    %rbx
   0x00000000004010a7 <+71>:    ret
End of assembler dump.

It even has linemap entries:

$ readelf --debug-dump=decodedline a.out | egrep ^File\|25
File name                            Line number    Starting address    View    Stmt
line-range.c                                  25            0x4010a4       2
line-range.c                                  25            0x4010a8

But the problem seems to be that none of the linemap entries is adorned with
is_stmt. We believe that an is_stmt entry for line 25 should point at 0x4010a7,
the ret instruction.

Putting the is_stmt for the closing brace of a function on the ret instruction
of a normal extern function is easy, but we would like all the other
complicating situations to be handled as well, some of which include:

- inline functions
- void functions
- multiple returns from a function
- functions which optimize into being empty
- external functions that are not used but could be called from another
  function in a different CU. However, that means that they could be dropped
  when compiling with LTO.

Several of these complications are demonstrated in the attached program.

We have found that this works a bit better for C++ than for C, because C++
frequently has destructor code for local variables that is executed in the
epilog of the function, and the statement that it is tied to is the closing
brace of the function.
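
As a rough illustration of the complicating situations listed above, here is a minimal sketch with one small function per case. This is not the attached demonstration program; all function names (is_end, bump, sign_of, does_nothing, count_args) are hypothetical and chosen only for this example. Which of them actually lose their closing-brace is_stmt entry depends on the compiler version and optimization level, so no particular result is claimed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; not taken from the attached program. */

/* Inline candidate: at -O2 the body is likely merged into its caller,
 * so there may be no ret instruction for its closing brace. */
static inline int is_end(char *const *p)
{
    return *p == NULL;
}

/* void function: no return statement, only the closing brace. */
void bump(int *counter)
{
    *counter += 1;
}

/* Multiple returns: several exits share one closing brace. */
int sign_of(int v)
{
    if (v < 0)
        return -1;
    if (v > 0)
        return 1;
    return 0;
}

/* Function that can optimize into being empty (possibly just a ret). */
void does_nothing(int v)
{
    (void)v;
}

/* External function not called in this CU, but callable from another CU. */
int count_args(char *const *argv)
{
    int n = 0;

    assert(argv != NULL);
    while (!is_end(&argv[n]))
        n++;
    return n;
}
```

Compiling this file on its own with "gcc -g -O2 -c" and running "readelf --debug-dump=decodedline" on the resulting object file should show which closing-brace lines have entries and whether any of them is marked as a statement.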