On Tue, Oct 15, 2019 at 9:58 PM Luis Machado <luis.mach...@linaro.org> wrote:
>
> Hi,
>
> I'd like to get some feedback from the compiler's side before
> implementing a fix for this line numbering problem. I also want to make
> sure I fix it in the right tool.
>
> This is related to this bug report in GDB's bugzilla:
> https://sourceware.org/bugzilla/show_bug.cgi?id=21221
>
> It deals with the cases where we have loops with empty bodies, empty
> headers (for loops) or that were simply written on a single line. This
> causes GCC to not emit line transitions in one way or another. As a
> consequence, GDB won't see the line transition and will continuously
> attempt to step/next until it sees one.
>
> To the end user it appears GDB is stuck in a particular loop, and most
> users hit Ctrl-C to interrupt it. In reality GDB is making
> progress in the loop, but it will only stop once it leaves the
> loop, where it will see a line transition.
>
> For the sake of reducing the scope of the problem, I'll assume the loops
> are written across multiple lines and that we're interested in O0
> debugging. Higher optimization levels would probably reshape the loop or
> reduce it to a single instruction in some cases.
>
> Take for example the case of BZ #21221...
>
> int main (void)
> {
>    while (1)
>    {
> 5    for (unsigned int i = 0U; i < 0xFFFFFU; i++)
> 6    {
> 7       ;
> 8    }
>    }
> }
>
> GCC generates the following code:
>
>     0x00000000000005fa <+0>:    push   %rbp
>     0x00000000000005fb <+1>:    mov    %rsp,%rbp
>     0x00000000000005fe <+4>:    movl   $0x0,-0x4(%rbp)
>     0x0000000000000605 <+11>:   jmp    0x60b <main+17>
>     0x0000000000000607 <+13>:   addl   $0x1,-0x4(%rbp)
>     0x000000000000060b <+17>:   cmpl   $0xffffe,-0x4(%rbp)
>     0x0000000000000612 <+24>:   jbe    0x607 <main+13>
>     0x0000000000000614 <+26>:   jmp    0x5fe <main+4>
>
> And the line table looks like this:
>
>   Line Number Statements:
>    [0x00000047]  Extended opcode 2: set Address to 0x5fa
>    [0x00000052]  Special opcode 6: advance Address by 0 to 0x5fa and Line by 1 to 2
>    [0x00000053]  Special opcode 64: advance Address by 4 to 0x5fe and Line by 3 to 5
>    [0x00000054]  Extended opcode 4: set Discriminator to 3
>    [0x00000058]  Set is_stmt to 0
>    [0x00000059]  Special opcode 131: advance Address by 9 to 0x607 and Line by 0 to 5
>    [0x0000005a]  Extended opcode 4: set Discriminator to 1
>    [0x0000005e]  Special opcode 61: advance Address by 4 to 0x60b and Line by 0 to 5
>    [0x0000005f]  Special opcode 131: advance Address by 9 to 0x614 and Line by 0 to 5
>    [0x00000060]  Advance PC by 2 to 0x616
>    [0x00000062]  Extended opcode 1: End of Sequence
>
> GCC doesn't generate any code or line number transitions for the empty
> loop body, therefore GDB keeps cycling inside this loop, in line 5.
>
> Clang, on the other hand, seems to be a bit smarter about this and will
> generate a dummy jump to help the debugger.
>
> Here's Clang's code:
>
>     0x00000000004004a0 <+0>:    push   %rbp
>     0x00000000004004a1 <+1>:    mov    %rsp,%rbp
>     0x00000000004004a4 <+4>:    movl   $0x0,-0x4(%rbp)
>     0x00000000004004ab <+11>:   movl   $0x0,-0x8(%rbp)
>     0x00000000004004b2 <+18>:   cmpl   $0xfffff,-0x8(%rbp)
>     0x00000000004004b9 <+25>:   jae    0x4004d2 <main+50>
> X   0x00000000004004bf <+31>:   jmpq   0x4004c4 <main+36>
> X   0x00000000004004c4 <+36>:   mov    -0x8(%rbp),%eax
>     0x00000000004004c7 <+39>:   add    $0x1,%eax
>     0x00000000004004ca <+42>:   mov    %eax,-0x8(%rbp)
>     0x00000000004004cd <+45>:   jmpq   0x4004b2 <main+18>
>     0x00000000004004d2 <+50>:   jmpq   0x4004ab <main+11>
>
> X marks the spot where a dummy jump was inserted to aid the debugger.
> The line table looks like this:
>
>   Line Number Statements:
>    [0x00000070]  Extended opcode 2: set Address to 0x4004a0
>    [0x0000007b]  Special opcode 6: advance Address by 0 to 0x4004a0 and Line by 1 to 2
>    [0x0000007c]  Set column to 23
>    [0x0000007e]  Set prologue_end to true
>    [0x0000007f]  Special opcode 162: advance Address by 11 to 0x4004ab and Line by 3 to 5
>    [0x00000080]  Set column to 33
>    [0x00000082]  Set is_stmt to 0
>    [0x00000083]  Special opcode 103: advance Address by 7 to 0x4004b2 and Line by 0 to 5
>    [0x00000084]  Set column to 5
>    [0x00000086]  Set is_stmt to 1
>    [0x00000087]  Special opcode 103: advance Address by 7 to 0x4004b9 and Line by 0 to 5
> X  [0x00000088]  Special opcode 92: advance Address by 6 to 0x4004bf and Line by 3 to 8
> X  [0x00000089]  Set column to 46
>    [0x0000008b]  Special opcode 72: advance Address by 5 to 0x4004c4 and Line by -3 to 5
>    [0x0000008c]  Set column to 5
>    [0x0000008e]  Set is_stmt to 0
>    [0x0000008f]  Special opcode 131: advance Address by 9 to 0x4004cd and Line by 0 to 5
>    [0x00000090]  Set column to 3
>    [0x00000092]  Set is_stmt to 1
>    [0x00000093]  Special opcode 73: advance Address by 5 to 0x4004d2 and Line by -2 to 3
>    [0x00000094]  Advance PC by 5 to 0x4004d7
>    [0x00000096]  Extended opcode 1: End of Sequence
>
> Again, X marks the spot where we tell the debugger there is a line
> transition (from line 5 to line 8), and so step/next execution should end.
>
> I'm inclined to say we should fix this in GCC in a similar way. GDB
> relies on the line table information since it can't correctly tell when
> we have transitioned to a new source line by looking just at the
> instruction stream.
>
> My idea is to create a dummy jump (gimple sounds more appropriate) with
> the source location of the last line of the loop body (in this case line
> number 8). That would trigger the creation of a new line table entry,
> making GDB happy.
>
> Is there a better way to force the compiler to output such a line table
> transition without having to resort to a dummy jump? Is there a safer
> way to add such transitions without worrying about the optimizer getting
> rid of them later on? Should we even worry about preserving such
> information for higher optimization levels?
>
> I'll also need a way to store the source location of the last line of
> the loop body, since closing braces and friends are ignored by GCC for
> code generation purposes. We just consume those tokens without a second
> thought.
>
> There are other interesting variations, like the following:
>
> int main(void)
> {
> int var = 0;
>
>    for (;;)
>    {
> 7    var++;
> 8  }
>
> return 0;
> }
>
> In the case above, the debugger gets stuck in line 7. With the proposed
> solution it would transition to line 8 and then return to line 7.
>
> Another case is this one:
>
> int main (void)
> {
>    while (1)
>    {
> 5    for (unsigned int i = 0U; i < 0xFFFFFU; i++)
> 6       ;
>    }
> }
>
> Similarly, GDB gets stuck in line 5. With the proposed fix, it would
> transition to line 6 before returning to line 5.
>
> Feedback would be greatly appreciated.

I think adding an extra jump is undesirable.  Instead, if you disregard
the single-source-line case, there's always the jump and the label we jump
to, which might/should get different source locations.  Like in one of the
above cases:

main ()
{
  int D.1803;

  [t.c:2:1] {
    int var;

    [t.c:3:5] var = 0;
    <D.1801>:
    [t.c:7:8] var = var + 1;
    [t.c:7:8] goto <D.1801>;
    [t.c:10:8] D.1803 = 0;
    [t.c:10:8] return D.1803;

seen at GIMPLE.  Of course we lose the label once we build the CFG,
but we retain a goto_locus on the edge, which we could then put back
on the jump statement.  For this case we currently get

.L2:
        .loc 1 7 0 discriminator 1
        addl    $1, -4(%rbp)
        jmp     .L2

and we could do

.L2:
        .loc 1 7 0 discriminator 1
        addl    $1, -4(%rbp)
        .loc 1 5 0
        jmp     .L2

thus assign the "destination" location to the jump instruction?
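
To make the effect on a stepping debugger concrete, here is a minimal
sketch in C.  It is a toy model and not GDB's actual stepping logic:
struct insn, the two tables and step_until_line_change are hypothetical
names, and "step" is reduced to "run until the current instruction maps to
a line different from the starting one".  The two tables encode the two
assembly listings above (the addl at .L2, then the jmp back to .L2).

```c
#include <assert.h>
#include <stddef.h>

/* One instruction of a toy program (hypothetical model, not GDB's data
   structures): the source line its line-table row maps it to, and the
   index of the instruction executed next.  */
struct insn {
    int line;
    size_t next;
};

/* Loop body as emitted today: index 0 is the addl at .L2 (line 7),
   index 1 is the jmp back to .L2, which also maps to line 7.  */
const struct insn current_table[] = { {7, 1}, {7, 0} };

/* Same code, with the jmp carrying the loop header's line (line 5).  */
const struct insn proposed_table[] = { {7, 1}, {5, 0} };

/* Simplified "step": execute instructions starting at PC until one maps
   to a line other than the starting line.  Returns that line, or -1 if
   no line transition was seen within MAX_INSNS instructions.  */
int step_until_line_change(const struct insn *prog, size_t pc, int max_insns)
{
    assert(prog != NULL);
    int start_line = prog[pc].line;
    for (int i = 0; i < max_insns; i++) {
        pc = prog[pc].next;
        if (prog[pc].line != start_line)
            return prog[pc].line;
    }
    return -1;
}
```

In this model, step_until_line_change (current_table, 0, 1000) returns -1
because every instruction in the loop maps to line 7, which is the "stuck"
behaviour from the bug report.  The same call on proposed_table returns 5
after a single instruction.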

The first question is of course what happens with the edge's
goto_locus at the moment and why we get the code we get.

The above solution might also be a bit odd since on loop entry
we'd first see line 7 and only after that line 5.  But fixing
that would mean we have to output an extra instruction
(where I'd choose a nop instead of some random extra jump).
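
The ordering difference can be sketched with another toy model in C.
Everything here is an assumption made for illustration: the names are
hypothetical, the stepping rule is the simplified "run until the line
changes", and in particular the nop is assumed to sit at .L2 inside the
loop and to carry line 5, which is only one possible placement.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction: the line it maps to and the index of its successor.  */
struct insn {
    int line;
    size_t next;
};

/* movl $0 (line 3); .L2: addl (line 7); jmp .L2 tagged with line 5.  */
const struct insn jump_locus_variant[] = { {3, 1}, {7, 2}, {5, 1} };

/* Assumed placement of the extra instruction:
   movl $0 (line 3); .L2: nop (line 5); addl (line 7); jmp .L2 (line 7).  */
const struct insn nop_variant[] = { {3, 1}, {5, 2}, {7, 3}, {7, 1} };

/* Perform up to COUNT simplified steps starting at instruction 0 and
   store the line reported by each one in LINES_OUT.  A step runs until
   the line changes and gives up after MAX_INSNS instructions.  Returns
   the number of lines recorded.  */
size_t trace_steps(const struct insn *prog, size_t count, int max_insns,
                   int *lines_out)
{
    assert(prog != NULL && lines_out != NULL);
    size_t pc = 0;
    size_t stops = 0;
    while (stops < count) {
        int start_line = prog[pc].line;
        int budget = max_insns;
        do {
            pc = prog[pc].next;
        } while (prog[pc].line == start_line && --budget > 0);
        if (prog[pc].line == start_line)
            break;              /* no transition seen: stepping is stuck */
        lines_out[stops++] = prog[pc].line;
    }
    return stops;
}
```

Stepping four times from the initialisation on line 3 reports lines
7, 5, 7, 5 for jump_locus_variant and 5, 7, 5, 7 for nop_variant, so under
these assumptions only the nop variant shows the loop header before the
body on entry.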

Richard.

> Thanks,
> Luis
