[Bug debug/84550] [8 Regression] stepping through gcc does not work with gdb 8.0.1

2018-03-06 Thread kevinb at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84550

--- Comment #11 from Kevin Buettner  ---
I've simplified Jakub's example slightly:

--- vau2.c ---
struct A { int a; };
struct B { struct A *b; };
struct C { struct B *c; };

__attribute__((noipa)) bool
foo (struct A *p)
{
  return false;
}

__attribute__((noipa)) int
baz (int x)
{
  return 0;
}

__attribute__((noipa)) void
qux (struct C *p)
{
  struct A *a;
  bool b;
  int c;

  if (!p->c) __builtin_abort ();
  a = p->c->b;

  b = (a->a == 4)
      && (foo (a));

  c = baz (0);
  baz (b);
}

int
main ()
{
  struct A a = { 4 };
  struct B b = { &a };
  struct C c = { &b };
  qux (&c);
  return 0;
}
--- end vau2.c ---

When I compile this via "g++ -O2 -g vau2.c -o vau2", and load it into gdb, it
exhibits the same behavior shown by Jakub.  I.e. the following sequence...

b qux
run
s
s
s

... does not stop in foo as expected.  The program instead exits.

It turns out that qux consists of two disjoint pieces which, for some reason,
are separated by a lot of other code, e.g. main, _start, foo, baz, and a lot of
other stuff too.  Here's what it looks like:

   0x400460 :  callq  0x400430
   0x400465:  nopw   %cs:0x0(%rax,%rax,1)
   0x40046f:  nop
   0x400470 :  sub    $0x28,%rsp
   0x400474 :  lea    0xc(%rsp),%rax
   0x400479 :  lea    0x18(%rsp),%rdi
   0x40047e :  movl   $0x4,0xc(%rsp)
   0x400486 :  mov    %rax,0x10(%rsp)
   0x40048b :  lea    0x10(%rsp),%rax
   0x400490 :  mov    %rax,0x18(%rsp)
   0x400495 :  callq  0x4005b0 <_Z3quxP1C>
   0x40049a :  xor    %eax,%eax
   0x40049c :  add    $0x28,%rsp
   0x4004a0 :  retq
   0x4004a1:  nopw   %cs:0x0(%rax,%rax,1)
   0x4004ab:  nopl   0x0(%rax,%rax,1)
   0x4004b0 <_start>:  xor    %ebp,%ebp
   0x4004b2 <_start+2>:  mov    %rdx,%r9
   0x4004b5 <_start+5>:  pop    %rsi
   0x4004b6 <_start+6>:  mov    %rsp,%rdx
   0x4004b9 <_start+9>:  and    $0xfff0,%rsp
   0x4004bd <_start+13>:  push   %rax
   0x4004be <_start+14>:  push   %rsp
   0x4004bf <_start+15>:  mov    $0x400660,%r8
   0x4004c6 <_start+22>:  mov    $0x4005f0,%rcx
   0x4004cd <_start+29>:  mov    $0x400470,%rdi
   0x4004d4 <_start+36>:  callq  0x400440 <__libc_start_main@plt>
   0x4004d9 <_start+41>:  hlt
...
   0x400590 :  xor    %eax,%eax
   0x400592 :  retq
   0x400593:  nopl   (%rax)
   0x400596:  nopw   %cs:0x0(%rax,%rax,1)
   0x4005a0 :  xor    %eax,%eax
   0x4005a2 :  retq
   0x4005a3:  nopl   (%rax)
   0x4005a6:  nopw   %cs:0x0(%rax,%rax,1)
   0x4005b0 <_Z3quxP1C>:  push   %rbx
   0x4005b1 <_Z3quxP1C+1>:  mov    (%rdi),%rax
   0x4005b4 <_Z3quxP1C+4>:  test   %rax,%rax
   0x4005b7 <_Z3quxP1C+7>:  je     0x400460
   0x4005bd <_Z3quxP1C+13>:  mov    (%rax),%rdi
   0x4005c0 <_Z3quxP1C+16>:  xor    %ebx,%ebx
   0x4005c2 <_Z3quxP1C+18>:  cmpl   $0x4,(%rdi)
   0x4005c5 <_Z3quxP1C+21>:  je     0x4005d8 <_Z3quxP1C+40>
   0x4005c7 <_Z3quxP1C+23>:  xor    %edi,%edi
   0x4005c9 <_Z3quxP1C+25>:  callq  0x4005a0
   0x4005ce <_Z3quxP1C+30>:  mov    %ebx,%edi
   0x4005d0 <_Z3quxP1C+32>:  pop    %rbx
   0x4005d1 <_Z3quxP1C+33>:  jmp    0x4005a0
   0x4005d3 <_Z3quxP1C+35>:  nopl   0x0(%rax,%rax,1)
   0x4005d8 <_Z3quxP1C+40>:  callq  0x400590
   0x4005dd <_Z3quxP1C+45>:  xor    %edi,%edi
   0x4005df <_Z3quxP1C+47>:  movzbl %al,%ebx
   0x4005e2 <_Z3quxP1C+50>:  callq  0x4005a0
   0x4005e7 <_Z3quxP1C+55>:  mov    %ebx,%edi
   0x4005e9 <_Z3quxP1C+57>:  pop    %rbx
   0x4005ea <_Z3quxP1C+58>:  jmp    0x4005a0
   0x4005ec:  nopl   0x0(%rax)

Within GDB, this is where things go wrong:

top-gdb> bt 4
#0  find_pc_partial_function_gnu_ifunc (pc=4195728, name=0x7fffdd28,
    address=0x7fffdd18, endaddr=0x7fffdd20, is_gnu_ifunc_p=0x0)
    at /ironwood1/sourceware-git/mesquite-native-thread_handle_to_thread_info/bld/../../binutils-gdb/gdb/blockframe.c:213
#1  0x00553281 in find_pc_partial_function (pc=4195728,
    name=0x7fffdd28, address=0x7fffdd18, endaddr=0x7fffdd20)
    at /ironwood1/sourceware-git/mesquite-native-thread_handle_to_thread_info/bld/../../binutils-gdb/gdb/blockframe.c:323
#2  0x006b6aec in fill_in_stop_func (gdbarch=0x1249170,
    ecs=0x7fffdcd0)
    at /ironwood1/sourceware-git/mesquite-native-thread_handle_to_thread_info/bld/../../binutils-gdb/gdb/infrun.c:4303
#3  0x006baf13 in process_event_stop_test (ecs=0x7fffdcd0)
    at /ironwood1/sourceware-git/mesquite-native-thread_handle_to_thread_info/bld/../../binutils-gdb/gdb/infrun.c:6494

That pc value is actually the first address for foo(), which is what we want:

top-gdb> p/x pc
$22 = 0x400590

(Refer to my disassembly above to verify this.)

This code, which is in find_pc_partial_function_gnu_ifunc(), incorrectly
identifies this address, 0x400590, as belonging to qux:

  if (mapped_pc >= cache_pc_function_low
      && mapped_pc < cache_pc_function_high
      && section == cache_pc_function_section)
    goto return_cached_value;

2018-03-06 Thread kevinb at redhat dot com

--- Comment #12 from Kevin Buettner  ---
I'll note, too, that just setting a breakpoint on qux and then looking at the
locations reveals another problem...

(gdb) b qux
Breakpoint 1 at 0x400460: qux. (2 locations)
(gdb) info break
Num     Type           Disp Enb Address    What
1       breakpoint     keep y
1.1                         y   0x00400460 in qux(C*) at vau2.c:24
1.2                         y   0x004005b0 in qux(C*) at vau2.c:24
(gdb) x/4i 0x400460
   0x400460 :  callq  0x400430
   0x400465:  nopw   %cs:0x0(%rax,%rax,1)
   0x40046f:  nop
   0x400470 :  sub    $0x28,%rsp
(gdb) x/4i 0x4005b0
   0x4005b0 <_Z3quxP1C>:  push   %rbx
   0x4005b1 <_Z3quxP1C+1>:  mov    (%rdi),%rax
   0x4005b4 <_Z3quxP1C+4>:  test   %rax,%rax
   0x4005b7 <_Z3quxP1C+7>:  je     0x400460

Placing a breakpoint on 0x400460 is incorrect since this is not an actual entry
point to the function.
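
To illustrate why the lowest address of a split function is not a safe breakpoint location, here is a minimal C++ sketch. It is not GDB code: the names AddrRange, lowest_start, keep_entry_only, qux_pieces and qux_entry are hypothetical, and it assumes the entry address is known separately (here taken to be the address of the _Z3quxP1C symbol). The range bounds are read off the disassembly in comments #11 and #12.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open address range [low, high) covering one piece of a function.
struct AddrRange {
  std::uint64_t low;
  std::uint64_t high;
};

// Lowest start address over all pieces.  Precondition: ranges is non-empty.
std::uint64_t lowest_start(const std::vector<AddrRange>& ranges) {
  assert(!ranges.empty());
  auto it = std::min_element(
      ranges.begin(), ranges.end(),
      [](const AddrRange& a, const AddrRange& b) { return a.low < b.low; });
  return it->low;
}

// Keep only those candidate breakpoint addresses that equal the entry
// address.  The entry address is an input assumed to come from elsewhere.
std::vector<std::uint64_t> keep_entry_only(
    const std::vector<std::uint64_t>& candidates, std::uint64_t entry) {
  std::vector<std::uint64_t> kept;
  for (std::uint64_t addr : candidates)
    if (addr == entry)
      kept.push_back(addr);
  return kept;
}

// The two pieces of qux as they appear in the disassembly.
const std::vector<AddrRange> qux_pieces = {
  {0x400460, 0x400465},  // piece reached only via the je at qux+7
  {0x4005b0, 0x4005ec},  // piece starting at the _Z3quxP1C symbol
};
const std::uint64_t qux_entry = 0x4005b0;
```

With these values, lowest_start(qux_pieces) yields 0x400460, which is the piece reached by the je rather than the entry, while filtering the two reported locations by qux_entry leaves only 0x4005b0.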

2018-03-06 Thread kevinb at redhat dot com

--- Comment #13 from Kevin Buettner  ---
(In reply to Kevin Buettner from comment #11)

> This code, which is in find_pc_partial_function_gnu_ifunc(), incorrectly
> identifies this address, 0x400590, as belonging to qux:
> 
>   if (mapped_pc >= cache_pc_function_low
>       && mapped_pc < cache_pc_function_high
>       && section == cache_pc_function_section)
>     goto return_cached_value;

I've determined that if this code is disabled, then things work.  I'm not
suggesting this as a fix, just adding another data point.
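
The effect of that range test on a function split into two pieces can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not GDB's implementation: AddrRange, in_span, in_any_range, qux_ranges, foo_entry and self_check are hypothetical names, and the sketch assumes the cached low/high bounds run from the lowest to the highest address of qux (the thread does not show the cached values). The addresses come from the disassembly in comment #11.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open address range [low, high).
struct AddrRange {
  std::uint64_t low;
  std::uint64_t high;
};

// Single-span test, analogous in shape to the low/high comparison above.
bool in_span(std::uint64_t low, std::uint64_t high, std::uint64_t pc) {
  return pc >= low && pc < high;
}

// Per-piece test: pc belongs to the function only if some piece contains it.
bool in_any_range(const std::vector<AddrRange>& ranges, std::uint64_t pc) {
  for (const AddrRange& r : ranges)
    if (pc >= r.low && pc < r.high)
      return true;
  return false;
}

// The two pieces of qux, and the first address of foo.
const std::vector<AddrRange> qux_ranges = {
  {0x400460, 0x400465},  // the lone callq placed far from the main body
  {0x4005b0, 0x4005ec},  // main body of qux
};
const std::uint64_t foo_entry = 0x400590;

void self_check() {
  // One span from the lowest to the highest address of qux swallows foo.
  assert(in_span(0x400460, 0x4005ec, foo_entry));
  // Testing each piece separately does not.
  assert(!in_any_range(qux_ranges, foo_entry));
}
```

Under that assumption, a lookup for 0x400590 hits the single-span test and is attributed to qux, which matches the observed failure to stop in foo.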

2018-03-08 Thread kevinb at redhat dot com

--- Comment #15 from Kevin Buettner  ---
I've been focusing my attention on dwarf2read.c (in GDB).  I have a patch which
fixes this problem, but which introduces a bunch of test suite regressions. 
(So it's not a very good patch.)  I'll be away on Friday, but will resume
looking at it when I return on Monday.

Anyway, with that not-very-good patch in place, this is what I see:

(gdb) b qux
Breakpoint 1 at 0x4005b0: file vau2.c, line 24.
(gdb) x/i 0x400460
   0x400460 <_Z3quxP1C.cold.0>:  callq  0x400430
(gdb)
   0x400465:  nopw   %cs:0x0(%rax,%rax,1)
(gdb)
   0x40046f:  nop
(gdb) run
Starting program: /mesquite2/.ironwood2/84550/vau2 

Breakpoint 1, qux (p=0x7fffe098) at vau2.c:24
24        if (!p->c) __builtin_abort ();
(gdb) s
25        a = p->c->b;
(gdb) s
27        b = (a->a == 4)
(gdb) s
foo (p=0x7fffe08c) at vau2.c:8
8         return false;

Note that only one location is set for the breakpoint in qux.  Also, I'm now
able to step into foo().

2018-03-15 Thread kevinb at redhat dot com

--- Comment #16 from Kevin Buettner  ---
Created attachment 43671
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43671&action=edit
GDB patch - dwarf2read.c

I've attached the GDB patch that I'm currently testing.  When I try it against
either Jakub's test case or my slightly simplified version of it, I find
that I can step into foo() as expected.

I see no regressions when testing against the GDB testsuite using either

1) /bin/gcc on my Fedora 23 machine

or

2) A build of GCC using recent development sources

However, if I use #2 along with -O2, I see some apparent regressions.  I say
"apparent" because the ones that I've investigated so far aren't really
regressions.  It turns out that -O2 causes much of the initial part of the
function (upon which a breakpoint is being set) to be optimized away.  Having
an extra breakpoint on the .cold location will then sometimes cause GDB to
consider a test with a "continue" to be a success because it was able to hit
some other breakpoint, even if that breakpoint is not at the correct
line/location.  (Yes, these tests should probably be revised so that doesn't
happen.  That said, the gdb testsuite doesn't really work very well with
everything compiled with -O2 anyway.)