Nabih Estefan <nabiheste...@google.com> writes: > This patch series modifies the gdbstub to address a bug running a > multi cluster machine in QEMU using TCG.
Was this a downstream multi-cluster machine? Do we have any examples for upstream? It would be nice to add a gdbstub test case to cover the multi-inferior behaviour. > The machine where the > problem was seen had several clusters of CPUs with similar > architectures and similar memory layout all working with physical > addresses. It was discovered under gdb debugging that a breakpoint > targeting one cluster misfired on the wrong cluster quite frequently > with no possible workaround since gdb was also unaware of any > breakpoint in that cluster and simply reported SIGTRAP. > > The sequence that discovered the bug adds N inferiors and adds a > breakpoint on inferior N. Then after continuing an unrelated thread > stops the execution when its PC hits the same address of the break > targeting a different inferior. > > target extended-remote :1234 > add-inferior > inferior 2 > attach 2 > ... > add-inferior > inferior N > attach N > add-symbol-file /path/to/foo.elf > break foo >> Breakpoint 1 at 0xf00add > info break >> Num Type Disp Enb Address What >> 1 breakpoint keep y 0x00f00add in foo >> at foo.c:1234 inf N > continue >> Continuing. >> >> Thread 2.1 received signal SIGTRAP, Trace/breakpoint trap. >> [Switching to Thread 2.2] >> 0xf00add in ?? () > > The main problem is the current implementation of > 'tcg_insert_breakpoint' and 'tcg_remove_breakpoint' insert and remove > breakpoints to all the CPUs in the system regardless of what the > remote gdb protocol implements. > > If we look at the current source code of GDB we can see that the > function 'remote_target::insert_breakpoint' in file gdb/remote.c has > the intention to select the process ID of the inferior where the > break was inserted. > > int > remote_target::insert_breakpoint (struct gdbarch *gdbarch, > struct bp_target_info *bp_tgt) > { > ... > /* Make sure the remote is pointing at the right process, if > necessary. */ > if (!gdbarch_has_global_breakpoints (current_inferior ()->arch ())) > set_general_process (); > ... > } > > https:sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/remote.c;h=2c3988cb5075655e8a799d1cc5d4760ad8ed426e;hb=HEAD#l11023 > > This would not happen when we input the 'break' in gdb but it is > deferred until the time we execute the 'continue' command. Because we > might be currently selecting an inferior that is not the one where we > previously set the breakpoint, the remote gdb has to make sure we > move the focus to the process ID of the inferior where we inserted > the break. > > In the previous example this will translate to something like: > > HgpN.M > Z0,00f00add,4 > > Another thing that is wrong with the current implementation (and it > affects both TCG and KVM mode) is that the remote gdb protocol uses > 'Hg' and not 'Hc' to select the process. Functions > 'gdb_breakpoint_insert' and 'gdb_breakpoint_remove' receive wrongly > 'gdbserver_state.c_cpu' instead of 'gdbserver_state.g_cpu'. > > This is supported by the documentation of 'H op thread-id' where op = > 'c' is reserved to the step and continue: > > https:sourceware.org/gdb/current/onlinedocs/gdb.html/Packets.html > > And it can be verified that the function 'set_general_process' in the > previous code snippet will eventually call > 'remote_target::set_general_thread' and not > 'remote_target::set_continue_thread' if it needs to change focus. > > A third scenario that has to be taken into account is the case of a > break on a specific thread, for instance the sequence: > > inferior 1 > break bar thread 1.3 > break bar thread 1.4 > > The remote protocol expects the gdbstub to apply the break to the > process ID of inferior 1 and considers the specific thread-id as a > breakpoint condition (not too different from a 'break if'). > > In this case the packet exchange may look like: > > Hgp1.1 > Z0,00ba2add,4 > > There wouldn't be an independent set of packets for 'Hgp1.3' and > 'Hgp1.4'. > > In the gdb source code, the handling of the specific thread-id > happens during breakpoint evaluation in function > 'bpstat_check_breakpoint_conditions' of file gdb/breakpoint.c > > https:sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/breakpoint.c;h=17bd627f867cf3d4dc81322ed1919ba40cbb237d;hb=HEAD#l5550 > > The proposed fix inserts or removes a breakpoint to all the cpus > sharing the same process ID as the one selected with the latest > received 'Hg' packet. > > Roque Arcudia Hernandez (2): > gdbstub: Fix wrong CPUState pointer in breakpoint functions > gdbstub: Apply breakpoints only to the selected PID > > accel/tcg/tcg-accel-ops.c | 37 +++++++++++++++++++++++-------------- > gdbstub/gdbstub.c | 10 ++++++++-- > gdbstub/internals.h | 13 +++++++++++-- > include/exec/gdbstub.h | 12 ++++++++++++ > 4 files changed, 54 insertions(+), 18 deletions(-) -- Alex Bennée Virtualisation Tech Lead @ Linaro