> Other than that, the structure of the patch looks OK, but
> I think you need to identify the cause of the problems
> with SMP setups that you mention in the cover letter,
> since they suggest that there's a bug lurking in here
> somewhere.

In the current patch, `hvf_arch_update_guest_debug()` enables exiting the guest on debug exceptions only for the vCPUs that have inserted software/hardware breakpoints or are single-stepping. In SMP setups this logic looks flawed: for example, if vCPU #1 sets a software breakpoint and vCPU #2 hits it, the debug exception generated on vCPU #2 will not exit the guest and will lead to a panic for an unexpected BRK. A possible fix is to enable exiting the guest on debug exceptions for all vCPUs (and not just the ones that have inserted breakpoints). Is this the way to go?

There's also a second, analogous issue, for which it feels like I'm missing something. If a software breakpoint is inserted through GDB from vCPU #1 and vCPU #2 later hits it, then GDB fails with 'Cannot remove breakpoints' when trying to resume execution after the hit. This happens because `hvf_find_sw_breakpoint()` returns an error: it (correctly) doesn't find any software breakpoint in vCPU #2's queue (`cpu->hvf->hvf_sw_breakpoints`). A possible fix seems to be modifying `hvf_find_sw_breakpoint()` so that it searches for the breakpoint in the queues of all vCPUs. However, I've skimmed through the analogous routines for TCG and KVM and found nothing resembling this fix, so I wonder why TCG and KVM don't fail in my example GDB scenario.

With both proposed fixes applied, the patch seems to work well in SMP setups too, at least in my limited tests.

For the next round, I'll also split the patch as suggested.

Thanks,
Francesco
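
P.S. To make the second issue and its proposed fix concrete, here is a rough standalone sketch. It is not the QEMU code: `VcpuBreakpoints`, `insert_sw_breakpoint()`, `find_sw_breakpoint_local()` and `find_sw_breakpoint_owner()` are hypothetical stand-ins for the per-vCPU queue and its lookup, and the fixed-size array is only there to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BPS 8

/* Hypothetical stand-in for one vCPU's software breakpoint queue. */
typedef struct {
    uint64_t pc[MAX_BPS];   /* addresses of inserted breakpoints */
    size_t count;
} VcpuBreakpoints;

/* Record a breakpoint on the vCPU that inserted it; -1 if the queue is full. */
static int insert_sw_breakpoint(VcpuBreakpoints *cpu, uint64_t pc)
{
    assert(cpu != NULL);
    if (cpu->count == MAX_BPS) {
        return -1;
    }
    cpu->pc[cpu->count++] = pc;
    return 0;
}

/* Per-vCPU lookup: only the given vCPU's own queue is searched. */
static int find_sw_breakpoint_local(const VcpuBreakpoints *cpu, uint64_t pc)
{
    assert(cpu != NULL);
    for (size_t i = 0; i < cpu->count; i++) {
        if (cpu->pc[i] == pc) {
            return 1;
        }
    }
    return 0;
}

/*
 * Lookup across all vCPUs: returns the index of the vCPU whose queue
 * holds the breakpoint, or -1 if no vCPU has it.
 */
static int find_sw_breakpoint_owner(const VcpuBreakpoints *cpus, size_t ncpus,
                                    uint64_t pc)
{
    for (size_t c = 0; c < ncpus; c++) {
        if (find_sw_breakpoint_local(&cpus[c], pc)) {
            return (int)c;
        }
    }
    return -1;
}
```

In this model, a breakpoint inserted on the first vCPU is invisible to a local lookup on the second one, which mirrors the 'Cannot remove breakpoints' failure; the lookup across all queues finds it regardless of which vCPU hit it.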