Hi,

We are running into an issue using GDB with our RTOS (LOS178) on the QEMU
x86_64 system emulator without KVM. The same test works correctly when we
re-run it with KVM enabled.

The scenario is as follows:

  *   ptrace sets up a HW watchpoint using a debug register
  *   ptrace enables single-stepping of the user program through the TF bit
      in the EFLAGS register
  *   ptrace resumes the program
  *   the next instruction the program executes shall raise a debug exception
      due to the watchpoint event
  *   the same instruction shall also raise a debug exception due to the
      single-step event

In KVM mode, the TF bit in EFLAGS causes "int 1" in user mode. It is handled
by the LOS178 ptrace_debugerr() function (the user-mode debug exception
handler), which performs the following steps:

  *   clears the TF bit in EFLAGS
  *   checks the Debug Status Register (DR6)
  *   since the watchpoint bit (B0 in DR6) is set, clears the single-step bit
      (BS) in DR6 and handles the watchpoint event

This sequence is described in the Debug Status Register (DR6) section of the
Intel 64 SDM
(https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html)
as follows:
* BS (single step) flag (bit 14) - Indicates (when set) that the debug
exception was triggered by the single-step execution mode (enabled with the TF
flag in the EFLAGS register). The single-step mode is the highest-priority
debug exception. When the BS flag is set, any of the other debug status bits
also may be set.
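
To make the handler steps above concrete, here is a minimal Python model of
them. It is only a sketch and not the LOS178 code: the function name and the
returned event strings are made up for illustration. Only the bit positions
(TF in EFLAGS, B0 and BS in DR6) come from the SDM.

```python
# Hypothetical model of the user-mode debug exception handling steps.
# Not the LOS178 implementation; names and return values are illustrative.
EFLAGS_TF = 1 << 8    # trap flag: enables single-step
DR6_B0 = 1 << 0       # breakpoint/watchpoint condition 0 detected
DR6_BS = 1 << 14      # single-step


def handle_user_debug_exception(eflags, dr6):
    """Return (new_eflags, new_dr6, event) for one debug exception."""
    eflags &= ~EFLAGS_TF              # step 1: clear TF
    if dr6 & DR6_B0:                  # step 2: check DR6
        dr6 &= ~DR6_BS                # step 3: watchpoint wins, drop BS
        return eflags, dr6, "watchpoint"
    if dr6 & DR6_BS:
        return eflags, dr6, "single-step"
    return eflags, dr6, "unknown"
```

With both B0 and BS set in DR6, as in the failing case, this model reports a
single watchpoint event and leaves TF cleared, which is the behavior we see
under KVM.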

However, in no-KVM mode the CPU behaves differently:

  *   the debug exception is triggered by the watchpoint event, and execution
      enters the etrap1 function (the beginning of watchpoint event handling)
  *   the TF bit in EFLAGS then causes the CPU to execute int 1 in kernel
      mode, inside etrap1
  *   a debug exception in kernel mode is handled by SKDB, if it is
      installed, so LOS178 enters SKDB
  *   SKDB does not ignore any debug exception, even when it has not used any
      debugging facility itself (BPs, WPs, single-step), so it puts LOS178
      into SKDB command-line mode

This sequence is confirmed by the CPU state after SKDB is entered. Here we see
that etrap1 was interrupted at its very beginning:
* t
fp=0x00007ffffc67bee8, pc=0xffffffff80005348 <etrap1>, sp=0x00007fffffffdfc0

DR6 (DSTAT) value shows there are 2 bits set - B0 (watchpoint) and BS 
(single-step):
* r
...
             DR0              DR1              DR2              DR3
0000000000213610 0000000000000000 0000000000000000 0000000000000000
           (DR4)            (DR5)            DSTAT            DCTRL
0000000000000000 0000000000000000 00000000ffff4ff1 00000000001d0402
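
For anyone who wants to double-check the register values, here is a small
Python sketch that decodes them using the DR6/DR7 bit layout from the SDM. The
helper names are made up for illustration, and it assumes that DCTRL in the
SKDB output is DR7.

```python
# Decode DR6 status flags and one DR7 breakpoint slot (layout per Intel SDM).
# Helper names are illustrative; DCTRL is assumed to be DR7.
DR6_FLAGS = {0: "B0", 1: "B1", 2: "B2", 3: "B3", 13: "BD", 14: "BS", 15: "BT"}
RW_NAMES = ("execute", "write", "io", "read/write")   # R/Wn encodings 00..11
LEN_BYTES = (1, 2, 8, 4)                              # LENn encodings 00..11


def decode_dr6(value):
    """Return the names of the defined DR6 status flags that are set."""
    return [name for bit, name in sorted(DR6_FLAGS.items())
            if value & (1 << bit)]


def decode_dr7_slot(value, n):
    """Describe breakpoint slot n (0..3) of a DR7 value."""
    rw = (value >> (16 + 4 * n)) & 0b11
    length = (value >> (18 + 4 * n)) & 0b11
    return {
        "enabled": bool(value & (0b11 << (2 * n))),   # Ln or Gn set
        "type": RW_NAMES[rw],
        "size": LEN_BYTES[length],
    }


print(decode_dr6(0xFFFF4FF1))
print(decode_dr7_slot(0x001D0402, 0))
```

On the values above this reports B0 and BS in DSTAT, and slot 0 of DCTRL as an
enabled 4-byte write watchpoint, i.e. the watchpoint on the address in DR0.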

I've tried a few different -cpu flags for QEMU, but the behavior does not
change, so it appears that the no-KVM QEMU implementation may have a bug that
makes the CPU deliver the exceptions in the wrong order in this scenario.
I have not had a chance to reproduce the same scenario on Linux. Looking
through the source code, though, it appears that at least KGDB checks whether
it set up the single-step itself and whether the user thread has its
single-step flag set, and ignores unexpected single-step exceptions. It may
therefore handle this case seamlessly, so the issue might be hidden on Linux
running atop no-KVM QEMU.
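
The kind of filtering described above could look roughly like the following
Python sketch. This is an assumption about the general approach only: it is
not taken from KGDB or SKDB, and the function and parameter names are
hypothetical.

```python
# Hypothetical filter for a kernel debugger's debug exception entry point.
# Not KGDB or SKDB code; it only illustrates ignoring events that the
# debugger did not arm itself.
DR6_BS = 1 << 14      # single-step


def should_enter_kernel_debugger(dr6, step_requested, owned_slots=frozenset()):
    """Return True only if the debugger itself armed the reported event.

    step_requested: the debugger asked for a single-step.
    owned_slots: debug register slots (0..3) programmed by the debugger.
    """
    hit_slots = {n for n in range(4) if dr6 & (1 << n)}
    if hit_slots & set(owned_slots):
        return True                   # one of the debugger's own BPs/WPs hit
    if dr6 & DR6_BS and step_requested:
        return True                   # a single-step the debugger asked for
    return False                      # somebody else's event: ignore it
```

With the DSTAT value from our dump and a debugger that armed nothing, this
returns False, so a kernel debugger written this way would not stop on the
unexpected single-step.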

Any thoughts or further steps?

Thanks,
Ravi Bhagavatula
