Re: reverse-{debugging,continue} not working on v7.2.0, i386 guest

Pavel Dovgalyuk Thu, 19 Jan 2023 01:31:10 -0800

On 19.01.2023 07:40, Hyeonggon Yoo wrote:

On Wed, Jan 18, 2023 at 12:39:16PM +0300, Pavel Dovgalyuk wrote:

Sometimes replay (or reverse debugging) has problems due to an incomplete or
incorrect virtual device save/load implementation.


Can you try removing -cpu from your command line?

Or you can provide the files you load and I'll debug this case.


Ah, sorry to bother you. I had set the breakpoint _after_ the kernel panic;
setting the breakpoint before boot worked fine. Everything seems great!


Glad to hear that.
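
For readers following along, the working order of operations described above (breakpoint first, then run forward to the panic, then go backwards) might look roughly like the sketch below. It is not taken from the thread: the vmlinux path, the gdb_rr helper name, and the DRY_RUN switch are assumptions made for illustration, and it assumes the replay instance was started with -s (gdbstub on TCP port 1234), optionally with -S so the guest is frozen until gdb attaches.

```shell
# Hypothetical helper: attach gdb to a replaying QEMU and set the breakpoint
# before the guest has run, so reverse-continue has something to stop at.
# With DRY_RUN set, the command is only printed instead of executed.
gdb_rr() {
    # $1: path to the unstripped kernel image (assumed name: vmlinux)
    ${DRY_RUN:+echo} gdb "${1:-vmlinux}" \
        -ex 'target remote localhost:1234' \
        -ex 'break __list_del_entry_valid' \
        -ex 'continue'
}

# Print the gdb invocation for review; drop DRY_RUN=1 to actually run it.
DRY_RUN=1 gdb_rr vmlinux

# Once the guest has reached the panic, type at the (gdb) prompt:
#   reverse-continue
```

The point of the ordering is simply that the breakpoint already exists while the recorded execution is replayed.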


Just a side question: is there a reason QEMU record/replay
does not support -smp N (N > 1)? Is this feature planned, or should I use
other tools to debug SMP bugs?


Parallel SMP deterministic emulation is very hard.

However, I think multiple-cores-on-single-core deterministic emulation will be supported someday.

On 18.01.2023 11:47, Hyeonggon Yoo wrote:

On Wed, Jan 18, 2023 at 10:12:48AM +0300, Pavel Dovgalyuk wrote:

As replay works well, the reverse debugging should be ok too.
But for "going back" it needs a VM snapshot that can be used for reload.

Snapshots are saved on qcow2 images connected to QEMU.
Therefore you need to add an empty qcow2 to your command line with the
following option: -drive file=empty.qcow2,if=none,id=rr
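
The empty image itself can be created with qemu-img. A minimal sketch, assuming the file name empty.qcow2 from the option above; the 1G virtual size is an arbitrary choice, not something specified in the thread:

```shell
# Create an empty qcow2 image that QEMU can store VM snapshots in.
# The 1G virtual size is an assumption; the file starts out small on disk.
if command -v qemu-img >/dev/null 2>&1; then
    qemu-img create -f qcow2 empty.qcow2 1G
else
    echo "qemu-img not found; install the QEMU tools first" >&2
fi
```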


Oh, I guessed it was possible to reverse-debug without a snapshot,
and your comments definitely helped! Adding an empty disk and snapshotting
solved it.

But I faced another problem:

(gdb) b __list_del_entry_valid
(gdb) reverse-continue

(it stuck forever)
^C
(gdb) info registers
eax            0xefe19f74          -270426252
ecx            0x0                 0
edx            0xefe19f74          -270426252
ebx            0xf6ff4620          -151042528
esp            0xc02e9a34          0xc02e9a34
ebp            0xc02e9a6c          0xc02e9a6c
esi            0xc4fffb20          -989856992
edi            0xefe19f70          -270426256
eip            0xc1f38400          0xc1f38400 <__list_del_entry_valid>
eflags         0x6                 [ IOPL=0 PF ]
cs             0x60                96
ss             0x68                104
ds             0x7b                123
es             0x7b                123
fs             0xd8                216
gs             0x0                 0
fs_base        0x31cb4000          835403776
gs_base        0x0                 0
k_gs_base      0x0                 0
cr0            0x80050033          [ PG AM WP NE ET MP PE ]
cr2            0xffcb1000          -3469312
cr3            0x534e000           [ PDBR=0 PCID=0 ]
cr4            0x406d0             [ PSE MCE PGE OSFXSR OSXMMEXCPT OSXSAVE ]
cr8            0x1                 1
efer           0x0                 [ ]

It is stuck here, and this is not the 'last breakpoint hit' from the panic
(it's early in boot). The stepi, nexti, and continue commands do not work and
there is no forward progress (eip doesn't change).

Did I miss something or do something wrong?

Thank you so much for your help.

--
Best regards,
Hyeonggon


And you also need to add rrsnapshot to icount for creating the snapshot at
the start of VM execution:
-icount shift=auto,rr=record,rrfile=$REPLAY_FILE,rrsnapshot=start
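
Putting the two additions together with the command line from the original report, the record and replay runs might look like the sketch below. The qemu_rr helper, the DRY_RUN switch, and the default replay file name are assumptions made for illustration. Passing the same rrsnapshot name on replay (so the initial snapshot is loaded) follows my reading of the QEMU replay documentation rather than anything stated in this thread.

```shell
# Assumed default; the thread only refers to $REPLAY_FILE.
REPLAY_FILE=${REPLAY_FILE:-replay.bin}

# Hypothetical helper: run (or, with DRY_RUN set, just print) the QEMU
# command line for the given record/replay mode.
qemu_rr() {
    # $1: "record" or "replay"; remaining arguments are appended as-is
    mode=$1; shift
    ${DRY_RUN:+echo} qemu-system-i386 \
        -icount "shift=auto,rr=$mode,rrfile=$REPLAY_FILE,rrsnapshot=start" \
        -kernel arch/x86/boot/bzImage \
        -cpu SandyBridge \
        -initrd debian-i386.cgz \
        -smp 1 \
        -m 1024 \
        -nographic \
        -net none \
        -drive file=empty.qcow2,if=none,id=rr \
        -append "page_owner=on console=ttyS0" \
        "$@"
}

# Print both command lines for review; drop DRY_RUN=1 to actually run them.
DRY_RUN=1 qemu_rr record
DRY_RUN=1 qemu_rr replay -s    # -s opens the gdbstub on TCP port 1234
```

Keeping every option except the rr mode identical between the two runs matters, since replay expects the same virtual machine configuration that was recorded.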


On 18.01.2023 09:14, Hyeonggon Yoo wrote:

Hello QEMU folks.
I was struggling to fix a recent heisenbug in the Linux kernel,
and fortunately the bug was reproducible with TCG and -smp 1.

I'm using QEMU version 7.2.0, and the guest architecture is i386.
I tried to inspect the bug using the record/replay and reverse-debugging
features in QEMU.


recorded with:

qemu-system-i386 \
           -icount shift=auto,rr=record,rrfile=$REPLAY_FILE \
           -kernel arch/x86/boot/bzImage \
           -cpu SandyBridge \
           -initrd debian-i386.cgz \
           -smp 1 \
           -m 1024 \
           -nographic \
           -net none \
           -append "page_owner=on console=ttyS0"

and replayed with:

qemu-system-i386 \
           -icount shift=auto,rr=replay,rrfile=$REPLAY_FILE \
           -kernel arch/x86/boot/bzImage \
           -cpu SandyBridge \
           -initrd debian-i386.cgz \
           -smp 1 \
           -m 1024 \
           -nographic \
           -net none \
           -s \
           -append "page_owner=on console=ttyS0"

(I'm using an initrd image instead of a disk file.)

Record and replay work well. The bug is reliably reproduced
when replaying, but when I try to reverse-continue or reverse-stepi after
the kernel panic, gdb only says:

        "remote failure reply 'E14'"

Is there something I'm missing, or does record/replay not work with
QEMU v7.2.0 or i386?

--
Best regards,
Hyeonggon
