On Wed Mar 13, 2024 at 7:03 AM AEST, Alex Bennée wrote:
> "Nicholas Piggin" <npig...@gmail.com> writes:
>
> > On Tue Mar 12, 2024 at 11:33 PM AEST, Alex Bennée wrote:
> >> Nicholas Piggin <npig...@gmail.com> writes:
> >>
> >> > This reverts commit 1f881ea4a444ef36a8b6907b0b82be4b3af253a2.
> >> >
> >> > That commit causes reverse_debugging.py test failures, and does
> >> > not seem to solve the root cause of the problem x86-64 still
> >> > hangs in record/replay tests.
> >>
> >> I'm still finding the reverse debugging tests failing with this series.
> >
> > :(
> >
> > In gitlab CI or your own testing? What are you running exactly?
>
> My own - my mistake I didn't get a clean build because of the format
> bug. However I'm seeing new failures:
>
> env QEMU_TEST_FLAKY_TESTS=1 AVOCADO_TIMEOUT_EXPECTED=1 ./pyvenv/bin/avocado run ./tests/avocado/reverse_debugging.py
> Fetching asset from ./tests/avocado/reverse_debugging.py:ReverseDebugging_AArch64.test_aarch64_virt
> JOB ID : bd4b29f7afaa24dc6e32933ea9bc5e46bbc3a5a4
> JOB LOG : /home/alex/avocado/job-results/job-2024-03-12T20.58-bd4b29f/job.log
> (1/5) ./tests/avocado/reverse_debugging.py:ReverseDebugging_X86_64.test_x86_64_pc: PASS (4.49 s)
> (2/5) ./tests/avocado/reverse_debugging.py:ReverseDebugging_X86_64.test_x86_64_q35: PASS (4.50 s)
> (3/5) ./tests/avocado/reverse_debugging.py:ReverseDebugging_AArch64.test_aarch64_virt: FAIL: Invalid PC (read ffff2d941e4d7f28 instead of ffff2d941e4d7f2c) (3.06 s)

Okay, this is the new test I added. It runs for 1 second then
reverse-steps from the end of the trace. aarch64 is flaky: pc is at a
different place at the same icount after the reverse-step (which is
basically the second replay). This indicates some non-determinism in
execution, or something in machine reset or migration is not restoring
the state exactly.

aarch64 ran okay a few times, including in gitlab CI, before I posted
the series, but it turns out it does break quite often too. x86 has a
problem with this too, so I disabled it there. I'll disable it for
aarch64 too for now.

x86 and aarch64 can run the replay_linux.py test quite well (after this
series), which is much longer and more complicated. The difference
there is that it is only a single replay; it never resets the machine
or loads the initial snapshot for reverse-debugging. So to me that
indicates that execution is probably deterministic, but it's the
reset/reload that has the problem.

Thanks,
Nick
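
For illustration only, the kind of check that fails here can be sketched as comparing the pc seen at each icount across two replays of the same recording. This is not the code in reverse_debugging.py: the function name first_divergence and the icount/pc values below are made up, and the sketch assumes each run has already been reduced to an icount -> pc mapping.

```python
def first_divergence(first_run, second_run):
    """Return (icount, pc_first, pc_second) at the earliest icount where
    the two runs disagree, or None if they agree everywhere they overlap.

    Both arguments are hypothetical {icount: pc} mappings, one per replay.
    """
    # Only icounts sampled in both runs can be compared.
    for icount in sorted(first_run.keys() & second_run.keys()):
        if first_run[icount] != second_run[icount]:
            return icount, first_run[icount], second_run[icount]
    return None


# Made-up sample data: the second replay is one instruction behind
# at icount 1002, so a deterministic-replay check should flag it.
first = {1000: 0x1000, 1001: 0x1004, 1002: 0x1008}
second = {1000: 0x1000, 1001: 0x1004, 1002: 0x1004}

result = first_divergence(first, second)
if result is None:
    print("replays agree")
else:
    icount, pc_a, pc_b = result
    print(f"diverged at icount {icount}: pc {pc_a:#x} vs {pc_b:#x}")
```

If a single replay always agrees with the recording but a replay after reset or snapshot load does not, that points at state restoration rather than at execution itself.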