On 20/02/2025 19.39, Peter Maydell wrote:
I'm trying to debug some functional tests that fail for me
with 'make check-functional' on a debug build. Consistently
(well, same set of tests in two runs) when I run
'make -j8 check-functional' these fail:

  7/44 qemu:func-thorough+func-arm-thorough+thorough / func-arm-arm_sx1
                         ERROR           173.31s   exit status 1
10/44 qemu:func-thorough+func-aarch64-thorough+thorough /
func-aarch64-aarch64_virt            TIMEOUT         720.04s   killed
by signal 15 SIGTERM
11/44 qemu:func-thorough+func-arm-thorough+thorough /
func-arm-arm_aspeed_ast2600              TIMEOUT         720.07s
killed by signal 15 SIGTERM
12/44 qemu:func-thorough+func-aarch64-thorough+thorough /
func-aarch64-aarch64_sbsaref_alpine  TIMEOUT         720.07s   killed
by signal 15 SIGTERM
40/44 qemu:func-thorough+func-arm-thorough+thorough /
func-arm-arm_aspeed_ast2500              TIMEOUT         480.01s
killed by signal 15 SIGTERM

The aarch64-virt one is gpu issue, so I know about that one.
The others pass OK on a clang no-debug sanitizer build.

If I try to run just the sx1 tests "by hand":

$ (cd build/x86 && PYTHONPATH=../../python:../../tests/functional
QEMU_TEST_QEMU_BINARY=./qemu-system-arm ./pyvenv/bin/python3
../../tests/functional/test_arm_sx1.py)
TAP version 13
ok 1 test_arm_sx1.SX1Test.test_arm_sx1_flash
ok 2 test_arm_sx1.SX1Test.test_arm_sx1_initrd
ok 3 test_arm_sx1.SX1Test.test_arm_sx1_sd
1..3

they pass; but inside the test framework that third sd test
errors: testlog-thorough.txt says:
[...]
timed out after 60 seconds
[...]
which I interpret to mean "we waited the 60 seconds the test says,
but the test didn't exit within that time".

Any suggestions for how to debug?

Some TCG-based tests are slowing down very much when running on a shared hyperthreaded 
CPU ... Do you have 8 real cores in your system, or rather 4 real cores with 2 SMT 
threads each? In the latter case, have a try whether "make -j4" works better.

We apparently also increased the timeout in this test in the past already, see 
commit 92ee59bf56ba42954166e56ab112afe10f3c7556 ... does it work better if you 
increase the timeout even further?

(Also the console.log is empty regardless of whether the
test passes or fails; this doesn't seem right.)

I think we only log the console output when we look for strings
in the output. Since this test does not look for any strings,
there is no log.
Something like this causes some log to be generated:

diff --git a/tests/functional/test_arm_sx1.py b/tests/functional/test_arm_sx1.py
--- a/tests/functional/test_arm_sx1.py
+++ b/tests/functional/test_arm_sx1.py
@@ -43,7 +43,8 @@ def test_arm_sx1_initrd(self):
         self.vm.add_args('-append', f'kunit.enable=0 rdinit=/sbin/init 
{self.CONSOLE_ARGS}')
         self.vm.add_args('-no-reboot')
         self.launch_kernel(zimage_path,
-                           initrd=initrd_path)
+                           initrd=initrd_path,
+                           wait_for='Boot successful')
         self.vm.wait(timeout=60)
def test_arm_sx1_sd(self):
@@ -54,7 +55,7 @@ def test_arm_sx1_sd(self):
         self.vm.add_args('-no-reboot')
         self.vm.add_args('-snapshot')
         self.vm.add_args('-drive', f'format=raw,if=sd,file={sd_fs_path}')
-        self.launch_kernel(zimage_path)
+        self.launch_kernel(zimage_path, wait_for='Boot successful')
         self.vm.wait(timeout=60)
def test_arm_sx1_flash(self):
@@ -65,7 +66,7 @@ def test_arm_sx1_flash(self):
         self.vm.add_args('-no-reboot')
         self.vm.add_args('-snapshot')
         self.vm.add_args('-drive', f'format=raw,if=pflash,file={flash_path}')
-        self.launch_kernel(zimage_path)
+        self.launch_kernel(zimage_path, wait_for='Boot successful')
         self.vm.wait(timeout=60)
if __name__ == '__main__':

... but maybe we should also provide a knob to flush the serial console
when tearing down the test setup?

 Thomas


Reply via email to