Hi Daniel,
On 4/3/25 19:33, Daniel P. Berrangé wrote:
There are two race conditions in the recently added virtio balloon
test:
* The /dev/vda device node is not ready
* The virtio-balloon driver has not issued the first stats refresh
To fix the former, monitor dmesg for a line about 'vda'.
To fix the latter, retry the stats query until seeing fresh data.
Adding 'quiet' to the kernel command line reduces serial output
which otherwise slows boot, making it less likely to hit the former
race too.
Signed-off-by: Daniel P. Berrangé <berra...@redhat.com>
---
tests/functional/test_virtio_balloon.py | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/tests/functional/test_virtio_balloon.py b/tests/functional/test_virtio_balloon.py
index 67b48e1b4e..308d197eb3 100755
--- a/tests/functional/test_virtio_balloon.py
+++ b/tests/functional/test_virtio_balloon.py
@@ -32,7 +32,7 @@ class VirtioBalloonx86(QemuSystemTest):
         'e3c1b309d9203604922d6e255c2c5d098a309c2d46215d8fc026954f3c5c27a0')
     DEFAULT_KERNEL_PARAMS = ('root=/dev/vda1 console=ttyS0 net.ifnames=0 '
-                             'rd.rescue')
+                             'rd.rescue quiet')
     def wait_for_console_pattern(self, success_message, vm=None):
         wait_for_console_pattern(
@@ -47,6 +47,9 @@ def mount_root(self):
         prompt = '# '
         self.wait_for_console_pattern(prompt)
+        # Synchronize on virtio-block driver creating the root device
+        exec_command_and_wait_for_pattern(self, "while ! (dmesg -c | grep vda:) ; do sleep 1 ; done", "vda1")
+
         exec_command_and_wait_for_pattern(self, 'mount /dev/vda1 /sysroot',
                                           prompt)
         exec_command_and_wait_for_pattern(self, 'chroot /sysroot',
@@ -65,10 +68,21 @@ def assert_initial_stats(self):
             assert val == UNSET_STATS_VALUE
     def assert_running_stats(self, then):
-        ret = self.vm.qmp('qom-get',
-                          {'path': '/machine/peripheral/balloon',
-                           'property': 'guest-stats'})['return']
-        when = ret.get('last-update')
+        # We told the QEMU to refresh stats every 100ms, but
+        # there can be a delay between virtio-balloon driver
+        # being modprobed and seeing the first stats refresh
+        # Retry a few times for robustness under heavy load
+        retries = 10
+        when = 0
+        while when == 0 and retries:
+            ret = self.vm.qmp('qom-get',
+                              {'path': '/machine/peripheral/balloon',
+                               'property': 'guest-stats'})['return']
+            when = ret.get('last-update')
+            if when == 0:
+                retries = retries - 1
+                time.sleep(0.5)
+
         now = time.time()
         assert when > then and when < now
Unfortunately I'm still getting a timeout:
https://gitlab.com/philmd/qemu/-/jobs/9318095233
2025-03-05 12:09:55,360 - DEBUG: Console interaction: success_msg='Entering emergency mode.' failure_msg='Kernel panic - not syncing' send_string='None'
2025-03-05 12:09:55,360 - DEBUG: Opening console socket
2025-03-05 12:10:32,722 - DEBUG: Console interaction: success_msg='# ' failure_msg='Kernel panic - not syncing' send_string='None'
2025-03-05 12:10:32,823 - DEBUG: Console interaction: success_msg='vda1' failure_msg='None' send_string='while ! (dmesg -c | grep vda:) ; do sleep 1 ; done
2025-03-05 12:10:30,534: Warning: /dev/vda1 does not exist
2025-03-05 12:10:30,535:
2025-03-05 12:10:30,598: Generating "/run/initramfs/rdsosreport.txt"
2025-03-05 12:10:32,720:
2025-03-05 12:10:32,721:
2025-03-05 12:10:32,722: Entering emergency mode.
2025-03-05 12:10:32,724: Exit the shell to continue.
2025-03-05 12:10:32,726: Type "journalctl" to view system logs.
2025-03-05 12:10:32,727: You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
2025-03-05 12:10:32,728: after mounting them and attach it to a bug report.
2025-03-05 12:10:32,729:
2025-03-05 12:10:32,731:
2025-03-05 12:10:32,823: :/#
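
As an aside, the stats retry loop in assert_running_stats could be
factored into a small bounded polling helper, so that any similar wait
fails with an assertion rather than hanging until the job timeout. A
minimal sketch, assuming nothing about the existing test helpers:
poll_until and its parameters are hypothetical names, not part of
tests/functional, and the lambda stands in for the real QMP 'qom-get'
query.

```python
import time


def poll_until(fetch, is_ready, retries=10, delay=0.5, sleep=time.sleep):
    """Call fetch() until is_ready(value) is true or retries run out.

    Returns the last fetched value either way, so the caller can still
    assert on it and get a meaningful failure instead of hanging.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    value = fetch()
    for _ in range(retries - 1):
        if is_ready(value):
            break
        sleep(delay)
        value = fetch()
    return value


if __name__ == "__main__":
    # Stand-in for the QMP query: the first two polls report a
    # 'last-update' of 0 (no refresh yet), the third reports fresh data.
    samples = iter([0, 0, 1741176632])
    when = poll_until(lambda: next(samples), lambda v: v != 0, delay=0.01)
    print(when)
```

In the test this would be called with a fetch function that returns
ret.get('last-update') and an is_ready check of "when != 0", keeping the
existing "when > then and when < now" assertion afterwards.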