Sometimes it takes longer than usual for Xen to even start booting,
mostly due to firmware and grub fetching large boot files. In some jobs
the current timeout is very close to the actual time needed, and
occasionally (rarely for now) a test fails because the timeout expires
in the middle of dom0 booting. This will happen more often as the
initramfs grows (and as tests become more complex).
This has been observed on some dom0pvh-hvm jobs, at least on runners hw3
and hw11.

Switch to using expect (console.exp) for more robust test output
handling. This allows waiting separately for Xen to start booting and
then for the test to complete. For now, set both of those timeouts to
120s, which pessimistically bumps the timeout for the whole test to 240s
(from 120s).

Some messages use regex; use 'expect -re' for all of them for
consistency, even though not all of them strictly need it (yet).

Signed-off-by: Marek Marczykowski-Górecki <marma...@invisiblethingslab.com>
---
Changes in v2:
- replace the previous "ci: increase timeout for hw tests" patch with a
  change to how the console is interacted with
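
For illustration only, the per-stage timeout idea can be sketched as a
plain shell polling loop (console.exp does the waiting with expect; the
helper names wait_for and run_stages and the 1s polling interval are
assumptions made up for this sketch). Each stage gets its own budget, so
the worst case for the whole run is the sum of the two:

```shell
# Hypothetical helper: poll a log file until a fixed string appears or the
# per-stage budget (in seconds) runs out. Returns 0 on match, 1 on timeout.
wait_for() {
    log=$1 pattern=$2 budget=$3
    while ! grep -qF -- "$pattern" "$log" 2>/dev/null; do
        [ "$budget" -le 0 ] && return 1
        sleep 1
        budget=$((budget - 1))
    done
    return 0
}

# Two stages, each with its own budget: first Xen starting to boot, then
# the end of the test. Returns 1 if the first stage times out, 2 if the
# second one does.
run_stages() {
    log=$1 stage_timeout=$2
    wait_for "$log" "Latest ChangeSet: " "$stage_timeout" || return 1
    wait_for "$log" "Welcome to Alpine Linux" "$stage_timeout" || return 2
}
```

With a 120s budget per stage, run_stages can take roughly 240s in total
before giving up, matching the numbers in the commit message.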

This needs a containers rebuild.
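
A side note on exit status handling, as a minimal sketch rather than the
script itself: under 'set -e -o pipefail' a failing pipeline ends the
script immediately unless its status is consumed in the same command
list. The producer and run_and_capture functions below are hypothetical
stand-ins (producer plays the role of console.exp), and a shell with
pipefail support (bash, busybox ash) is assumed:

```shell
# Assumes a shell with 'pipefail' support (bash, busybox ash).
set -e -o pipefail

# Hypothetical stand-in for console.exp: prints CR-terminated output and
# exits with the status given as $1.
producer() {
    printf 'line one\r\nline two\r\n'
    return "$1"
}

# Run the pipeline and record its status. The '|| rc=$?' keeps 'set -e'
# from ending the script when the producer fails, so later steps (such as
# collecting a report) can still run before the final exit.
run_and_capture() {
    rc=0
    producer "$1" | sed 's/\r\+$//' || rc=$?
    echo "status=$rc"
}
```

Calling run_and_capture 3 prints the filtered output followed by
"status=3"; the caller stays alive and can exit with that status later.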
---
 automation/build/alpine/3.18-arm64v8.dockerfile |  1 +-
 automation/scripts/console.exp                  | 23 ++++++--
 automation/scripts/qubes-x86-64.sh              | 52 ++++--------------
 3 files changed, 32 insertions(+), 44 deletions(-)

diff --git a/automation/build/alpine/3.18-arm64v8.dockerfile b/automation/build/alpine/3.18-arm64v8.dockerfile
index 19fe46f8418f..b8482d5bf43f 100644
--- a/automation/build/alpine/3.18-arm64v8.dockerfile
+++ b/automation/build/alpine/3.18-arm64v8.dockerfile
@@ -48,3 +48,4 @@ RUN apk --no-cache add \
   # qubes test deps
   openssh-client \
   fakeroot \
+  expect \
diff --git a/automation/scripts/console.exp b/automation/scripts/console.exp
index 31ce97b91b63..d1689fa5bf7f 100755
--- a/automation/scripts/console.exp
+++ b/automation/scripts/console.exp
@@ -28,21 +28,34 @@ if {[info exists env(UBOOT_CMD)]} {
     send "$env(UBOOT_CMD)\r"
 }
 
+if {[info exists env(BOOT_MSG)]} {
+    expect -re "$env(BOOT_MSG)"
+}
+
+if {[info exists env(WAKEUP_CMD)]} {
+    expect -re "$env(SUSPEND_MSG)"
+
+    # keep it suspended a bit, then wakeup
+    sleep 30
+
+    system "$env(WAKEUP_CMD)"
+}
+
 if {[info exists env(LOG_MSG)]} {
     expect {
-        "$env(PASSED)" {
-            expect "$env(LOG_MSG)"
+        -re "$env(PASSED)" {
+            expect -re "$env(LOG_MSG)"
             exit 0
         }
-        "$env(LOG_MSG)" {
-            expect "$env(PASSED)"
+        -re "$env(LOG_MSG)" {
+            expect -re "$env(PASSED)"
             exit 0
         }
     }
 }
 
 expect {
-    "$env(PASSED)" {
+    -re "$env(PASSED)" {
         exit 0
     }
 }
diff --git a/automation/scripts/qubes-x86-64.sh b/automation/scripts/qubes-x86-64.sh
index 8e78b7984e98..0eac410f4168 100755
--- a/automation/scripts/qubes-x86-64.sh
+++ b/automation/scripts/qubes-x86-64.sh
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-set -ex
+set -ex -o pipefail
 
 # One of:
 #  - ""             PV dom0,  PVH domU
@@ -263,52 +263,26 @@ cp -f binaries/xen $TFTP/xen
 cp -f binaries/bzImage $TFTP/vmlinuz
 cp -f binaries/dom0-rootfs.cpio.gz $TFTP/initrd-dom0
 
-# start logging the serial; this gives interactive console, don't close its
-# stdin to not close it; the 'cat' is important, plain redirection would hang
-# until somebody opens the pipe; opening and closing the pipe is used to close
-# the console
-mkfifo /tmp/console-stdin
-cat /tmp/console-stdin |\
-ssh $CONTROLLER console | tee smoke.serial | sed 's/\r//' &
-
 # start the system pointing at gitlab-ci predefined config
 ssh $CONTROLLER gitlabci poweron
-trap "ssh $CONTROLLER poweroff; : > /tmp/console-stdin" EXIT
+trap "ssh $CONTROLLER poweroff" EXIT
 
 if [ -n "$wait_and_wakeup" ]; then
-    # wait for suspend or a timeout
-    until grep "$wait_and_wakeup" smoke.serial || [ $timeout -le 0 ]; do
-        sleep 1;
-        : $((--timeout))
-    done
-    if [ $timeout -le 0 ]; then
-        echo "ERROR: suspend timeout, aborting"
-        exit 1
-    fi
-    # keep it suspended a bit, then wakeup
-    sleep 30
-    ssh $CONTROLLER wake
+    export SUSPEND_MSG="$wait_and_wakeup"
+    export WAKEUP_CMD="ssh $CONTROLLER wake"
 fi
 
-set +x
-until grep "^Welcome to Alpine Linux" smoke.serial || [ $timeout -le 0 ]; do
-    sleep 1;
-    : $((--timeout))
-done
-set -x
-
-tail -n 100 smoke.serial
-
-if [ $timeout -le 0 ]; then
-    echo "ERROR: test timeout, aborting"
-    exit 1
-fi
+export PASSED="${passed}"
+export BOOT_MSG="Latest ChangeSet: "
+export LOG_MSG="\nWelcome to Alpine Linux"
+export TEST_CMD="ssh $CONTROLLER console"
+export TEST_LOG="smoke.serial"
+export TEST_TIMEOUT="$timeout"
+TEST_RESULT=0
+./automation/scripts/console.exp | sed 's/\r\+$//' || TEST_RESULT=$?
 
 if [ -n "$retrieve_xml" ]; then
     nc -w 10 "$SUT_ADDR" 8080 > tests-junit.xml </dev/null
 fi
 
-sleep 1
-
-(grep -q "^Welcome to Alpine Linux" smoke.serial && grep -q "${passed}" smoke.serial) || exit 1
-exit 0
+exit "$TEST_RESULT"
-- 
git-series 0.9.1
