On Tue, 12 Jan 2021 11:32:44 +0000 Alex Bennée <alex.ben...@linaro.org> wrote:
> Cornelia Huck <coh...@redhat.com> writes:
>
> > On Fri, 8 Jan 2021 19:56:45 +0100
> > Thomas Huth <th...@redhat.com> wrote:
> >
> >> There was a race condition in the first test where there was already the
> >> "crw" output in the dmesg, but the "0.0.4711" entry has not been created
> >> in the /sys fs yet. Fix it by waiting until it is there.
> >>
> >> The second test has even more problems on gitlab-CI. Even after adding some
> >> more synchronization points (that wait for some messages in the "dmesg"
> >> output to make sure that the modules got loaded correctly), there are still
> >> occasionally some hangs in this test when it is running in the gitlab-CI.
> >> So far I was unable to reproduce these hangs locally on my computer, so
> >> this issue might take a while to debug. Thus disable the 2nd test in the
> >> gitlab-CI until the problems are better understood and fixed.
> >>
> >> Signed-off-by: Thomas Huth <th...@redhat.com>
> >> ---
> >>  tests/acceptance/machine_s390_ccw_virtio.py | 14 ++++++++++++--
> >>  1 file changed, 12 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/tests/acceptance/machine_s390_ccw_virtio.py b/tests/acceptance/machine_s390_ccw_virtio.py
> >> index eccf26b262..4028c99afc 100644
> >> --- a/tests/acceptance/machine_s390_ccw_virtio.py
> >> +++ b/tests/acceptance/machine_s390_ccw_virtio.py
> >> @@ -12,6 +12,7 @@
> >>  import os
> >>  import tempfile
> >>
> >> +from avocado import skipIf
> >>  from avocado_qemu import Test
> >>  from avocado_qemu import exec_command_and_wait_for_pattern
> >>  from avocado_qemu import wait_for_console_pattern
> >> @@ -133,8 +134,10 @@ class S390CCWVirtioMachine(Test):
> >>          self.vm.command('device_add', driver='virtio-net-ccw',
> >>                          devno='fe.0.4711', id='net_4711')
> >>          self.wait_for_crw_reports()
> >> -        exec_command_and_wait_for_pattern(self, 'ls /sys/bus/ccw/devices/',
> >> -                                          '0.0.4711')
> >> +        exec_command_and_wait_for_pattern(self, 'for i in 1 2 3 4 5 6 7 ; do '
> >> +                    'if [ -e /sys/bus/ccw/devices/*4711 ]; then break; fi ;'
> >> +                    'sleep 1 ; done ; ls /sys/bus/ccw/devices/',
> >> +                    '0.0.4711')
> >
> > I'm wondering whether we should introduce a generic helper function for
> > "execute command repeatedly, if the expected result did not yet show
> > up", or "wait for a file/directory to exist". It's probably not
> > uncommon for a desired outcome to arrive asynchronously, and having a
> > function for waiting/retrying could be handy.
>
> We don't really want to encourage fragile shell scripts in the guest so
> something that makes it easy to encode these loops in python. Currently
> the _console_interaction helper fails the test if failure_message is
> seen so I guess we need a slightly more liberal interaction which
> accepts a command can fail so we can write something like:
>
>     while True:
>         if exec_command_and_check(self, "stat -t /sys/bus/ccw/devices/0.0.4711",
>                                   "/sys/bus/ccw/devices/0.0.4711"):
>             break
>
> ?

Yes, something like that. The caller can decide whether they want to
limit retries.

>
> >
> >>          # and detach it again
> >>          self.clear_guest_dmesg()
> >>          self.vm.command('device_del', id='net_4711')
> >
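
To illustrate the "caller decides the retry limit" idea, here is a minimal
sketch in plain Python. The name retry_until and its parameters are invented
for this example and are not part of avocado_qemu; exec_command_and_check in
the usage comment is only the helper proposed above and does not exist yet.

```python
import time


def retry_until(check, attempts=7, delay=1.0, sleep=time.sleep):
    """Call check() until it returns a truthy value or attempts run out.

    Hypothetical helper, not existing avocado_qemu API.
    Returns True on success, False once all attempts have failed.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return True
        # Only wait if there is another attempt left.
        if attempt < attempts:
            sleep(delay)
    return False


# Possible use inside a test method, assuming the proposed (not yet
# existing) exec_command_and_check() returns True when the pattern is seen:
#
#     found = retry_until(lambda: exec_command_and_check(
#         self, "stat -t /sys/bus/ccw/devices/0.0.4711",
#         "/sys/bus/ccw/devices/0.0.4711"))
#     self.assertTrue(found)
```

Defaulting to 7 attempts with a one second delay mirrors the shell loop in
the patch; passing the sleep function in keeps the helper easy to unit-test.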