On Wed, Jul 19, 2017 at 03:34:47PM -0300, Eduardo Habkost wrote:
> On Wed, Jul 19, 2017 at 06:31:06PM +0200, Amador Pahim wrote:
> > Current implementation is broken. It does not really test if the child
> > process is running.
> >
> > The Popen.returncode will only be set after by a poll(), wait() or
> > communicate(). If the Popen fails to launch a VM, the Popen.returncode
> > will not turn to None by itself.
> >
> > Instead of using Popen.returncode, let's use Popen.poll(), which
> > actually checks if child process has terminated.
> >
> > Signed-off-by: Amador Pahim <apa...@redhat.com>
>
> I vaguely remember I had a version of that code using poll() and
> it broke scripts for some reason.  I will try to find out why, so
> we can either fix the script or document the reason why poll()
> isn't a good choice here.

Thanks to git reflog, I found the original "fix" I had in my WIP tree:

  251fc73 work/device-crash-script@{71}: commit: fixup! qemu.py: Don't set _popen=None on error/shutdown

  diff --git a/scripts/qemu.py b/scripts/qemu.py
  index 4dae811..cbc9e2a 100644
  --- a/scripts/qemu.py
  +++ b/scripts/qemu.py
  @@ -86,7 +86,7 @@ class QEMUMachine(object):
               raise

       def is_running(self):
  -        return self._popen and (self._popen.poll() is None)
  +        return self._popen and (self._popen.returncode is None)

       def exitcode(self):
           if self._popen:
  @@ -137,6 +137,7 @@ class QEMUMachine(object):
           except:
               if self.is_running():
                   self._popen.kill()
  +                self._popen.wait()
               self._load_io_log()
               self._post_shutdown()
               raise

The original bug was like this: if the QEMU process took a little
longer to actually terminate after self._popen.kill() was called,
the post-shutdown code inside shutdown() was triggered (because
is_running() was still True), causing the following exception:

  Traceback (most recent call last):
    File "./scripts/device-crash-test.py", line 528, in <module>
      sys.exit(main())
    File "./scripts/device-crash-test.py", line 487, in main
      f = checkOneCase(args, t)
    File "./scripts/device-crash-test.py", line 320, in checkOneCase
      vm.shutdown()
    File "/home/ehabkost/rh/proj/virt/qemu/scripts/qemu.py", line 156, in shutdown
      self._load_io_log()
    File "/home/ehabkost/rh/proj/virt/qemu/scripts/qemu.py", line 101, in _load_io_log
      with open(self._qemu_log_path, "r") as fh:
  IOError: [Errno 2] No such file or directory: '/var/tmp/qemu-23568.log'

My fix was incorrect: the actual bug was the missing
self._popen.wait() call after self._popen.kill(), not the
self._popen.poll() call.

Your fix looks good and device-crash-test is not crashing.

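For the record, here is a minimal standalone sketch of the two behaviours discussed above. It is not the qemu.py code: spawn() and the other helper names are made up for illustration, a Python one-liner stands in for the QEMU process, and the checks use "is not None" rather than the bare "self._popen and ..." form.

```python
import subprocess
import sys


def is_running_by_returncode(popen):
    """Old-style check: reads the cached attribute, never asks the OS."""
    return popen is not None and popen.returncode is None


def is_running_by_poll(popen):
    """New-style check: poll() reaps an exited child and updates returncode."""
    return popen is not None and popen.poll() is None


def kill_and_reap(popen):
    """kill() only sends the signal; wait() blocks until the child is gone."""
    popen.kill()
    popen.wait()


def spawn(code):
    """Hypothetical helper: run a Python one-liner as a stand-in for QEMU."""
    return subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE)


if __name__ == "__main__":
    # A child that exits right away, like a VM that failed to launch.
    failed = spawn("import sys; sys.exit(3)")
    failed.stdout.read()  # returns at EOF, once the child has closed stdout
    # Nothing has called poll()/wait()/communicate() yet, so the cached
    # returncode is still None and the old check reports "running".
    print(is_running_by_returncode(failed))  # True
    failed.wait()
    print(is_running_by_poll(failed))        # False
    failed.stdout.close()

    # A long-running child: after kill() + wait() it is reliably gone.
    vm = spawn("import time; time.sleep(60)")
    kill_and_reap(vm)
    print(is_running_by_poll(vm))            # False
    vm.stdout.close()
```

Without the wait() in kill_and_reap(), a poll()-based check made right after kill() may still see the child as alive, which is the race described above.
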
Reviewed-by: Eduardo Habkost <ehabk...@redhat.com>

>
> > ---
> >  scripts/qemu.py | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/scripts/qemu.py b/scripts/qemu.py
> > index 880e3e8219..f0fade32bd 100644
> > --- a/scripts/qemu.py
> > +++ b/scripts/qemu.py
> > @@ -86,7 +86,7 @@ class QEMUMachine(object):
> >              raise
> >
> >      def is_running(self):
> > -        return self._popen and (self._popen.returncode is None)
> > +        return self._popen and (self._popen.poll() is None)
> >
> >      def exitcode(self):
> >          if self._popen is None:
> > --
> > 2.13.3
> >
>
> --
> Eduardo

--
Eduardo