On Thu, Jul 20, 2017 at 4:43 PM, Eduardo Habkost <ehabk...@redhat.com> wrote: > On Thu, Jul 20, 2017 at 03:14:28PM +0200, Amador Pahim wrote: >> On Thu, Jul 20, 2017 at 1:58 PM, Markus Armbruster <arm...@redhat.com> wrote: >> > Amador Pahim <apa...@redhat.com> writes: >> > >> >> When launching a VM, if an exception happens and the VM is not >> >> initiated, it is useful to see the qemu command line that was executed >> >> and the output of that command. >> >> >> >> Before the patch: >> >> >> >> >>> VM = qemu.QEMUMachine('../aarch64-softmmu/qemu-system-aarch64') >> >> >>> VM.launch() >> >> Traceback (most recent call last): >> >> File "<stdin>", line 1, in <module> >> >> File "qemu.py", line 137, in launch >> >> self._post_launch() >> >> File "qemu.py", line 121, in _post_launch >> >> self._qmp.accept() >> >> File "qmp/qmp.py", line 145, in accept >> >> self.__sock, _ = self.__sock.accept() >> >> File "/usr/lib64/python2.7/socket.py", line 206, in accept >> >> sock, addr = self._sock.accept() >> >> socket.timeout: timed out >> >> >> >> After the patch: >> >> >> >> >>> VM = qemu.QEMUMachine('../aarch64-softmmu/qemu-system-aarch64') >> >> >>> VM.launch() >> >> Traceback (most recent call last): >> >> File "<stdin>", line 1, in <module> >> >> File "qemu.py", line 156, in launch >> >> raise RuntimeError(msg) >> >> RuntimeError: Error launching VM. >> >> Original Exception: >> >> Traceback (most recent call last): >> >> File "qemu.py", line 138, in launch >> >> self._post_launch() >> >> File "qemu.py", line 122, in _post_launch >> >> self._qmp.accept() >> >> File "qmp/qmp.py", line 145, in accept >> >> self.__sock, _ = self.__sock.accept() >> >> File "/usr/lib64/python2.7/socket.py", line 206, in accept >> >> sock, addr = self._sock.accept() >> >> timeout: timed out >> >> Command: >> >> /usr/bin/qemu-system-aarch64 -chardev socket,id=mon, >> >> path=/var/tmp/qemu-23958-monitor.sock -mon chardev=mon,mode=control >> >> -display none -vga none >> >> Output: >> >> qemu-system-aarch64: No machine specified, and there is no default >> >> Use -machine help to list supported machines >> >> >> >> Also, if the launch() faces an exception, the 'except' now will use args >> >> to fill the debug information. So this patch assigns 'args' earlier, >> >> assuring it will be available for the 'except'. >> >> >> >> Signed-off-by: Amador Pahim <apa...@redhat.com> >> >> Reviewed-by: Fam Zheng <f...@redhat.com> >> >> --- >> >> scripts/qemu.py | 18 ++++++++++++++++-- >> >> 1 file changed, 16 insertions(+), 2 deletions(-) >> >> >> >> diff --git a/scripts/qemu.py b/scripts/qemu.py >> >> index f0fade32bd..2707ae7f75 100644 >> >> --- a/scripts/qemu.py >> >> +++ b/scripts/qemu.py >> >> @@ -18,6 +18,7 @@ import os >> >> import sys >> >> import subprocess >> >> import qmp.qmp >> >> +import traceback >> >> >> >> >> >> class QEMUMachine(object): >> >> @@ -129,17 +130,30 @@ class QEMUMachine(object): >> >> '''Launch the VM and establish a QMP connection''' >> >> devnull = open('/dev/null', 'rb') >> >> qemulog = open(self._qemu_log_path, 'wb') >> >> + args = self._wrapper + [self._binary] + self._base_args() + >> >> self.args >> >> try: >> >> self._pre_launch() >> >> - args = self._wrapper + [self._binary] + self._base_args() + >> >> self._args >> >> self._popen = subprocess.Popen(args, stdin=devnull, >> >> stdout=qemulog, >> >> stderr=subprocess.STDOUT, >> >> shell=False) >> >> self._post_launch() >> > >> > Lifting the assignment out of the try makes sense, because exceptions >> > there are programming errors. >> > >> > Why are ._pre_launch() and ._post_launch() inside the try? >> >> The _pre_launch()/_post_launch() are probably the main source of >> exceptions here and I assume we want to handle the exceptions coming >> from them. >> >> The call to subprocess.Popen() is unlikely to raise exceptions. I >> mean, it's easy to trigger an 'OSError' exception providing a >> non-existent command to Popen, but once we provide something >> executable, we will not see an exception here if the command fails. >> The CalledProcessError is only raised by subprocess.check_call() or >> subprocess.check_output(). >> >> > > That's correct. QEMU exiting won't cause an exception, it > doesn't matter the error code. Failing to connect to QMP or > qtest sockets on _post_launch() will. > > However, another purpose of the try/except block is to ensure we > clean up the process and log file in case _any_ exception happens > (programming error or not). That's why everything after > open(self._qemu_log_path) is inside the try block. We should be > careful to not break that. > > I think this code really needs to be cleaned up. The existing > try/except block has a huge number of responsibilities: > 1) Ensuring we kill the process in case _post_launch() raises an > exception. > 2) Ensuring we load the log file even if Popen() or _post_launch() > raise an exception. > 3) Ensuring we remove the monitor socket if anything after > _pre_launch() raises an exception (through _post_shutdown()) > 4) Ensuring the log file is deleted if anything after > open(_qemu_log_path) raises an exception (through > _post_shutdown()) > 5) Ensuring _post_shutdown() is called if anything after > _pre_launch() raises an exception (e.g. QEMUQTestMachine > deletes the socket created by _pre_launch() inside > _post_shutdown() > 6) (New, added by this patch) Generating a more useful exception > message if something (what exactly?) fails. > > Some of the cases above are just programming errors, but it's > still nice to do proper cleanup in case of programming errors. > e.g.: if _base_args() raise an exception, we can ensure we delete > the log file anyway (this patch breaks that). > > If we don't clean this up, future cleanups (e.g. moving > _pre_launch() outside the try block) would easily break one or > more responsibilities above, and this would be very easy to miss > on patch review.
Agreed. Working on that. > > >> > >> >> except: >> > >> > Why do we catch all exceptions, and not just >> > subprocess.CalledProcessError? > > See above. > >> > >> >> + self._load_io_log() >> >> if self.is_running(): >> > >> > I suspect we can't get here only for "odd" exceptions, not for >> > subprocess.CalledProcessError. > > It depends on what you mean by "odd". > >> > >> >> self._popen.kill() >> >> self._popen.wait() >> >> - self._load_io_log() >> >> + else: >> >> + exc_type, exc_value, exc_traceback = sys.exc_info() >> >> + msg = ('Error launching VM.\n' >> >> + 'Original Exception: \n%s' >> >> + 'Command:\n%s\n' >> >> + 'Output:\n%s\n' % >> >> + (''.join(traceback.format_exception(exc_type, >> >> + exc_value, >> >> + >> >> exc_traceback)), >> >> + ' '.join(args), >> >> + self._iolog)) >> >> + self._post_shutdown() >> >> + raise RuntimeError(msg) >> >> self._post_shutdown() >> >> raise >> > >> > Why do we print a traceback when subprocess.CalledProcessError? >> >> subprocess.CalledProcessError will not happen with subprocess.Popen, >> even if the command fails. But if the virtual machine does not start, >> we will, for example, see a 'socket.timeout' exception coming from >> _post_launch() when it tries to connect to the qemu monitor. > > Exactly. > > -- > Eduardo