On 7/14/20 8:13 PM, John Snow wrote: > > > On 7/14/20 12:13 AM, Cleber Rosa wrote: >> On Fri, Jul 10, 2020 at 01:06:47AM -0400, John Snow wrote: >>> This is done primarily to avoid the 'bare except' pattern, which >>> suppresses all exceptions during shutdown and can obscure errors. >>> >>> Replace this with a pattern that isolates the different kind of shutdown >>> paradigms (_hard_shutdown and _soft_shutdown), and a new fallback shutdown >>> handler (_do_shutdown) that gracefully attempts one before the other. >>> >>> This split now also ensures that no matter what happens, >>> _post_shutdown() is always invoked. >>> >>> shutdown() changes in behavior such that if it attempts to do a graceful >>> shutdown and is unable to, it will now always raise an exception to >>> indicate this. This can be avoided by the test writer in three ways: >>> >>> 1. If the VM is expected to have already exited or is in the process of >>> exiting, wait() can be used instead of shutdown() to clean up resources >>> instead. This helps avoid race conditions in shutdown. >>> >>> 2. If a test writer is expecting graceful shutdown to fail, shutdown >>> should be called in a try...except block. >>> >>> 3. If the test writer has no interest in performing a graceful shutdown >>> at all, kill() can be used instead. >>> >>> >>> Handling shutdown in this way makes it much more explicit which type of >>> shutdown we want and allows the library to report problems with this >>> process. >>> >>> Signed-off-by: John Snow <js...@redhat.com> >>> --- >>> python/qemu/machine.py | 95 +++++++++++++++++++++++++++++++++++------- >>> 1 file changed, 80 insertions(+), 15 deletions(-) >>> >>> diff --git a/python/qemu/machine.py b/python/qemu/machine.py >>> index aaa173f046..b24ce8a268 100644 >>> --- a/python/qemu/machine.py >>> +++ b/python/qemu/machine.py >>> @@ -48,6 +48,12 @@ class QEMUMachineAddDeviceError(QEMUMachineError): >>> """ >>> >>> >>> +class AbnormalShutdown(QEMUMachineError): >>> + """ >>> + Exception raised when a graceful shutdown was requested, but not >>> performed. >>> + """ >>> + >>> + >>> class MonitorResponseError(qmp.QMPError): >>> """ >>> Represents erroneous QMP monitor reply >>> @@ -365,6 +371,7 @@ def _early_cleanup(self) -> None: >>> """ >>> Perform any cleanup that needs to happen before the VM exits. >>> >>> + May be invoked by both soft and hard shutdown in failover >>> scenarios. >>> Called additionally by _post_shutdown for comprehensive cleanup. >>> """ >>> # If we keep the console socket open, we may deadlock waiting >>> @@ -374,32 +381,90 @@ def _early_cleanup(self) -> None: >>> self._console_socket.close() >>> self._console_socket = None >>> >>> + def _hard_shutdown(self) -> None: >>> + """ >>> + Perform early cleanup, kill the VM, and wait for it to terminate. >>> + >>> + :raise subprocess.Timeout: When timeout is exceeds 60 seconds >>> + waiting for the QEMU process to terminate. >>> + """ >>> + self._early_cleanup() >> >> Like I commented on patch 5, I don't think the *current* type of >> cleanup done is needed on a scenario like this... >> >>> + self._popen.kill() >> >> ... as I don't remember QEMU's SIGKILL handler to be susceptible to >> the race condition that motivated the closing of the console file in >> the first place. But, I also can not prove it's not susceptible at >> this time. >> > > It probably isn't. It was for consistency in when and where that hook is > called, again. It does happen to be "pointless" here. > >> Note: I have some old patches that added tests for QEMUMachine itself. >> I intend to respin them on top of your work, so we may have a clearer >> understanding of the QEMU behaviors we need to handle. So, feel free >> to take the prudent route here, and keep the early cleanup. >> > > Oh, adding formal tests to this folder would be incredible; especially > if we wanted to package it on PyPI. Basically a necessity. > >> Reviewed-by: Cleber Rosa <cr...@redhat.com> >> Tested-by: Cleber Rosa <cr...@redhat.com> >> >
Queuing with this hunk from jsnow: -- >8 -- @@ -455,6 +444,9 @@ def shutdown(self, has_quit: bool = False, Terminate the VM (gracefully if possible) and perform cleanup. Cleanup will always be performed. + If the VM has not yet been launched, or shutdown(), wait(), or kill() + have already been called, this method does nothing. + :param has_quit: When true, do not attempt to issue 'quit' QMP command. :param hard: When true, do not attempt graceful shutdown, and suppress the SIGKILL warning log message. ---