On Wed, Jan 20, 2021 at 12:28:14AM -0500, Alejandro Jimenez wrote: > > > On 1/19/2021 4:34 PM, Peter Maydell wrote: > > On Tue, 15 Dec 2020 at 18:11, Paolo Bonzini <pbonz...@redhat.com> wrote: > > > From: Alejandro Jimenez <alejandro.j.jime...@oracle.com> > > > > > > The current default action of pausing a guest after a panic event > > > is received leaves the responsibility to resume guest execution to the > > > management layer. The reasons for this behavior are discussed here: > > > https://lore.kernel.org/qemu-devel/52148f88.5000...@redhat.com/ > > > > > > However, in instances like the case of older guests (Linux and > > > Windows) using a pvpanic device but missing support for the > > > PVPANIC_CRASHLOADED event, and Windows guests using the hv-crash > > > enlightenment, it is desirable to allow the guests to continue > > > running after sending a PVPANIC_PANICKED event. This allows such > > > guests to proceed to capture a crash dump and automatically reboot > > > without intervention of a management layer. > > > > > > Add an option to avoid stopping a VM after a panic event is received, > > > by passing: > > > > > > -action panic=none > > > > > > in the command line arguments, or during runtime by using an upcoming > > > QMP command. > > Hi. This commit message doesn't say it's changing the default > > action, but the change does: > > > > > @@ -3899,6 +3899,8 @@ DEF("action", HAS_ARG, QEMU_OPTION_action, > > > " action when guest reboots [default=none]\n" > > > "-action shutdown=poweroff|pause\n" > > > " action when guest shuts down > > > [default=poweroff]\n" > > > + "-action panic=poweroff|pause|none\n" > > > + " action when guest panics [default=poweroff]\n" > > > "-action > > > watchdog=reset|shutdown|poweroff|inject-nmi|pause|debug|none\n" > > > " action when watchdog fires [default=reset]\n", > > > QEMU_ARCH_ALL) > > > RebootAction reboot_action = REBOOT_ACTION_NONE; > > > ShutdownAction shutdown_action = SHUTDOWN_ACTION_POWEROFF; > > > +PanicAction panic_action = PANIC_ACTION_POWEROFF; > > We used to default to 'pause' and now we default to 'poweroff'. > Hi Peter. > > My rationale for setting the panic action to 'poweroff' was to keep the > default behavior of QEMU when '-no-shutdown' is not specified, and a panic > occurs. I believe that in order to accomplish that, the default panic action > should still be 'poweroff', but as you point out there is an instance where > the behavior changes. Specifically, when '-no-shutdown' is not used there is > now one fewer QMP event issued when a guest panic is detected, before > stopping the VM and powering off. > > I tried to account for this scenario in the original patches, but I failed > to catch the problem after the rebase when the changes were merged. I'll > test and send a fix for this issue in the next few days. > > > > > We noticed this because it broke an in-flight test case for > > the pvpanic-pci device from Mihai (which was expecting to see > > the device in 'pause' state and found it was now in 'poweroff'). > The test is just checking for the arrival of the QMP event, and not actually > expecting the VM to be paused, correct? I see that if a test/management app > is expecting to receive a GUEST_PANICKED event with the specific 'pause' > action, then it might be confused. But any such tests would only be able to > check for the arrival of the QMP event, and not actually expect to issue any > commands to a paused VM, since the next block of code in QEMU immediately > powers off and shutdowns when '-no-shutdown' is not requested. This was the > typical behavior before the patches. > > > Test cases aren't very exciting, but was it really intentional > > to change the default behaviour? > My intention was to preserve the default behavior. Perhaps Paolo wanted to > reduce the number of GUEST_PANICKED events by removing the one with 'pause' > action? You could consider it superfluous since it is immediately followed > by another indicating the 'poweroff' action... Unless I hear otherwise from > either of you, I'll work on a fix to keep the same number and type of events > sent.
Why would pause be immediately followed by poweroff ? These are independant actions, and the mgmt app should be deciding what todo next after the pause action. It may wish to capture a guest crash image before poweroff. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|