On Wed, Aug 21, 2013 at 11:59:36AM +0200, Paolo Bonzini wrote:
> On 21/08/2013 11:42, Michael S. Tsirkin wrote:
> > On Wed, Aug 21, 2013 at 10:18:23AM +0200, Paolo Bonzini wrote:
> >> On 21/08/2013 10:03, Marcel Apfelbaum wrote:
> >>> On Wed, 2013-08-14 at 10:02 +0300, Ronen Hod wrote:
> >>>> How about adding a flag that tells QEMU whether to pause or reboot
> >>>> the guest after the panic?
> >>>> We cannot assume that we always have a management layer that takes
> >>>> care of this.
> >>>> One example is Microsoft's WHQL that deliberately generates a BSOD,
> >>>> and then examines the dump files.
> >>> After this patch the pvpanic is not part of the global devices anymore
> >>> so just don't enable it if you want to reboot on BSOD.
> >>> In my opinion "reboot after panic" equals "run without pvpanic device"
> >>
> >> This is not entirely possible, since "reboot after panic" is a guest
> >> setting while "run without pvpanic device" is a host setting (that the
> >> guest administrator may not even have access to: Ronen's case is a good
> >> example of this, because the "administrator" there is the WHQL harness).
> >>
> >> However, I think this is a driver problem.  The driver should just probe
> >> the "reboot after panic" setting and not issue the outb to the pvpanic
> >> port.
> >
> > This might or might not be possible on different OS-es.
> > What exactly is gained by doing vmstop on outb of pvpanic?
>
> Because events are edge-triggered, and can be lost if management dies at
> the wrong time, each event that QEMU sends must go together with a way
> for management to poll the state.
>
> For panic, the way to poll the state is "info status".  This matches
> what we do for watchdogs, for example.
> Management can issue "info
> status" to learn of the panic state, even if it happens while management
> itself is not running:
>
>     libvirtd                  QEMU                    guest
>     ---------------------------------------------------------------
>     stops
>                                                       <- pvpanic outb
>                               emits panic event
>                               (no one receives it)
>     starts
>     info status ->
>                               <- PANICKED
>
>
> Because there is only one running state, this means the VM has to be
> stopped.
>
> But actually, fixing the driver would only be required if pvpanic were
> mandatory.
>
> Now that pvpanic is optional, "reboot after panic" can also be fixed in
> libvirt.  Let's remove the "must reset after panic" limitation; then,
> libvirt can simply do itself a "continue" after receiving the panicked
> event (or after seeing that the guest is in panicked state).  The
> panicked event will never be sent unless management explicitly requests
> it (with "-device pvpanic"), so backwards compatibility is preserved.
>
> The pause will still happen if management was stopped, but that's a fair
> compromise IMHO.
>
> It will mean also that "reboot after panic" will be broken in 1.6.0,
> unfortunately.  Perhaps we can have a quick 1.6.1 release with this patch:
>
> diff --git a/vl.c b/vl.c
> index 25b8f2f..25e890a 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -685,8 +685,7 @@ int runstate_is_running(void)
>  bool runstate_needs_reset(void)
>  {
>      return runstate_check(RUN_STATE_INTERNAL_ERROR) ||
> -        runstate_check(RUN_STATE_SHUTDOWN) ||
> -        runstate_check(RUN_STATE_GUEST_PANICKED);
> +        runstate_check(RUN_STATE_SHUTDOWN);
>  }
> 
>  StatusInfo *qmp_query_status(Error **errp)
>
>
> By the way, this means two things:
>
> - I am now sold on the idea that explicitly enabling of pvpanic is the
>   right thing to do;
>
> - on the other hand this is the proof that the change was not fully
>   understood, and rushing it in 1.6 was the wrong thing to do.
>
> Paolo

You mean 1.5.  pvpanic was a builtin in 1.5 and that was clearly the
wrong thing to do.  We fixed that in 1.6, thankfully.

> > We want a notification about the panic but
> > adding yet another way to halt seems kind of useless.
> > Why not let VM continue? If it wants to stop it
> > can always call halt.
>
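
For illustration only, the argument quoted above (an edge-triggered event
can be lost, so it must be paired with a state that management can poll,
and the proposed patch lets management "continue" out of the panicked
state) can be sketched in C roughly as follows.  This is a hypothetical
toy model, not QEMU code: the Vm struct, the STATE_* values and every
vm_* function are invented for this sketch.  Only the shape of
vm_needs_reset() is taken from the patched runstate_needs_reset() in the
diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical run states; loosely modelled on the names in the diff. */
typedef enum {
    STATE_RUNNING,
    STATE_SHUTDOWN,
    STATE_INTERNAL_ERROR,
    STATE_GUEST_PANICKED
} RunState;

/* Hypothetical toy VM: one latched state plus an event counter. */
typedef struct {
    RunState state;
    bool listener_connected;  /* is management attached right now? */
    int events_delivered;     /* panic events management actually saw */
} Vm;

/* Guest signals a panic: the event is edge-triggered (delivered only if
 * someone is listening), while the state is latched and stays readable. */
void vm_guest_panicked(Vm *vm)
{
    if (vm->listener_connected) {
        vm->events_delivered++;
    }
    vm->state = STATE_GUEST_PANICKED;
}

/* Polling path, playing the role that "info status" plays in the thread. */
RunState vm_query_status(const Vm *vm)
{
    return vm->state;
}

/* Same shape as the patched check: the panicked state is no longer listed. */
bool vm_needs_reset(const Vm *vm)
{
    return vm->state == STATE_INTERNAL_ERROR ||
           vm->state == STATE_SHUTDOWN;
}

/* "continue": refused only for states that still require a reset. */
bool vm_continue(Vm *vm)
{
    if (vm_needs_reset(vm)) {
        return false;
    }
    vm->state = STATE_RUNNING;
    return true;
}

/* The scenario from the diagram: management is down when the guest panics,
 * comes back, polls the state, and resumes the guest. */
RunState run_lost_event_scenario(void)
{
    Vm vm = { STATE_RUNNING, false, 0 };

    vm_guest_panicked(&vm);          /* event is lost, state is latched */
    vm.listener_connected = true;    /* management restarts */

    assert(vm.events_delivered == 0);
    if (vm_query_status(&vm) == STATE_GUEST_PANICKED) {
        vm_continue(&vm);            /* allowed once panic needs no reset */
    }
    return vm.state;
}
```

In this model, putting STATE_GUEST_PANICKED back into vm_needs_reset()
would make vm_continue() fail, which corresponds to the "must reset after
panic" limitation that the patch removes.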