Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command

Michael Roth Thu, 05 Jan 2012 07:06:27 -0800

On 01/05/2012 06:59 AM, Daniel P. Berrange wrote:

On Thu, Jan 05, 2012 at 10:37:14AM -0200, Luiz Capitulino wrote:

On Thu, 5 Jan 2012 10:16:30 +0000
"Daniel P. Berrange"<berra...@redhat.com>  wrote:

On Wed, Jan 04, 2012 at 05:45:11PM -0200, Luiz Capitulino wrote:

This version drops modes 'sleep' and 'hybrid' because they don't work
properly due to issues in qemu. Only the 'hibernate' mode is supported
for now.


IMHO this is short-sighted. When the bugs in QEMU are fixed so that
these modes work, you have needlessly put users in the situation where
they have to now upgrade the guest agent everywhere to take advantage
of the bugfix.


That was my thinking until v4. But after discussing with Michael the issues
we have with S3 I concluded that it doesn't make sense to offer an API to
something that doesn't work, this will just generate bug reports. Also,
updating to get new features is normal and expected.


This is assuming that users will always upgrade their VMs & hosts in
lock step, which I rather doubt they will in practice. eg imagine a
deployment might have a mixture of hosts, running QEMU 1.1 (broken S3)
and QEMU 1.2 (working S3). If they build VM disk images they will likely
use the QEMU GA from 1.2 for all their builds, even if many of them
will only run on QEMU 1.1 hosts. So you'll end up having 'sleep' and
'hybrid' commands available in the guest agent, even though the host
QEMU doesn't work properly.

So you *will* ultimately need to make sure that QEMU GA from 1.2, has
sensible behaviour when run on a QEMU 1.1 host.  If you don't address
this during 1.1, you may well find yourself in an un-winnable situation
for 1.2, where it is impossible to provide good behaviour on old hosts.

So IMHO we are better off in the long run, if we include all commands
right now, even though some don't work yet, and work to ensure we have
good error reporting behaviour for those that don't work.

As an example, if S3 is broken in current QEMU, then we should not be
advertizing S3 to the guest OS. This would allow 'pm-is-supported --suspend'
to return false, at which point the guest agent can send back a nice error
message 'Suspend is not supported on this host', instead of just having the
guest try to suspend & hang or worse.
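
A minimal sketch of the check described above, assuming the agent shells
out to pm-utils. Only 'pm-is-supported --suspend' is named in this
thread; the mapping of 'hibernate' and 'hybrid' to --hibernate and
--suspend-hybrid, and the helper names check_suspend_mode and
pm_is_supported, are assumptions made for illustration and are not the
agent's actual code.

```python
import subprocess

# Assumed mapping from the agent's suspend modes to pm-is-supported flags.
PM_FLAGS = {
    "hibernate": "--hibernate",
    "sleep": "--suspend",
    "hybrid": "--suspend-hybrid",
}


def pm_is_supported(flag):
    """Return True if `pm-is-supported <flag>` exits with status 0."""
    try:
        return subprocess.call(["pm-is-supported", flag]) == 0
    except OSError:
        # pm-utils is not installed: treat the mode as unsupported.
        return False


def check_suspend_mode(mode, probe=pm_is_supported):
    """Return None if `mode` looks usable, else an error string.

    `probe` is injectable so the logic can be exercised without pm-utils.
    """
    flag = PM_FLAGS.get(mode)
    if flag is None:
        return "Unknown suspend mode '%s'" % mode
    if not probe(flag):
        return "Suspend mode '%s' is not supported on this host" % mode
    return None
```

This only helps if the host stops advertising the broken state to the
guest, so that the probe really does return false; the sketch does not
detect the host QEMU version by itself.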

This still requires we're lockstep with host QEMU (ideally that would be
the case via push-deployment of the agent, exactly because of issues
like this. Or at least, it'd make the upgrade process painless). And
outside of that, I really cannot think of any nice way to check, from
the agent, that a host has required functionality for {this,an} RPC. Not
unless we forced a bi-directional capabilities negotiation sequence, and
I don't like the idea of injecting this kind of data into a guest.
libvirt could maybe filter the modes based on QEMU version, but that's
not the only consumer of the agent.

Really I think this is a case study for why push-deployment of agents is
the way to go. QEMU could query qemu-ga directly and generate an 'agent
update available' event that users/frontends can use to prompt an update
to the latest version. Then all the upgrade inertia involved with saving
code/features for subsequent agent versions is mostly gone, and we can
"do the right thing" with regard to broken functionality :)
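
A rough sketch of the host-side comparison behind such an 'agent update
available' event, assuming the host already knows the version string the
running agent reports and the version it could push. The function names
and the plain dotted-number version format are assumptions for
illustration; no such event exists yet.

```python
def parse_version(text):
    """Turn a dotted version such as '1.0.50' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def agent_update_available(agent_version, host_agent_version):
    """Return True if the host could push a newer agent than the guest runs.

    Both arguments are dotted numeric version strings (an assumption).
    """
    return parse_version(agent_version) < parse_version(host_agent_version)
```

Comparing tuples of integers rather than raw strings keeps '1.10' newer
than '1.2', which a plain string comparison would get wrong.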

Unfortunately that option isn't available yet. But it just seems wrong
to introduce something we know is broken, to the extent that even those
involved with its development aren't currently capable of testing it
fully. I mean, we know what the user expectations are, and it's not
that, unfortunately for us :( I'd be more open to it if the bug wasn't
so bad, but nuking your guest's working state every time you make the
mistake of hitting the pretty "sleep" button in virt-manager or whatever
is pretty bad.


Daniel
