On 2011-10-05 17:43, Avi Kivity wrote: > On 10/05/2011 04:37 PM, Luiz Capitulino wrote: >> Now, we have three options to fix this but I don't know which one to >> choose: >> >> 1. We could just add the transition RSTATE_PAUSED -> >> RSTATE_POST_MIGRATE >> as valid. Not sure this is a good thing to do though, as it seems >> a silly >> workaround for the fact that the transition to RSTATE_PRE_MIGRATE >> has >> never occurred >> >> 2. This patch makes vm_stop() do the state transition even if the VM >> is already stopped. Seems good enough, except that I fear two >> things. >> First, today we know that vm_stop() is a no-op if the VM is already >> stopped, so there's a semantic change that could turn out to be >> trap. >> Second, I also fear people using vm_stop() as a way to change the >> VM status, just like runstate_set() (which can also become an >> horrible >> trap) >> >> 3. Avi suggested we should keep a reference count, so that states are >> not discarded: >> >> http://lists.gnu.org/archive/html/qemu-devel/2011-08/msg00595.html >> >> That solution seemed to be the perfect one, except for one important >> detail: how should we implement vm_start() (and thus 'cont')? >> >> In order to maintain how we behave with the external world, the only >> option is that vm_start() will set the stop count to 0. Ie, doesn't >> matter if we have stopped the VM 500 times at some point, a >> vm_start() >> call will discard all stored states. >> >> Not sure if that's what you expected, but the first time I read >> Avi's >> idea I had the impression that it would be a good idea that >> vm_start() >> decremented the ref count only once, ie. vm_stop() and vm_start() >> calls >> have to match. >> > > vm_start() should be symmetric with vm_stop(). That is, if a piece of > code wants to execute with vcpus stopped, it should just run inside a > stop/start pair. > > The only confusion can come from the user, if he sees multiple stop > events and expects that just one cont will continue the vm. For the > machine monitor, we should just document that the you have to issue one > cont for every stop event you see (plus any stops you issue). It's not > unnatural - the code that handles a stop_due_to_enospace can work to fix > the error and issue a cont, disregarding any other stops in progress > (due to a user pressing the stop button, or migration, or cpu hotplug, > or whatever). For the human monitor, it's not so intuitive, but the > situation is so rare we can just rely on the user to issue cont again.
Making this kind of user-visible change would be a bad idea. We are talking about multiple stop states here, but only a single function (vm_stop) to enter them - maybe that's not optimal. But the point is that we were missing one stop-to-stop transition. And that needs to be fixed, either inside vm_stop or when it is called. If you want to lock the VM into paused state, add a new symmetric API that does precisely this. That API would send the VM into RSTATE_LOCKED if it is not yet stopped on lock or is still locked on resume. That would avoid redefining stop states that have no use for lock-counting semantics. Jan
signature.asc
Description: OpenPGP digital signature