On 14.11.2011 21:12, Anthony Liguori wrote:
> On 11/14/2011 02:11 PM, Kevin Wolf wrote:
>> On 14.11.2011 20:49, Anthony Liguori wrote:
>>> On 11/14/2011 01:46 PM, Juan Quintela wrote:
>>>> Anthony Liguori<aligu...@us.ibm.com> wrote:
>>>>> On 11/14/2011 07:11 AM, Juan Quintela wrote:
>>>>>>
>>>>>>> diff --git a/cpus.c b/cpus.c
>>>>>>> index 82530c4..ae5ec99 100644
>>>>>>> --- a/cpus.c
>>>>>>> +++ b/cpus.c
>>>>>>> @@ -398,6 +398,7 @@ static void do_vm_stop(RunState state)
>>>>>>>          vm_state_notify(0, state);
>>>>>>>          qemu_aio_flush();
>>>>>>>          bdrv_flush_all();
>>>>>>> +        bdrv_invalidate_cache_all();
>>>>>>>          monitor_protocol_event(QEVENT_STOP, NULL);
>>>>>>>      }
>>>>>>
>>>>>> This is too much. Reopening all qcow2 images each time that we stop the
>>>>>> vm looks excessive, no?
>>>>>
>>>>> This general code came in via:
>>>>>
>>>>> http://mid.gmane.org/cover.1290613959.git....@redhat.com
>>>>>
>>>>> That series made migration stable after issuing a stop operation. I
>>>>> believe the justification was for debugging purposes or something like
>>>>> that.
>>>>>
>>>>> At any rate, invalidating the cache is part of what's required to make
>>>>> things stable. If you look at something like cache=unsafe, the only
>>>>> way the metadata will get flushed is via a bdrv_close since bdrv_flush
>>>>> is a nop.
>>>>>
>>>>> So this is needed as long as we care about supporting this use-case.
>>>>
>>>> Then we need a "proper" qcow2 invalidate call. Doing in qemu toplevel:
>>>>
>>>> (qemu)stop
>>>>
>>>> And now all your qcow2 block devices are closed, or perhaps failing to
>>>> re-open() looks too much to me (TM).
>>>>
>>>> Kevin?
>>>
>>> Look closely at the patch. It doesn't actually close()/open() anything.
>>>
>>> It just invokes the bdrv_close() routine which calls the free functions
>>> on the l1/l2 caching functions. bdrv_open() doesn't actually open
>>> anything (it assumes the file is already open). It just reads the header
>>> and metadata over again.
>>>
>>> For something that's basically a hack, it turned out to work very
>>> cleanly :-)
>>
>> But why do we need to do it on stop?
>>
>> I don't think it makes even sense logically: bdrv_invalidate_cache()
>> means "throw all your caches away and refetch everything from disk".
>> What do we gain from doing this on stop? To some degree I could
>> understand if you did it on cont, so that you can modify an image on the
>> host while the VM is stopped (though I would still consider it criminal
>> :-)).
>
> Michael basically was trying to avoid having a VM's state change after you
> stopped the guest.
>
> With something like cache=unsafe that periodically flushes based on a
> timer (I think), you want to make sure that that doesn't happen after
> stop occurs.

This is a good point, but neither does cache=unsafe use a timer nor can I
see how invalidating the cache would avoid such behaviour. And throwing
away any unwritten changes doesn't really make it better.

Kevin
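
The concern about unwritten changes can be illustrated with a toy model. This is a sketch only, not QEMU code: `ToyImage` and every `toy_*` function are hypothetical names invented for the illustration. It assumes invalidation means what is described above ("throw all your caches away and refetch everything from disk") and that the flush is a nop in the cache=unsafe case, as stated earlier in the thread. What the real qcow2 driver does on bdrv_close() may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: NOT QEMU code. All names below are hypothetical. */
#define TOY_ENTRIES 4

typedef struct ToyImage {
    int disk[TOY_ENTRIES];    /* metadata as stored on disk */
    int cache[TOY_ENTRIES];   /* in-memory copy of the metadata */
    bool dirty[TOY_ENTRIES];  /* cached entry not yet written back */
    bool flush_is_nop;        /* assumption: models the cache=unsafe case */
} ToyImage;

/* Start with zeroed disk and cache contents. */
static void toy_open(ToyImage *img, bool flush_is_nop)
{
    memset(img, 0, sizeof(*img));
    img->flush_is_nop = flush_is_nop;
}

/* Change one metadata entry in the cache only and mark it dirty. */
static void toy_update(ToyImage *img, int idx, int value)
{
    assert(idx >= 0 && idx < TOY_ENTRIES);
    img->cache[idx] = value;
    img->dirty[idx] = true;
}

/* Write dirty entries back to disk, unless flushing is a nop. */
static void toy_flush(ToyImage *img)
{
    if (img->flush_is_nop) {
        return;
    }
    for (int i = 0; i < TOY_ENTRIES; i++) {
        if (img->dirty[i]) {
            img->disk[i] = img->cache[i];
            img->dirty[i] = false;
        }
    }
}

/* Drop the cached state and re-read from disk WITHOUT writing back. */
static void toy_invalidate_cache(ToyImage *img)
{
    memcpy(img->cache, img->disk, sizeof(img->cache));
    memset(img->dirty, 0, sizeof(img->dirty));
}
```

In this model, calling toy_update(), toy_flush() and toy_invalidate_cache() in that order keeps the update only when the flush really writes back. With flush_is_nop set, the update is silently lost, which is the outcome objected to above.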