On 13.09.2010 21:29, Stefan Hajnoczi wrote:
> On Mon, Sep 13, 2010 at 3:13 PM, Kevin Wolf <kw...@redhat.com> wrote:
>> On 13.09.2010 15:42, Anthony Liguori wrote:
>>> On 09/13/2010 08:39 AM, Kevin Wolf wrote:
>>>>> Yeah, one of the key design points of live migration is to minimize the
>>>>> number of failure scenarios where you lose a VM. If someone typed the
>>>>> wrong command line or shared storage hasn't been mounted yet and we
>>>>> delay failure until live migration is in the critical path, that would
>>>>> be terribly unfortunate.
>>>>>
>>>> We would catch most of them if we try to open the image when migration
>>>> starts and immediately close it again until migration is (almost)
>>>> completed, so that no other code can possibly use it before the source
>>>> has really closed it.
>>>>
>>>
>>> I think the only real advantage is that we fix NFS migration, right?
>>
>> That's the one that we know about, yes.
>>
>> The rest is not a specific scenario, but a strong feeling that having an
>> image opened twice at the same time feels dangerous. As soon as an
>> open/close sequence writes to the image for some format, we probably
>> have a bug. For example, what about this mounted flag that you were
>> discussing for QED?
>
> There is some room left to work in, even if we can't check in open().
> One idea would be to do the check asynchronously once I/O begins. It
> is actually easy to check L1/L2 tables as they are loaded.
>
> The only barrier relationship between I/O and checking is that an
> allocating write (which will need to update L1/L2 tables) is only
> allowed after check completes. Otherwise reads and non-allocating
> writes may proceed while the image is not yet fully checked. We can
> detect when a table element is an invalid offset and discard it.

I'm not even talking about such complicated things. You wanted to have a
dirty flag in the header, right? So when we allow opening an image
twice, you get this sequence with migration:

Source: open
Destination: open (with dirty image)
Source: close

The image is now marked as clean, even though the destination is still
working on it.

Kevin
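
For illustration, the three-step sequence above can be modelled with a small Python sketch. It is only a toy model under stated assumptions: the names `ImageHeader`, `ImageHandle` and `migration_sequence` are hypothetical and are not QEMU or QED code, and it assumes the naive behaviour that open always sets the dirty flag and close always clears it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ImageHeader:
    """Stand-in for the on-disk header shared by every opener (hypothetical)."""
    dirty: bool = False


class ImageHandle:
    """One process's view of the image; not a real QEMU or QED API."""

    def __init__(self, name: str, header: ImageHeader) -> None:
        self.name = name
        self.header = header
        # Record what this opener saw: a set flag would normally be read
        # as "not closed cleanly, a consistency check is needed".
        self.saw_dirty_on_open = header.dirty
        header.dirty = True  # mark the image as in use

    def close(self) -> None:
        # Assumed naive behaviour: close always marks the image clean,
        # without knowing whether another opener is still active.
        self.header.dirty = False


def migration_sequence() -> Tuple[bool, bool]:
    """Replay the open/open/close order from the migration scenario."""
    header = ImageHeader()
    source = ImageHandle("source", header)            # Source: open
    destination = ImageHandle("destination", header)  # Destination: open
    source.close()                                    # Source: close
    # The destination still has the image open, yet the flag reads clean.
    return destination.saw_dirty_on_open, header.dirty
```

Calling `migration_sequence()` returns `(True, False)` in this model: the destination opened an image that was already flagged dirty, and after the source closes, the header claims the image is clean while the destination is still using it.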