On 23.12.2015 at 11:47, Daniel P. Berrange wrote:
> On Wed, Dec 23, 2015 at 11:14:12AM +0800, Fam Zheng wrote:
> > On Tue, 12/22 17:46, Kevin Wolf wrote:
> > > Enough innocent images have died because users called 'qemu-img snapshot'
> > > while the VM was still running. Educating the users doesn't seem to be a
> > > working strategy, so this series adds locking to qcow2 that refuses to
> > > access the image read-write from two processes.
> > >
> > > Eric, this will require a libvirt update to deal with qemu crashes which
> > > leave locked images behind. The simplest thinkable way would be to
> > > unconditionally override the lock in libvirt whenever the option is
> > > present. In that case, libvirt VMs would be protected against concurrent
> > > non-libvirt accesses, but not the other way round. If you want more than
> > > that, libvirt would have to check somehow if it was its own VM that used
> > > the image and left the lock behind. I imagine that can't be too hard
> > > either.
> >
> > The motivation is great, but I'm not sure I like the side-effect that an
> > unclean shutdown will require a "forced" open, because it makes using qcow2
> > in development cumbersome, and like you said, management/user also needs to
> > handle this explicitly. This is a bit of a personal preference, but it's
> > strong enough that I want to speak up.
>
> Yeah, I am also not really a big fan of locking mechanisms which are not
> automatically cleaned up on process exit. On the other hand you could
> say that people who choose to run qemu-img manually are already taking
> fate into their own hands, and ending up with a dirty image on unclean
> exit is still miles better than losing all your data.
>
> > As an alternative, can we introduce .bdrv_flock() in protocol drivers, with
> > similar semantics to flock(2) or lockf(3)? That way all formats can benefit,
> > and a program crash will automatically drop the lock.
>
> FWIW, the libvirt locking daemon (virtlockd) will already attempt to take
> out locks using fcntl()/lockf() on all disk images associated with a VM.
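
For reference, flock() based locking of the kind Fam suggests would be plain
advisory whole-file locking, roughly along the lines of this standalone
sketch (not QEMU code; everything here is simplified and the names are made
up just for illustration):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Try to take an exclusive advisory lock on an image file, roughly the way
 * a .bdrv_flock() callback in the file protocol driver might do it. */
int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <image>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        fprintf(stderr, "open %s: %s\n", argv[1], strerror(errno));
        return 1;
    }

    /* Non-blocking exclusive lock; it is dropped automatically when the
     * process exits, so a crash cannot leave a stale lock behind. */
    if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
        fprintf(stderr, "flock %s: %s\n", argv[1], strerror(errno));
        close(fd);
        return 1;
    }

    printf("got the exclusive lock, press Ctrl-C to release it\n");
    pause();

    close(fd);
    return 0;
}

Running something like this against an image of a VM that libvirt manages
would show whether virtlockd's fcntl()/lockf() locks and a QEMU-side flock()
get in each other's way, which is really my question: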
Does this actually mean that if QEMU did try to use flock(), it would fail
because libvirt is already holding the lock? I considered adding both locking
schemes (the qcow2 flag for qcow2 on any backend; flock() for anything else
on local files), but if this is true, that's game over for any flock() based
patches.

Kevin