On Wed, 03/01 10:49, Kevin Wolf wrote:
> On 01.03.2017 at 09:15, Fam Zheng wrote:
> > In an ideal world, read-write access to an image is inherently
> > exclusive, because we cannot guarantee other readers and writers a
> > consistent view of the whole image at all points. That's what the
> > current permission system does, and it is okay as long as it is entirely
> > internal. But that would change with the coming image locking. In
> > practice, both end users and our test cases use tools like qemu-img and
> > qemu-io to peek at images while the guest is running.
> >
> > Relax a bit and accept other writers in this case.
> >
> > Signed-off-by: Fam Zheng <f...@redhat.com>
>
> Hm. On the one hand I think you're right that people are using things
> like this, on the other hand it's also true that the result might not be
> consistent and therefore image locking is right about forbidding these
> actions.
>
> I think your patch is too permissive, it allows even launching a second
> long-running VM on the image, which will definitely see corrupted data
> sooner or later.
>
> Maybe what we can do is allow shared writers for read-only images if
> CONSISTENT_READ isn't requested. It's still not 100% correct because we
> can get inconsistent metadata and cause unexpected failure, but this is
> probably tolerable. We would then have to change the allowed actions to
> not request this permission.
>
> In qemu-img, we have these read-only users:
>
> * qemu-img check (without -r): Let's keep this blocked, it will only
>   report lots of leaks and lead to invalid bug reports. I've had my
>   share of them.
>
> * qemu-img compare: Not sure if this makes sense with an image that is
>   in active use?
>
> * qemu-img convert source: Similarly to qemu-img compare, this doesn't
>   make a whole lot of sense, with one exception: People are using -s to
>   extract internal snapshots while the VM is still running, and this
>   usually works because the snapshot doesn't change. However, I don't
>   want to know what happens if you delete the snapshot from the qemu
>   process while qemu-img convert is running... Doing 'qemu-img snapshot
>   -s ...' with a running VM is more tolerated abuse than a supported
>   feature.
>
> * qemu-img info: This one makes perfect sense even with a running VM
>
> * qemu-img map: Results are potentially meaningless with concurrent I/O,
>   but there may be cases where it makes sense.
>
> * qemu-img snapshot -l: Somewhat similar to qemu-img convert -s, except
>   that it's very quick and doesn't access actual data (could even be
>   BDRV_O_NOIO, I think). Allowing this makes sense, I guess.
>
> * qemu-img rebase, safe mode, backing files: If we allowed concurrent
>   writes, it wouldn't be safe.
>
> * qemu-img bench: Well... No. You don't need this on an image with an
>   active VM.
>
> * qemu-img dd: Same as convert, except that there is no -s.
>
> So the list of subcommands where we want to support it is rather short.
> We can change blk_new_open() to clear CONSISTENT_READ for BDRV_O_NOIO,
> which could cover 'info' and 'snapshot -l'.
>
> That leaves us with qemu-io, 'convert -s' and 'map', all of which can be
> imagined to be useful even with a running VM, but all of which can also
> easily produce wrong results in this case. An explicit command line
> option to disable CONSISTENT_READ might be the right tool here.

I'm not sure about this because:

1) this is intrusive from a user PoV: many scripts and upper-layer tools
   will stop working;

2) CONSISTENT_READ is enforced by qcow2 in its .bdrv_child_perm
   implementation even if blk_new_open() doesn't ask for it, therefore
   such an option has to impact the whole graph;

3) this isn't only about asking for the "consistent read" perm, but more
   about granting "write" in shared_perm, so it feels messy.

A perhaps more contained way is to add a BDRV_O_RELAXED_LOCK flag and use
it in those subcommands; then, in the image locking code, the "no other
writer" byte is not locked if the flag is set. This has simpler semantics
and a more manageable scope.

Fam
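
To make the BDRV_O_RELAXED_LOCK idea concrete, here is a rough, standalone
sketch of the decision it implies. It is not QEMU code: the flag values,
the two lock "bytes" and every sketch_* name are made up for illustration,
and the real image locking code may model locks quite differently. The only
point is that a relaxed opener skips the "no other writer" byte, so it no
longer conflicts with a running VM that writes to the image.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values for illustration; these are NOT QEMU's definitions. */
#define SKETCH_O_RDWR          (1u << 0)  /* opener wants to write */
#define SKETCH_O_RELAXED_LOCK  (1u << 1)  /* stand-in for BDRV_O_RELAXED_LOCK */

/* Hypothetical lock bytes an opener may hold on the image file. */
#define LOCK_BYTE_WRITER           (1u << 0)  /* "I am writing" */
#define LOCK_BYTE_NO_OTHER_WRITER  (1u << 1)  /* "nobody else may write" */

/* Decide which lock bytes an opener takes for the given open flags. */
unsigned sketch_lock_bytes(unsigned open_flags)
{
    unsigned bytes = 0;

    if (open_flags & SKETCH_O_RDWR) {
        bytes |= LOCK_BYTE_WRITER;
    }
    /* The proposal: a relaxed opener does not lock "no other writer". */
    if (!(open_flags & SKETCH_O_RELAXED_LOCK)) {
        bytes |= LOCK_BYTE_NO_OTHER_WRITER;
    }
    return bytes;
}

/* Two openers conflict if one writes while the other forbids writers. */
bool sketch_conflict(unsigned a, unsigned b)
{
    return ((a & LOCK_BYTE_WRITER) && (b & LOCK_BYTE_NO_OTHER_WRITER)) ||
           ((b & LOCK_BYTE_WRITER) && (a & LOCK_BYTE_NO_OTHER_WRITER));
}

/* Usage example: a running VM versus a strict and a relaxed reader. */
void sketch_example(void)
{
    unsigned vm      = sketch_lock_bytes(SKETCH_O_RDWR);
    unsigned strict  = sketch_lock_bytes(0);
    unsigned relaxed = sketch_lock_bytes(SKETCH_O_RELAXED_LOCK);

    assert(sketch_conflict(vm, strict));    /* e.g. 'qemu-img check' */
    assert(!sketch_conflict(vm, relaxed));  /* e.g. 'qemu-img info' */
    assert(sketch_conflict(vm, vm));        /* second VM stays blocked */
}
```

In this model a second read-write opener is still rejected, which matches
the concern about launching a second long-running VM on the same image,
while the scope of the relaxation stays limited to the subcommands that
pass the flag.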