On 01/23/2017 11:29 AM, Max Reitz wrote:
> On 23.01.2017 14:30, Alberto Garcia wrote:
>> On Sat 21 Jan 2017 04:27:42 PM CET, Max Reitz wrote:
>>>>> I have optimized these checks. See:
>>>>>
>>>>> http://lists.nongnu.org/archive/html/qemu-block/2015-08/msg00020.html
>>>>>
>>>>> Feel free to review. If you want, I can rebase the series.
>>>>>
>>>>> So far nobody has seriously done so because it wasn't considered
>>>>> important enough, and the patches are not exactly trivial.
>>>>
>>>> I see. I wonder if it's worth complicating the code so much more
>>>> instead of simply disabling the tests.
>>>>
>>>> For the 'refcount-block' check, which is the one that has an
>>>> immediate impact on all kinds of images, I was wondering if we could
>>>> simply use the image size to determine how many entries need to be
>>>> checked. This would keep things much faster unless the image grows
>>>> abnormally large.
>>>
>>> Have you tested that it helps? I would assume that checking one
>>> (mostly) unused cluster of the refcount table does not take longer
>>> than checking a single refcount block. Any entry that's 0 will just
>>> be skipped.
>>
>> The problem is that there are 8000 entries to be skipped. In most
>> cases it doesn't make a difference because the bottleneck is in the
>> hard drive, but if it's fast enough then it starts to show. You can
>> see it easily if you store the qcow2 image in RAM.
>
> No, the problem is that I was just wrong. :-)
>
> By "checking a single refcount block" I meant iterating through its
> entries. But that's wrong, you don't have to do that.
>
> Another thing we might think about is just disabling the overlap check
> by default for the refcount structures. We could just emit a warning
> somewhere else when seeing overly large refcounts (i.e. greater than
> the number of internal snapshots plus one). My reasoning is the
> following:
>
> L1/L2 structures are more important. When overwriting them, you have a
> problem.
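Tangentially, the arithmetic behind Berto's image-size idea above is simple; here is a rough sketch (hypothetical code, not from QEMU — the function names are mine, and it assumes qcow2 defaults of 64 KiB clusters and refcount_bits=16):

```c
/* Hypothetical sketch, not QEMU code: bound the number of
 * refcount-table entries worth checking by the image size. */
#include <assert.h>
#include <stdint.h>

/* Bytes of image covered by one refcount-table entry: a refcount
 * block holds (cluster_size * 8 / refcount_bits) entries, each
 * covering one cluster of cluster_size bytes. */
static uint64_t refblock_coverage(uint64_t cluster_size,
                                  unsigned refcount_bits)
{
    uint64_t entries_per_block = cluster_size * 8 / refcount_bits;
    return entries_per_block * cluster_size;
}

/* Refcount-table entries that can possibly be in use for an image
 * of the given size, rounding up. */
static uint64_t reftable_entries_in_use(uint64_t image_size,
                                        uint64_t cluster_size,
                                        unsigned refcount_bits)
{
    uint64_t cov = refblock_coverage(cluster_size, refcount_bits);
    return (image_size + cov - 1) / cov;
}
```

With the defaults, one refcount-table entry covers 2 GiB, so even a 10 GiB image only needs 5 entries looked at instead of the ~8000 that fit in one reftable cluster.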
> Refcount data can always be recalculated.
>
> Refcount data will only be queried when writing data to the image. If
> that data has been overwritten, we have a chance that it is being set
> to 0 (which is rather large, because 0 generally has a higher
> probability of being a part of data, admittedly). But we also have a
> chance that it is set to something else, which generally will be
> greater than the number of internal snapshots (+ 1). Therefore, such
> corruption should be easily detectable before much data is wrongly
> overwritten.
>
> The drawbacks with this approach would be the following:
> (1) Is printing a warning enough to make the user shut down the VM as
> fast as possible and run qemu-img check?
No.

> (2) It is legal to have a greater refcount than the number of internal
> snapshots plus one. qemu never produces such images, though (or does
> it?). Could there be existing images where users will be just annoyed
> by such warnings? Should we add a runtime option to disable them?

I'd assume it's not legal? If you have a refcount > 1 and delete the
last reference, wouldn't that refcount then be incorrect...? Or I guess
maybe an external tool may be playing games and using it as a "sticky"
bit of sorts?

> And of course another approach I already mentioned would be to scrap
> the overlap checks altogether once we have image locking (and I guess
> we can keep them around in their current form at least until then).
>
> Max

I think it's worth having them. Perhaps they can be disabled by default
if you use image locking, but I wouldn't really advocate for removing
them.

I think your point that refcount blocks are not essential to protect is
probably a sufficient optimization on its own, too. Or I guess we can
merge your ~real~ series to help optimize things.

I know in the little qcheck utility I wrote, I use RB-trees of ranges
that get merged together; the tool doesn't really care which kind of
metadata it is, it just knows that it has "metadata" that should be
protected. I think it's fast enough. If the qcow2 is kept fairly
defragmented, it's REALLY fast. IIRC, your patchset has something
similar, so it should be fast enough too.

--js
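The merged-range approach js describes can be sketched roughly as follows. This is a hypothetical illustration, not code from qcheck or QEMU: a sorted array stands in for the RB-tree, and the names are invented. The point is that once metadata extents are coalesced, the overlap check for a write touches only a handful of merged ranges, regardless of metadata type.

```c
/* Hypothetical sketch of merged metadata ranges (not qcheck/QEMU
 * code).  Ranges are kept sorted by start and coalesced on insert;
 * a real implementation would use an RB-tree instead of an array. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t start, end;   /* [start, end), in bytes */
} Range;

/* Insert a metadata extent, coalescing with any overlapping or
 * adjacent ranges.  The caller must ensure the array has room for
 * one more element.  Returns the new element count. */
static size_t range_add(Range *ranges, size_t n,
                        uint64_t start, uint64_t end)
{
    size_t i = 0, j;

    /* Skip ranges that end strictly before the new one starts. */
    while (i < n && ranges[i].end < start) {
        i++;
    }
    /* Absorb every range that touches [start, end). */
    j = i;
    while (j < n && ranges[j].start <= end) {
        if (ranges[j].start < start) {
            start = ranges[j].start;
        }
        if (ranges[j].end > end) {
            end = ranges[j].end;
        }
        j++;
    }
    /* Close the gap (or open a slot, if nothing was absorbed). */
    memmove(&ranges[i + 1], &ranges[j], (n - j) * sizeof(*ranges));
    n = n - (j - i) + 1;
    ranges[i].start = start;
    ranges[i].end = end;
    return n;
}

/* Would a data write to [start, end) clobber known metadata? */
static bool overlaps_metadata(const Range *ranges, size_t n,
                              uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < n && ranges[i].start < end; i++) {
        if (ranges[i].end > start) {
            return true;
        }
    }
    return false;
}
```

On a defragmented image most metadata coalesces into a few ranges, which is presumably why the check stays fast in that case.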
