On 11/21/2014 03:56 AM, Max Reitz wrote: > > Second, why are you truncating the file to offset if offset is smaller > than actual_size? That is always wrong. You are blindly assuming: > (1) that the image consists of only data and no metadata, because you're > using the guest disk size (offset) to shrink the host file > (2) that all that data is tightly packed at the beginning of the file > > (1) is obviously wrong. (2) is wrong as well, because clusters may be > spread all over the host file. We would need defragmentation to solve > that issue. > > Therefore, to shorten the host file, you'd need to walk through the > image and find the highest *host* cluster actually in use. Then you can > truncate to after that cluster. However, I'd strongly advise against > that because it doesn't bring any actual benefit.
Well, there is a minor benefit - the 'query-blockstats' QMP command reports wr_highest_offset, which IS the offset of the highest host cluster already in use, but right now, we only populate that field on the first allocating write, whereas I would like to be able to get at that information even for a read-only file. See also Max's thread for optimizing overlap checks, where I mention this same thing as a side-effect we would get for free when improving overlap checks. > So, as for what I think we do need to do when shrinking (and keep in > mind: The offset given to qcow2_truncate() is the guest size! NOT the > host image size!): > > (1) Determine the first L2 table and the first entry in the table which > will lie beyond the new guest disk size. > (2) Discard all clusters beginning from there. > (3) Discard all L2 tables which are then completely empty. You may want to also consider discarding refblocks when completely empty, and truncating the size of the reftable if enough trailing refblocks are dropped. On the other hand, just as Max argued that shrinking the L1 table is going to be in the noise, you are also going to be in the noise for trying to shrink the reftable. > (4) Update the header size. > > And that's it. We can either speed up step (2) by implementing it > manually, or we just use bdrv_discard() on the qcow2 BDS (in the > simplest case: bdrv_discard(bs, DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE), > bs->total_sectors - DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE));. > > We can incorporate step (3) by extending qcow2_discard_clusters() to > free L2 tables when they are empty after discard_single_l2(). But we > don't even have to that now. It's an optimization we can go about later. Same for trimming unused refblocks and/or shrinking the reftable. Also, I think that a future addition of a defragmentation pass would be a more ideal place for tightly packing an image, including trimming reftables to a bare minimum. > > There is one advantage: > - It's extremely simple. It's literally below ten lines of code. > > I think the advantage far outweighs the disadvantage. But I may be > wrong. What do you think? I also like the much simpler approach of leaving holes. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
signature.asc
Description: OpenPGP digital signature