On 2017-07-13 10:41, Kevin Wolf wrote: > Am 12.07.2017 um 18:58 hat Max Reitz geschrieben: >> On 2017-07-12 16:52, Kevin Wolf wrote: >>> Am 12.07.2017 um 13:46 hat Pavel Butsykin geschrieben: >>>> This patch add shrinking of the image file for qcow2. As a result, this >>>> allows >>>> us to reduce the virtual image size and free up space on the disk without >>>> copying the image. Image can be fragmented and shrink is done by punching >>>> holes >>>> in the image file. >>>> >>>> Signed-off-by: Pavel Butsykin <pbutsy...@virtuozzo.com> >>>> Reviewed-by: Max Reitz <mre...@redhat.com> >>>> --- >>>> block/qcow2-cluster.c | 40 ++++++++++++++++++ >>>> block/qcow2-refcount.c | 110 >>>> +++++++++++++++++++++++++++++++++++++++++++++++++ >>>> block/qcow2.c | 43 +++++++++++++++---- >>>> block/qcow2.h | 14 +++++++ >>>> qapi/block-core.json | 3 +- >>>> 5 files changed, 200 insertions(+), 10 deletions(-) >>>> >>>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c >>>> index f06c08f64c..518429c64b 100644 >>>> --- a/block/qcow2-cluster.c >>>> +++ b/block/qcow2-cluster.c >>>> @@ -32,6 +32,46 @@ >>>> #include "qemu/bswap.h" >>>> #include "trace.h" >>>> >>>> +int qcow2_shrink_l1_table(BlockDriverState *bs, uint64_t exact_size) >>>> +{ >>>> + BDRVQcow2State *s = bs->opaque; >>>> + int new_l1_size, i, ret; >>>> + >>>> + if (exact_size >= s->l1_size) { >>>> + return 0; >>>> + } >>>> + >>>> + new_l1_size = exact_size; >>>> + >>>> +#ifdef DEBUG_ALLOC2 >>>> + fprintf(stderr, "shrink l1_table from %d to %d\n", s->l1_size, >>>> new_l1_size); >>>> +#endif >>>> + >>>> + BLKDBG_EVENT(bs->file, BLKDBG_L1_SHRINK_WRITE_TABLE); >>>> + ret = bdrv_pwrite_zeroes(bs->file, s->l1_table_offset + >>>> + sizeof(uint64_t) * new_l1_size, >>>> + (s->l1_size - new_l1_size) * >>>> sizeof(uint64_t), 0); >>>> + if (ret < 0) { >>>> + return ret; >>>> + } >>>> + >>>> + ret = bdrv_flush(bs->file->bs); >>>> + if (ret < 0) { >>>> + return ret; >>>> + } >>> >>> If we have an error here (or after a partial bdrv_pwrite_zeroes()), we >>> have entries zeroed out on disk, but in memory we still have the >>> original L1 table. >>> >>> Should the in-memory L1 table be zeroed first? Then we can't >>> accidentally reuse stale entries, but would have to allocate new ones, >>> which would get on-disk state and in-memory state back in sync again. >> >> Yes, I thought of the same. But this implies that the allocation is >> able to modify the L1 table, and I find that unlikely if that >> bdrv_flush() failed already... >> >> So I concluded not to have an opinion on which order is better. > > Well, let me ask the other way round: Is there any disadvantage in first > zeroing the in-memory table and then writing to the image?
I was informed that the code would be harder to write. :-) > If I have a choice between "always safe" and "not completely safe, but I > think it's unlikely to happen", especially in image formats, then I will > certainly take the "always safe". > >>>> + BLKDBG_EVENT(bs->file, BLKDBG_L1_SHRINK_FREE_L2_CLUSTERS); >>>> + for (i = s->l1_size - 1; i > new_l1_size - 1; i--) { >>>> + if ((s->l1_table[i] & L1E_OFFSET_MASK) == 0) { >>>> + continue; >>>> + } >>>> + qcow2_free_clusters(bs, s->l1_table[i] & L1E_OFFSET_MASK, >>>> + s->cluster_size, QCOW2_DISCARD_ALWAYS); >>>> + s->l1_table[i] = 0; >>>> + } >>>> + return 0; >>>> +} >>>> + >>>> int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size, >>>> bool exact_size) >>>> { >>> >>> I haven't checked qcow2_shrink_reftable() for similar kinds of problems, >>> I hope Max has. >> >> Well, it's exactly the same there. > > Ok, so I'll object to this code without really having looked at it. I won't object to your objection. O:-) Max
signature.asc
Description: OpenPGP digital signature