On 2018-04-24 00:33, Eric Blake wrote: > Our code was already checking that we did not attempt to > allocate more clusters than what would fit in an INT64 (the > physical maximimum if we can access a full off_t's worth of > data). But this does not catch smaller limits enforced by > various spots in the qcow2 image description: L1 and normal > clusters of L2 are documented as having bits 63-56 reserved > for other purposes, capping our maximum offset at 64PB (bit > 55 is the maximum bit set). And for compressed images with > 2M clusters, the cap drops the maximum offset to bit 48, or > a maximum offset of 512TB. If we overflow that offset, we > would write compressed data into one place, but try to > decompress from another, which won't work. > > I don't have 512TB handy to prove whether things break if we > compress so much data that we overflow that limit, and don't > think that iotests can (quickly) test it either. Test 138 > comes close (it corrupts an image into thinking something lives > at 32PB, which is half the maximum for L1 sizing - although > it relies on 512-byte clusters). But that test points out > that we will generally hit other limits first (such as running > out of memory for the refcount table, or exceeding file system > limits like 16TB on ext4, etc), so this is more a theoretical > safety valve than something likely to be hit.
You don't need 512 TB, though, 36 MB is sufficient. Here's what you do: (1) Create a 513 TB image with cluster_size=2M,refcount_bits=1 (2) Take a hex editor and enter 16 refblocks into the reftable (3) Fill all of those refblocks with 1s (Funny side note: qemu-img check thinks that image is clean because it doesn't check refcounts beyond the image end...) I've attached a compressed test image (unsurprisingly, it compresses really well). Before this series: $ ./qemu-io -c 'write -c 0 2M' test.qcow2 qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with refcount block); further corruption events will be suppressed write failed: Input/output error Aw. After this series: $ ./qemu-io -c 'write -c 0 2M' test.qcow2 write failed: Input/output error (Normal writes just work fine.) Maybe you want to add a test still -- creating the image is rather quick (well, you have to write 64 MB of 1s, but other than that). The only thing that takes a bit of time is qemu figuring out where the first free cluster is... That takes like 15 seconds here. And another issue of course is... $ ls -lhs test.qcow2 42M -rw-r--r--. 1 maxx maxx 513T 25. Apr 16:42 test.qcow2 Yeah, that. Depends on the host file system, of course, whether that is a real issue. O:-) Max > Signed-off-by: Eric Blake <ebl...@redhat.com> > Reviewed-by: Alberto Garcia <be...@igalia.com> > > --- > v3: use s->cluster_offset_mask instead of open-coding it [Berto], > use MIN() to clamp offset on small cluster size, add spec patch > first to justify clamping even on refcount allocations > --- > block/qcow2.h | 6 ++++++ > block/qcow2-refcount.c | 21 ++++++++++++++------- > 2 files changed, 20 insertions(+), 7 deletions(-) > > diff --git a/block/qcow2.h b/block/qcow2.h > index 1df15a18aa1..a3b9faa9d53 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -41,6 +41,12 @@ > #define QCOW_MAX_CRYPT_CLUSTERS 32 > #define QCOW_MAX_SNAPSHOTS 65536 > > +/* Field widths in qcow2 mean normal cluster offsets cannot reach > + * 64PB; depending on cluster size, compressed clusters can have a > + * smaller limit (64PB for up to 16k clusters, then ramps down to > + * 512TB for 2M clusters). */ > +#define QCOW_MAX_CLUSTER_OFFSET ((1ULL << 56) - 1) > + > /* 8 MB refcount table is enough for 2 PB images at 64k cluster size > * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ > #define QCOW_MAX_REFTABLE_SIZE 0x800000 > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index 6bfc11bb48f..fcbea3c9644 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -31,7 +31,8 @@ > #include "qemu/bswap.h" > #include "qemu/cutils.h" > > -static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size); > +static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size, > + uint64_t max); > static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, > int64_t offset, int64_t length, uint64_t addend, > bool decrease, enum qcow2_discard_type type); > @@ -362,7 +363,8 @@ static int alloc_refcount_block(BlockDriverState *bs, > } > > /* Allocate the refcount block itself and mark it as used */ > - int64_t new_block = alloc_clusters_noref(bs, s->cluster_size); > + int64_t new_block = alloc_clusters_noref(bs, s->cluster_size, > + QCOW_MAX_CLUSTER_OFFSET); > if (new_block < 0) { > return new_block; > } > @@ -954,7 +956,8 @@ int qcow2_update_cluster_refcount(BlockDriverState *bs, > > > /* return < 0 if error */ > -static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size) > +static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size, > + uint64_t max) > { > BDRVQcow2State *s = bs->opaque; > uint64_t i, nb_clusters, refcount; > @@ -979,9 +982,9 @@ retry: > } > > /* Make sure that all offsets in the "allocated" range are representable > - * in an int64_t */ > + * in the requested max */ > if (s->free_cluster_index > 0 && > - s->free_cluster_index - 1 > (INT64_MAX >> s->cluster_bits)) > + s->free_cluster_index - 1 > (max >> s->cluster_bits)) > { > return -EFBIG; > } > @@ -1001,7 +1004,7 @@ int64_t qcow2_alloc_clusters(BlockDriverState *bs, > uint64_t size) > > BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC); > do { > - offset = alloc_clusters_noref(bs, size); > + offset = alloc_clusters_noref(bs, size, QCOW_MAX_CLUSTER_OFFSET); > if (offset < 0) { > return offset; > } > @@ -1083,7 +1086,11 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int > size) > free_in_cluster = s->cluster_size - offset_into_cluster(s, offset); > do { > if (!offset || free_in_cluster < size) { > - int64_t new_cluster = alloc_clusters_noref(bs, s->cluster_size); > + int64_t new_cluster; > + > + new_cluster = alloc_clusters_noref(bs, s->cluster_size, > + MIN(s->cluster_offset_mask, > + QCOW_MAX_CLUSTER_OFFSET)); > if (new_cluster < 0) { > return new_cluster; > } >
test.qcow2.xz
Description: application/xz
signature.asc
Description: OpenPGP digital signature