On 2/5/20 8:25 AM, Vladimir Sementsov-Ogievskiy wrote:
But when you truncate them (with PREALLOC_MODE_OFF, as
BlockDriver.bdrv_has_zero_init_truncate()’s comment explains), the new
area is always going to be 0, regardless of initial preallocation.
ah yes, due to qcow2 zero clusters.
Hmm. Do we actually set the zero flag on unallocated clusters when
resizing a qcow2 image? That would be an O(n) operation (we have to
visit the L2 entry for each added cluster, even if only to set the
zero cluster bit). Or do we instead just rely on the fact that qcow2
is inherently sparse, and that when you resize the guest-visible size
without writing any new clusters, then it is only subsequent guest
access to those addresses that finally allocate clusters, making
resize O(1) (update the qcow2 metadata cluster, but not any L2 tables)
while still reading 0 from the new data. To some extent, that's what
the allocation mode is supposed to control.
We must mark as ZERO new cluster at least if there is a _larger_ backing
file, to prevent data from backing file become available for the guest.
But we don't do it. It's a bug and there is fixing series from Kevin,
I've just pinged it:
"[PATCH for-4.2? v3 0/8] block: Fix resize (extending) of short overlays"
There's a difference for a backing file larger than the qcow2 file, and
the protocol layer larger than the qcow2 file. Visually, with the
following four nodes:
f1 [qcow2 format] <- f2 [qcow2 format]
v v
p1 [file protocol] p2 [file protocol]
If f1 is larger than f2, then resizing f2 without writing zero clusters
leaks the data from f1 into f2. The block layer knows this: prior to
this series, bdrv_has_zero_init_truncate() returns 0 if bs->backing is
present; and even in this series, see patch 6/17 which continues to
force a 0 return rather than calling into the driver if the sizes are
suspect. But that is an uncommon corner case; in short, the qcow2
callback .bdrv_has_zero_init_truncate is NOT reachable in that scenario,
whether before or after this series.
On the other hand, if p2 is larger than f2, resizing f2 reads zeroes.
That's because qcow2 HAS to add L2 mappings into p2 before data from p2
can leak, but .bdrv_co_truncate(PREALLOC_MODE_OFF) does not add any L2
mappings. Thus, qcow2 blindly returning 1 for
.bdrv_has_zero_init_truncate was correct (other than the anomaly of
bs->encrypted, also fixed earlier in this series).
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org