On 9/18/18 3:45 AM, lampahome wrote:
Both values correspond to L2 entries with bit 0 set. However,
QCOW2_CLUSTER_ZERO_ALLOC is an entry that has a non-zero value in bits 9-55
(the cluster has an allocated host location, we guarantee that things read
as zero regardless of whether the host data actually contains zeroes at
that offset, and writes go directly to that offset with no further
allocation required); while QCOW2_CLUSTER_ZERO_PLAIN is an entry with all
zeros in bits 9-55 (we guarantee things read as zero, but writes have to
allocate a new cluster because we have not reserved any space in the host
yet).
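
As a rough illustration of the distinction above, here is a minimal Python
sketch (not QEMU's actual code) that classifies an L2 entry using only the
two facts stated: bit 0 marks read-as-zero, and bits 9-55 hold the host
location. The names ZERO_FLAG, HOST_OFFSET_MASK and classify_zero_entry
are made up for this example, and every other bit of the entry is ignored.

```python
# Bit layout taken from the explanation above; all names are illustrative.
ZERO_FLAG = 1 << 0                                    # bit 0: read-as-zero
HOST_OFFSET_MASK = ((1 << 56) - 1) & ~((1 << 9) - 1)  # bits 9-55


def classify_zero_entry(l2_entry):
    """Classify an L2 entry using only the two cases described above."""
    if not l2_entry & ZERO_FLAG:
        return "not a zero cluster"
    if l2_entry & HOST_OFFSET_MASK:
        # Host space is reserved; writes can go straight to that offset.
        return "QCOW2_CLUSTER_ZERO_ALLOC"
    # No host space reserved; a write must allocate a new cluster.
    return "QCOW2_CLUSTER_ZERO_PLAIN"


print(classify_zero_entry(0x1))            # bit 0 only
print(classify_zero_entry(0x50000 | 0x1))  # bit 0 plus a host offset
print(classify_zero_entry(0x50000))        # host offset without bit 0
```

An entry of exactly 1 falls into the ZERO_PLAIN case, which is the value
discussed in the rest of this thread.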
Suppose I set one entry of the l2 table, called l2_addr, to 1 (also
QCOW2_CLUSTER_ZERO_PLAIN)
to mark it as discarded.
Rather, marking a cluster as QCOW2_CLUSTER_ZERO_PLAIN makes that cluster
have read-as-zero semantics. Another option for discard would be
writing 0 to the l2 table to make the cluster defer to the backing file
(that is what is done when you use 'qemu-img commit', but not something
currently accessible to QMP commands on a live guest).
Note that when there is no backing file, marking a cluster as
unallocated (l2 entry of 0) vs. read-as-zero (l2 entry of 1) has
identical guest-visible behavior; the only time you can tell the two
apart is when there is a backing file. But when there IS a backing file,
marking a cluster as defer-to-backing means that reads from that area
now revive the contents of the disk from the backing file. Although
read-after-discard is undefined (guests should NOT be relying on any
specific data to be present after a discard - after all it is advisory),
the two most common behaviors of discard are 1) no-op (you read what was
there before the discard) and 2) read zeros (you get a stable read).
Marking a cluster as read-as-zero achieves option 2, but marking it as
unallocated (defer-to-backing) would be a third behavior, 3) read stale
data from some previous point in time. Since a guest might be trying to
use discard to clean up sensitive data (even though such an attempt is
not guaranteed to work, since discard is advisory), it is safer to avoid
behavior 3 as it potentially leaks data to the guest that it previously
thought was indeed discarded.
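
To make the difference concrete, here is a toy Python model of what the
guest reads for an l2 entry of 0 versus 1, with and without a backing
file. It is only a sketch under simplifying assumptions: guest_read,
CLUSTER_SIZE and the sample data are hypothetical, the cluster size is
tiny on purpose, and entries other than 0 or read-as-zero are not
modelled.

```python
CLUSTER_SIZE = 8  # deliberately tiny; real clusters are much larger


def guest_read(l2_entry, backing_cluster=None):
    """Return what the guest would read for one cluster in this toy model."""
    zeros = bytes(CLUSTER_SIZE)
    if l2_entry & 1:
        # read-as-zero: the backing file is never consulted
        return zeros
    if l2_entry == 0:
        # unallocated: defer to the backing file, or zeros if there is none
        return backing_cluster if backing_cluster is not None else zeros
    raise ValueError("entry kind not modelled in this sketch")


stale = b"SECRET!!"  # data still present in the backing file

# No backing file: the two entries are indistinguishable to the guest.
print(guest_read(0) == guest_read(1))

# With a backing file: entry 1 reads zeros, entry 0 exposes the stale data.
print(guest_read(1, stale))
print(guest_read(0, stale))
```

The last call is behavior 3 from the list above: the guest sees data it
believed was discarded.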
After I run qemu-img commit on the image, l2_addr is also committed to its
backing file.
But I saw that the same l2_addr entry of the l2 table in the backing file
doesn't show 1; instead, the corresponding cluster is written with zeros.
Is that normal?
It is normal for committing a read-as-zero cluster to a backing image to
cause the backing image to also read as zeros. Whether it is done by
actually wiping out the cluster in the backing file, or by merely
setting the read-as-zero bit on the l2 entry but otherwise leaving the
cluster allocated, is an implementation detail that shouldn't affect
guest behavior (but may need a tuning knob to affect host allocation
behavior, so that you can choose between keeping an image fully
allocated, vs. aggressively trying to keep the image sparse). You are
welcome to try and submit patches to add such knobs.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org