Re: [Qemu-devel] Different type of qcow2_get_cluster_type

Eric Blake Tue, 18 Sep 2018 06:51:01 -0700

On 9/18/18 3:45 AM, lampahome wrote:



Both values correspond to L2 entries with bit 0 set.  However,
QCOW2_CLUSTER_ZERO_ALLOC is an entry that has a non-zero value in bits 9-55
(the cluster has an allocated host location, we guarantee that things read
as zero regardless of whether the host data actually contains zeroes at
that offset, and writes go directly to that offset with no further
allocation required); while QCOW2_CLUSTER_ZERO_PLAIN is an entry with all
zeros in bits 9-55 (we guarantee things read as zero, but writes have to
allocate a new cluster because we have not reserved any space in the host
yet).
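
As a rough illustration of that distinction, here is a minimal Python sketch
that classifies an L2 entry using only the bit layout described above. It is
not QEMU's code (which is C); the constant and function names are made up for
this example, and compressed clusters and other flag bits are ignored.

```python
# Illustrative only: these names are hypothetical, not QEMU's API.
ZERO_FLAG = 1 << 0                               # bit 0: cluster reads as zeros
OFFSET_MASK = ((1 << 56) - 1) & ~((1 << 9) - 1)  # bits 9-55: host cluster offset


def classify_zero_entry(l2_entry: int) -> str:
    """Classify an L2 entry according to its zero flag and host offset."""
    if not l2_entry & ZERO_FLAG:
        return "not-zero-flagged"
    if l2_entry & OFFSET_MASK:
        # Zero flag set and a host location reserved: writes can go there.
        return "ZERO_ALLOC"
    # Zero flag set, no host location: a write must allocate a cluster first.
    return "ZERO_PLAIN"
```

With this sketch, an entry of 1 classifies as ZERO_PLAIN, while an entry such
as 0x50001 (zero flag plus a host offset) classifies as ZERO_ALLOC.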


Suppose I set one entry of the l2 table, called l2_addr, to 1 (that is,
QCOW2_CLUSTER_ZERO_PLAIN) to mark it as discarded.

Rather, marking a cluster as QCOW2_CLUSTER_ZERO_PLAIN makes that cluster
have read-as-zero semantics. Another option for discard would be
writing 0 to the l2 table to make the cluster defer to the backing file
(that is what is done when you use 'qemu-img commit', but not something
currently accessible to QMP commands on a live guest).

Note that when there is no backing file, marking a cluster as
unallocated (l2 entry of 0) vs. read-as-zero (l2 entry of 1) has
identical guest-visible behavior; the only time you can tell the two
apart is when there is a backing file. But when there IS a backing file,
marking a cluster as defer-to-backing means that reads from that area
now return the contents of the disk from the backing file. Although
read-after-discard is undefined (guests should NOT be relying on any
specific data to be present after a discard; after all, it is advisory),
the two most common behaviors of discard are 1) no-op (you read what was
there before the discard) and 2) read zeros (you get a stable read).
Marking a cluster as read-as-zero achieves option 2, but marking it as
unallocated defer-to-backing would be a third behavior, 3) read stale
data from some previous point in time. Since a guest might be trying to
use discard to clean up sensitive data (even though such an attempt is
not guaranteed to work, since discard is advisory), it is safer to avoid
behavior 3, as it potentially leaks data that the guest thought had been
discarded.
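
To make the difference concrete, here is a toy Python model of what the guest
reads for a single cluster that holds no data of its own. It is only a sketch
of the semantics described above, not QEMU code; the function name and the
tiny 4-byte "cluster" are assumptions for readability.

```python
# Toy model, not QEMU code: one guest cluster and an optional backing image.
from typing import Optional

ZERO_FLAG = 1          # bit 0 of the L2 entry, as described above
ZEROS = b"\0" * 4      # unrealistically small cluster, for readability


def guest_read(l2_entry: int, backing: Optional[bytes]) -> bytes:
    """Return what the guest reads for a cluster with no data of its own."""
    if l2_entry & ZERO_FLAG:
        # Read-as-zero: the backing file is never consulted.
        return ZEROS
    if l2_entry == 0:
        # Unallocated: defer to the backing file, or zeros if there is none.
        return backing if backing is not None else ZEROS
    raise ValueError("allocated data clusters are outside this sketch")


stale = b"old!"                # contents still present in the backing file
print(guest_read(0, None))     # no backing file: zeros
print(guest_read(1, None))     # no backing file: zeros as well
print(guest_read(1, stale))    # behavior 2: stable zeros
print(guest_read(0, stale))    # behavior 3: stale backing data reappears
```

The last call is the case to avoid for discard: the guest sees data it
believed was gone.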


After I run qemu-img commit on the image, the l2_addr entry is also
committed to its backing file.

But I see that the same l2_addr entry in the backing file's l2 table is
not 1; instead, the corresponding cluster was written with zeros.

Is that normal?

It is normal for committing a read-as-zero cluster to a backing image to
cause the backing image to also read as zeros. Whether it is done by
actually wiping out the cluster in the backing file, or by merely
setting the read-as-zero bit on the l2 entry but otherwise leaving the
cluster allocated, is an implementation detail that shouldn't affect
guest behavior (but may need a tuning knob to affect host allocation
behavior, so that you can choose between keeping an image fully
allocated, vs. aggressively trying to keep the image sparse). You are
welcome to try and submit patches to add such knobs.
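
A toy Python sketch of those two strategies follows, assuming the backing
cluster is already allocated. It is not how QEMU implements commit; the
Cluster type, the commit_zero() helper, and its 'mode' argument are all
hypothetical, and no such knob exists under that name.

```python
# Toy model, not QEMU code: two ways to make a backing cluster read as zeros.
from dataclasses import dataclass

ZERO_FLAG = 1  # bit 0 of the L2 entry


@dataclass
class Cluster:
    l2_entry: int  # host offset bits plus flags
    data: bytes    # what is stored at the host offset


def commit_zero(cluster: Cluster, mode: str) -> Cluster:
    """Commit a read-as-zero cluster onto an allocated backing cluster."""
    if mode == "wipe":
        # Overwrite the host data with zeros; the L2 entry is unchanged.
        return Cluster(cluster.l2_entry, b"\0" * len(cluster.data))
    if mode == "flag":
        # Keep the allocation and old data, but set the read-as-zero bit.
        return Cluster(cluster.l2_entry | ZERO_FLAG, cluster.data)
    raise ValueError(f"unknown mode: {mode}")


def guest_read(cluster: Cluster) -> bytes:
    """Return what the guest sees when reading this cluster."""
    if cluster.l2_entry & ZERO_FLAG:
        return b"\0" * len(cluster.data)
    return cluster.data


old = Cluster(0x50000, b"old!")
wiped = commit_zero(old, "wipe")
flagged = commit_zero(old, "flag")
print(guest_read(wiped) == guest_read(flagged))  # guest cannot tell them apart
print(hex(wiped.l2_entry), hex(flagged.l2_entry))
```

In both cases the guest reads zeros; only the L2 entry and the on-disk
contents differ, which is why seeing an entry other than 1 in the backing
file is not a problem by itself.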


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
