> > In short, FVD's internal snapshot achieves the ideal properties of G1-G6,
> > by 1) using the reference count table to only track "static" snapshots, 2)
> > not keeping the reference count table in memory, 3) not updating the
> > on-disk "static" reference count table when the VM runs, and 4)
> > efficiently tracking dynamically allocated blocks by piggybacking on FVD's
> > other features, i.e., its journal and small one-level lookup table.
>
> Are you assuming snapshots are read-only?
>
> It's not clear to me how this would work with writeable snapshots.  It's
> not clear to me that writeable snapshots are really that important, but
> this is an advantage of having a refcount table.
>
> External snapshots are essentially read-only snapshots so I can
> understand the argument for it.

By definition, a snapshot itself must be immutable (read-only), but a
writeable image state can be derived from an immutable snapshot by using
copy-on-write, which I guess is what you meant by "writeable snapshot."
Perhaps the following concrete use cases will make things clear. These
use cases are supported by QCOW2, VMware, and FVD, regardless of the
differences in their internal implementations.

Suppose an image's initial state is:

Image: (current-disk-state-observed-by-the-running-VM)

Below, I simply refer to "current-disk-state-observed-by-the-running-VM"
as "current-state." The VM issues writes and continuously modifies the
"current-state". At one point in time, a snapshot s1 is taken, and the
image becomes:

Image: s1->(current-state)

The VM issues more writes and subsequently takes three snapshots, s2,
s3, and s4. Now the image becomes:

Image: s1->s2->s3->s4->(current-state)

Suppose the action "goto snapshot s2" is taken. It does not affect the
immutable snapshots s1-s4, but the old "current-state" is abandoned and
lost. Now the image becomes:

Image: s1->s2->s3->s4
           |->(current-state)

(Note: depending on your email client, the two lines in the diagram may
not be properly aligned.)

The new "current-state" is writeable and is derived from the immutable
snapshot s2. When the VM issues a write, it does copy-on-write and
stores the dirty data in the "current-state" without modifying the
original snapshot s2. Perhaps this is what you meant by "writeable
snapshot"?

The diagram above is at the conceptual level. In implementation, both
QCOW2 and FVD store all snapshots s1-s4 and the current-state in one
image file, and the snapshots and the current-state may share data
chunks.

Suppose the VM issues some writes and subsequently takes two snapshots,
s5 and s6.
Now the image becomes:

Image: s1->s2->s3->s4
           |->s5->s6->(current-state)

Suppose the action "goto snapshot s2" is taken again. Now the image
becomes:

Image: s1->s2->s3->s4
           |->s5->s6
           |->(current-state)

The new "current-state" is writeable and is derived from the immutable
snapshot s2. Right after the "goto" action, the running VM sees the
state of s2, not the state of s5 that was created after the first
"goto snapshot s2" action. Again, this is because a snapshot itself is
immutable.

Again, all of these use cases are supported by QCOW2, VMware, and FVD,
regardless of the differences in their internal implementations.

Now let's come back to the discussion of FVD. Perhaps my description in
the previous email was not clear. In the diagrams above, FVD's reference
count table only tracks the snapshots (s1, s2, ...), but does not track
the "current-state". Instead, FVD's default mechanism (one-level lookup
table, journal, etc.), which existed even before snapshots were
introduced, already tracks the "current-state". Working together, FVD's
reference count table and its default mechanism track all the states.

In QCOW2, when a new cluster is allocated while handling a running VM's
write request, it updates both the lookup table and the reference count
table, which is unnecessary because their information is redundant. By
contrast, in FVD, when a new chunk is allocated while handling a running
VM's write request, it only updates the lookup table and does not update
the reference count table, because by design the reference count table
does not track the "current-state" and this chunk allocation belongs to
the "current-state."

This is the key reason why FVD can provide all the functions of QCOW2's
internal snapshot without its memory overhead to cache the reference
count table and without its disk I/O overhead to read or write the
reference count table during normal execution of the VM.

Regards,
ChunQiang (CQ) Tang
Homepage: http://www.research.ibm.com/people/c/ctang
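
P.S. In case a concrete model helps, below is a rough Python sketch of
the division of labor described above. It is only a toy under my own
assumptions, not FVD's actual code or on-disk format: the class ToyImage,
the "owned" set (standing in for the chunks that the lookup table and
journal track for the current-state), and the refcount_writes counter are
all hypothetical names made up for illustration. Writes are whole-chunk,
so copy-on-write here is just "allocate a new chunk".

```python
class ToyImage:
    """Toy model only: not FVD's real code or on-disk format."""

    def __init__(self):
        self.chunks = {}         # chunk id -> data (stands in for the file)
        self.lookup = {}         # virtual block -> chunk id (current-state)
        self.owned = set()       # chunks allocated since last snapshot/goto
        self.snapshots = {}      # snapshot name -> frozen lookup table
        self.refcount = {}       # chunk id -> number of snapshots using it
        self.refcount_writes = 0 # how often the refcount table was touched
        self._next_chunk = 0

    def read(self, block):
        return self.chunks[self.lookup[block]]

    def write(self, block, data):
        chunk = self.lookup.get(block)
        if chunk not in self.owned:
            # The old chunk (if any) may belong to a snapshot, so allocate
            # a fresh one. Only the lookup table changes on this path.
            chunk = self._next_chunk
            self._next_chunk += 1
            self.owned.add(chunk)
            self.lookup[block] = chunk
        self.chunks[chunk] = data

    def snapshot(self, name):
        # The refcount table is touched only here, never on the write path.
        self.snapshots[name] = dict(self.lookup)
        for chunk in self.lookup.values():
            self.refcount[chunk] = self.refcount.get(chunk, 0) + 1
            self.refcount_writes += 1
        self.owned.clear()

    def goto(self, name):
        # Abandon the current-state: its private chunks are in no snapshot.
        for chunk in self.owned:
            del self.chunks[chunk]
        self.owned.clear()
        self.lookup = dict(self.snapshots[name])


img = ToyImage()
img.write(0, "a"); img.snapshot("s1")
img.write(0, "b"); img.snapshot("s2")
img.write(0, "c"); img.snapshot("s3")
before = img.refcount_writes

img.goto("s2")
img.write(0, "x")            # dirty data lands in the new current-state
print(img.read(0))           # x
img.goto("s2")               # s2 itself was never modified
print(img.read(0))           # b
print(img.refcount_writes == before)  # True
```

In this toy, the refcount table is touched only when a snapshot is taken;
goto and ordinary writes leave it alone, which is the property I was
trying to describe. Deleting a snapshot would also need to update it, but
I left that out to keep the sketch short.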