Re: [Qemu-devel] [RFC] live snapshot, live merge, live block migration

Jagane Sundar Sun, 22 May 2011 22:43:46 -0700

Hello Stefan,

I have been thinking about this since you sent out this message.
A quick look at the libvirt API indicates that their notion of a
snapshot often refers to a "disk+memory snapshot". It would
be good to provide feedback to the libvirt developers to make
sure that proper support for a 'disk only snapshot' capability is
included.


You might have already seen this, but here's an email chain from
the libvirt mailing list that's relevant:

http://www.redhat.com/archives/libvir-list/2010-March/msg01389.html

I am very interested in enhancing libvirt to support
the Livebackup semantics, for the following reason:
If libvirt can be enhanced to support all the constructs
required for full Livebackup functionality, then I would like to
remove the built-in livebackup network protocol, and rewrite
the client such that it is a native program on the VM host linked
with libvirt, and can perform a full or incremental backup using
libvirt. If a remote backup needs to be performed, then I would
require the remote client to ssh into the VM host, and then
run the local backup and pipe back to the remote backup host.
This way I would not need to deal with authentication of
livebackup client and server, and encryption of the network
connection.

Please see my feedback regarding the specific operations below:

On 5/20/2011 5:19 AM, Stefan Hajnoczi wrote:

I'm interested in what the API for snapshots would look like.
Specifically how does user software do the following:
1. Create a snapshot

For livebackup, one parameter that is required is the 'full' or
'incremental' backup parameter. If the param is 'incremental'
then only the blocks that were modified since the last snapshot
command was issued are part of the snapshot. If the param
is 'full', then the snapshot includes all the blocks of all the disks
in the VM.
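
To make the 'full' versus 'incremental' distinction concrete, here is a
minimal Python sketch for a single disk. The function name, the parameter
names, and the use of a set of dirty cluster numbers are illustrative
assumptions, not part of Livebackup or libvirt.

```python
def clusters_in_snapshot(mode, total_clusters, dirty_since_last):
    """Return the sorted cluster numbers that belong to a snapshot.

    mode             -- 'full' or 'incremental' (hypothetical parameter)
    total_clusters   -- number of clusters on the virtual disk
    dirty_since_last -- set of cluster numbers written since the last
                        snapshot command was issued
    """
    if mode == "full":
        # A full snapshot covers every cluster of the disk.
        return list(range(total_clusters))
    if mode == "incremental":
        # An incremental snapshot covers only what changed.
        return sorted(dirty_since_last)
    raise ValueError("mode must be 'full' or 'incremental'")
```

For a VM with several disks, the same selection would be applied to each
disk in turn.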

2. Delete a snapshot

Simple for livebackup, since no more than one snapshot is
allowed. Hence naming is a non-issue. As is deleting.

3. List snapshots

Again, simple for livebackup, on account of the one
active snapshot restriction.

4. Access data from a snapshot

In traditional terms, access could mean many
things. Some examples:
1. Access lists a set of files on the local
   file system of the VM Host. A small VM
   may be started up, and mount these
   snapshot files as a set of secondary drives
2. Publish the snapshot drives as iSCSI LUNs.
3. If the origin drives are on a Netapp filer,
   perhaps a filer snapshot is created, and
   a URL describing that snapshot is printed
   out.

Access, in Livebackup terms, is merely copying
dirty blocks over from qemu. Livebackup does
not provide a random access mode - i.e. one
where a VM could be started using the snapshot.

Currently, Livebackup uses 4K clusters of 512-byte
blocks. 'Dirty clusters' are transferred over by the
client supplying a 'cluster number' param, and qemu
returning the next 'n' number of contiguous dirty
clusters. At the end, qemu returns a 'no-more-dirty'
error.
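
The exchange described above can be sketched roughly as follows. This is
only an illustration in Python: the dirty state is modeled as a list of
booleans, and next_dirty_run, NoMoreDirty, and max_clusters are
hypothetical names, not the actual Livebackup protocol or qemu code. The
constants assume that a 4K cluster holds 8 blocks of 512 bytes.

```python
BLOCK_SIZE = 512
CLUSTER_SIZE = 4096
BLOCKS_PER_CLUSTER = CLUSTER_SIZE // BLOCK_SIZE  # 8, under the assumption above


class NoMoreDirty(Exception):
    """Stands in for the 'no-more-dirty' error returned at the end."""


def next_dirty_run(dirty, start, max_clusters):
    """Server side: find the next run of contiguous dirty clusters.

    Searches at or after cluster number `start` and returns
    (first_cluster, count) with count <= max_clusters.
    """
    n = len(dirty)
    i = start
    while i < n and not dirty[i]:
        i += 1
    if i == n:
        raise NoMoreDirty()
    j = i
    while j < n and dirty[j] and j - i < max_clusters:
        j += 1
    return i, j - i


def iterate_dirty(dirty, max_clusters):
    """Client side: keep asking until the server reports no more dirty clusters."""
    runs = []
    cursor = 0
    while True:
        try:
            first, count = next_dirty_run(dirty, cursor, max_clusters)
        except NoMoreDirty:
            return runs
        runs.append((first, count))
        cursor = first + count  # resume right after the run just fetched
```

Because the client supplies the cluster number on every request, it
decides when to ask for the next run, which is what allows it to pace
the transfer.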

5. Restore a VM from a snapshot


Additional info for re-creating the VM needs to be
saved when a snapshot is saved. The origin VM's
libvirt XML descriptor should probably be saved
along with the snapshot.

6. Get the dirty blocks list (for incremental backup)

Either a complete dump of the dirty blocks, or a way
to iterate through the dirty blocks and fetch them
needs to be provided. My preference is to use the
iterate through the dirty blocks approach, since
that will enable the client to pace the backup
process and provide guarantees such as 'no more
than 10% of the network b/w will be utilized for
backup'.
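
As a rough illustration of how an iterating client could enforce such a
limit, here is a small Python sketch. The function pacing_delay and its
parameters are hypothetical, and the link capacity is assumed to be known
to the client; nothing here is taken from the Livebackup code.

```python
def pacing_delay(bytes_sent, elapsed_s, link_bytes_per_s, max_percent):
    """Return how many seconds the client should sleep before the next fetch.

    The delay keeps the average transfer rate since the backup started
    at or below max_percent of the link capacity.
    """
    budget_rate = link_bytes_per_s * max_percent / 100  # allowed bytes per second
    min_elapsed = bytes_sent / budget_rate  # earliest time this much data may have been sent
    return max(0.0, min_elapsed - elapsed_s)
```

A client would call this after each batch of dirty clusters, sleep for the
returned time, and then request the next batch. With a bulk dump of all
dirty blocks the client has no such point at which to slow down.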

We've discussed image format-level approaches but I think the scope of
the API should cover several levels at which snapshots are
implemented:
1. Image format - image file snapshot (Jes, Jagane)

Livebackup uses qcow2 to save the Copy-On-Write blocks
that are dirtied by the VM when the snapshot is active.

2. Host file system - ext4 and btrfs snapshots

I have tested with ext4 and raw LVM volumes for the origin
virtual disk files. The qcow2 COW files have only resided on
ext4.

3. Storage system - LVM or SAN volume snapshots

It will be hard to take advantage of more efficient host file system
or storage system snapshots if they are not designed in now.

I agree. A snapshot and restore from backup should not result in
the virtual disk file getting inflated (going from sparse to fully
allocated, for example).

Is anyone familiar enough with the libvirt storage APIs to draft an
extension that adds snapshot support?  I will take a stab at it if no
one else wants to try it.

I have only looked at it briefly, after getting your email message.
If you can take a deeper look at it, I would be willing to work with
you to iron out details.

Thanks,
Jagane
