On Jan 24, 2014, at 9:05 AM, Jon Bernard <[email protected]> wrote:
> * Vishvananda Ishaya <[email protected]> wrote:
>>
>> On Jan 16, 2014, at 1:28 PM, Jon Bernard <[email protected]> wrote:
>>
>>> * Vishvananda Ishaya <[email protected]> wrote:
>>>>
>>>> On Jan 14, 2014, at 2:10 PM, Jon Bernard <[email protected]> wrote:
>>>>
>>>>>
>>>>> <snip>
>>>>>> As you’ve defined the feature so far, it seems like most of it could
>>>>>> be implemented client side:
>>>>>>
>>>>>> * pause the instance
>>>>>> * snapshot the instance
>>>>>> * snapshot any attached volumes
>>>>>
>>>>> For the first milestone to offer crash-consistent snapshots you are
>>>>> correct. We'll need some additional support from libvirt, but the
>>>>> patchset should be straightforward. The biggest question I have
>>>>> surrounding initial work is whether to use an existing API call or
>>>>> create a new one.
>>>>>
>>>>
>>>> I think you might have missed the “client side” part of this point. I agree
>>>> that snapshotting multiple volumes and packaging it up is valuable, but I was
>>>> trying to make the point that you could do all of this stuff client side
>>>> if you just add support for snapshotting ephemeral drives. An all-in-one
>>>> snapshot command could be valuable, but you are talking about orchestrating
>>>> a lot of commands between nova, glance, and cinder and it could get kind
>>>> of messy to try to run the whole thing from nova.
>>>
>>> If you expose each primitive required, then yes, the client could
>>> implement the logic to call each primitive in the correct order, handle
>>> error conditions, and exit while leaving everything in the correct
>>> state. But that would mean you would have to implement it twice - once
>>> in python-novaclient and once in Horizon. I would speculate that doing
>>> this on the client would be even messier.
>>>
>>> If you are concerned about the complexity of the required interactions,
>>> we could narrow the focus in this way:
>>>
>>> Let's say that taking a full snapshot/backup (all volumes) operates
>>> only on persistent storage volumes. Users who booted from an
>>> ephemeral glance image shouldn't expect this feature because, by
>>> definition, the boot volume is not expected to live a long life.
>>>
>>> This should limit the communication to Nova and Cinder, while leaving
>>> Glance out (initially). If the user booted an instance from a cinder
>>> volume, then we have all the volumes necessary to create an OVA and
>>> import to Glance as a final step. If the boot volume is an image then
>>> I'm not sure, we could go in a few directions:
>>>
>>> 1. No OVA is imported due to lack of boot volume
>>> 2. A copy of the original image is included as a boot volume to create
>>>    an OVA.
>>> 3. Something else I am failing to see.
>>
>>>
>>> If [2] seems plausible, then it probably makes sense to just ask glance
>>> for an image snapshot from nova while the guest is in a paused state.
>>>
>>> Thoughts?
>>
>> This already exists. If you run a snapshot command on a volume backed
>> instance it snapshots all attached volumes. Additionally it does throw a
>> bootable image into glance referring to all of the snapshots. You could
>> modify create image to do this for regular instances as well, specifying
>> block device mapping but keeping the vda as an image. It could even do the
>> same thing with the ephemeral disk without a ton of work. Keeping this all
>> as one command makes a lot of sense except that it is unexpected.
>>
>> There is a benefit to only snapshotting the root drive sometimes because it
>> keeps the image small.
>> Here’s what I see as the ideal end state:
>>
>> Two commands (names are a strawman):
>> create-full-image - image all drives
>> create-root-image - image just the root drive
>>
>> These should work the same regardless of whether the root drive is volume
>> backed, instead of the craziness we have today of volume-backed snapshotting
>> all drives and instance-backed just the root. I’m not sure how we manage
>> expectations based on the current implementation, but perhaps the best idea
>> is just adding this in v3 with new names?
>>
>> FYI the whole OVA thing seems moot since we already have a way of
>> representing multiple drives in glance via block_device_mapping properties.
>
> I've had some time to look closer at nova and rethink things a bit and
> I see what you're saying. You are correct, taking snapshots of attached
> volumes is currently supported - although not in the way that I would
> like to see. And this is where I think we can improve.
>
> Let me first summarize my understanding of what we currently have.
> There are three ways of creating a snapshot-like thing in Nova:
>
> 1. create_image - takes a snapshot of the root volume and may take
>    snapshots of the attached volumes depending on the volume type of
>    the root volume. I/O is not quiesced.
>
> 2. create_backup - takes a snapshot of the root volume with options
>    to specify how often to repeat and how many previous snapshots to
>    keep around. I/O is not quiesced.
>
> 3. os-assisted-snapshot - takes a snapshot of a single cinder volume.
>    The volume is first quiesced before the snapshot is initiated.
>
> My general thesis is that I/O should be quiesced in all cases if the
> underlying driver supports it. Libvirt supports this feature and
> I would like to extend the existing functionality to take advantage of
> it.
>
> It's not reasonable to change the names or behaviour of the existing
> public api calls. Instead I would like to create a new snapshot() call
> in the v3 API.
>
> We only need a quiesce() call added to the driver and the rest of the
> implementation will live in the api layer. Once implemented, the
> existing snapshot calls (image, backup, os-assisted) could use the
> underlying snapshot routines to achieve their expected results, leaving
> us with only one set of snapshot-related functions to maintain.
>
> The new snapshot call would take at least one option: the drives that
> should be snapshotted:
>
>     snapshot(devices=['vda', 'vdb'])
>
> Where a value of None implies all volumes.
>
> This allows the user to snapshot only the root volume if a small
> bootable image is desired.
>
> There will be no exclusion based on volume type; both glance and cinder
> volumes will be snapshotted respectively. Otherwise we reach the
> unexpected behaviour that you mentioned earlier and I agree, it would
> have been confusing.
>
> The flow will look like:
>
> * call the compute node to quiesce
> * call the compute node to snapshot each individual glance drive
> * call the volume driver to snapshot each cinder volume
> * package the whole thing
>
> The final result is an image in glance that references each attached
> volume via its block device mapping. For a cinder-backed instance, the
> glance image would contain no data and only references to cinder
> snapshots. As far as I can tell, glance already supports these
> requirements.
>
> If create_image and create_backup are updated to use this
> implementation, then the behaviour will appear unchanged to the user
> with the exception that I/O was quiesced during the snapshot(s) and they
> therefore have a more reliable and useful result.
>
> Given this, I think it makes more sense to leave the implementation
> within the api layer of Nova so that existing functions can share in the
> implementation - as opposed to moving it into the client.
>
> What are your thoughts? Is this approaching something sensible?

This is starting to look very sensible.
I appreciate you putting a lot of thought into this.

Vish

>
> --
> Jon
>
> _______________________________________________
> OpenStack-dev mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
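
The flow Jon describes in the quoted message (quiesce, snapshot each glance
drive, snapshot each cinder volume, package the result) could be sketched
roughly as follows. This is only an illustration under stated assumptions:
Drive, FakeInstance, and every method name are hypothetical stand-ins, not
Nova, Cinder, or Glance APIs, and the unquiesce step is an assumption that
the thread does not spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Drive:
    name: str     # device name such as 'vda'
    backend: str  # 'glance' (image-backed) or 'cinder' (volume-backed)


@dataclass
class FakeInstance:
    """Hypothetical stand-in for an instance; records the calls it receives."""
    drives: List[Drive]
    log: List[str] = field(default_factory=list)

    def quiesce(self) -> None:
        self.log.append("quiesce")

    def unquiesce(self) -> None:
        # Assumption: I/O is resumed once all snapshots have been taken.
        self.log.append("unquiesce")

    def snapshot_image_drive(self, drive: Drive) -> str:
        self.log.append(f"image-snapshot:{drive.name}")
        return f"image-snap-{drive.name}"

    def snapshot_volume(self, drive: Drive) -> str:
        self.log.append(f"volume-snapshot:{drive.name}")
        return f"volume-snap-{drive.name}"


def snapshot(instance: FakeInstance,
             devices: Optional[List[str]] = None) -> Dict[str, Dict[str, str]]:
    """Snapshot the named devices, or every drive when devices is None."""
    known = {d.name: d for d in instance.drives}
    if devices is None:
        selected = list(instance.drives)
    else:
        missing = [name for name in devices if name not in known]
        if missing:
            raise ValueError(f"unknown devices: {missing}")
        selected = [known[name] for name in devices]

    mapping: Dict[str, str] = {}
    instance.quiesce()
    try:
        # Glance-backed drives first, then cinder volumes, as in the flow above.
        for drive in selected:
            if drive.backend == "glance":
                mapping[drive.name] = instance.snapshot_image_drive(drive)
        for drive in selected:
            if drive.backend == "cinder":
                mapping[drive.name] = instance.snapshot_volume(drive)
    finally:
        instance.unquiesce()

    # "Package the whole thing": one record referencing every snapshot.
    return {"block_device_mapping": mapping}
```

Calling snapshot(instance) with no devices argument covers every drive, while
snapshot(instance, devices=['vda']) yields only the small root-drive result.
No drive is excluded because of its backing type.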
