Whether the hypervisor snapshot happens depends on whether the 'quiesce' option 
is specified with the snapshot request. If a user doesn't care about the 
consistency of their backup, then the hypervisor snapshot/quiesce step can be 
skipped altogether. This of course is not the case if the default provider is 
being used, in which case a hypervisor snapshot is the only way of creating a 
backup since it can't be offloaded to the storage driver.
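
As a rough illustration of that rule, here is a minimal Java sketch. The names 
SnapshotPolicy, needsHypervisorSnapshot, quiesceRequested and 
usesDefaultProvider are hypothetical and are not part of the CloudStack code 
base; they only restate the decision described above.

```java
// Hypothetical helper, for illustration only; not an actual CloudStack class.
final class SnapshotPolicy {
    private SnapshotPolicy() {}

    /**
     * Decides whether the hypervisor snapshot/quiesce step should run.
     *
     * @param quiesceRequested    true if the 'quiesce' option was passed with the snapshot request
     * @param usesDefaultProvider true if the default (non-offloading) provider backs the VM
     */
    static boolean needsHypervisorSnapshot(boolean quiesceRequested, boolean usesDefaultProvider) {
        // The default provider cannot offload to the storage driver, so the
        // hypervisor snapshot is the backup itself and is never skipped.
        return usesDefaultProvider || quiesceRequested;
    }
}
```

With a storage provider that can take the backup itself, the step only runs 
when the user asks for quiesce; with the default provider it always runs.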

-- 
Chris Suich
chris.su...@netapp.com
NetApp Software Engineer
Data Center Platforms – Cloud Solutions
Citrix, Cisco & Red Hat

On Oct 8, 2013, at 4:57 PM, Darren Shepherd <darren.s.sheph...@gmail.com>
 wrote:

> Who is going to decide whether the hypervisor snapshot should actually
> happen or not? Or how?
> 
> Darren
> 
> On Tue, Oct 8, 2013 at 12:38 PM, SuichII, Christopher
> <chris.su...@netapp.com> wrote:
>> 
>> On Oct 8, 2013, at 2:24 PM, Darren Shepherd <darren.s.sheph...@gmail.com> 
>> wrote:
>> 
>>> So in the implementation, when we say "quiesce", is that actually being
>>> implemented as a VM snapshot (memory and disk)?  And then when you say
>>> "unquiesce", you are talking about deleting the VM snapshot?
>> 
>> If the VM snapshot itself is not being taken by the hypervisor, then yes, 
>> the quiesce will actually be a hypervisor snapshot. Just to be clear, the 
>> unquiesce is not quite a delete - it is a collapse of the VM snapshot and 
>> the active VM back into one file.
>> 
>>> 
>>> In NetApp, what are you snapshotting?  The whole netapp volume (I
>>> don't know the correct term), a file on NFS, an iscsi volume?  I don't
>>> know a whole heck of a lot about the netapp snapshot capabilities.
>> 
>> Essentially we are using internal APIs to create file level backups - don't 
>> worry too much about the terminology.
>> 
>>> 
>>> I know storage solutions can snapshot better and faster than
>>> hypervisors can with COW files.  I've personally always been
>>> perplexed about what's the best way to implement it.  For storage
>>> solutions that are block based, it's really easy to have the storage
>>> do the snapshot.  For shared file systems, like NFS, it seems way
>>> more complicated, as you don't want to snapshot the entire filesystem
>>> in order to snapshot one file.
>> 
>> With filesystems like NFS, things are certainly more complicated, but that 
>> is taken care of by our controller's operating system, Data ONTAP, and we 
>> simply use APIs to communicate with it.
>> 
>>> 
>>> Darren
>>> 
>>> On Tue, Oct 8, 2013 at 11:10 AM, SuichII, Christopher
>>> <chris.su...@netapp.com> wrote:
>>>> I can comment on the second half.
>>>> 
>>>> Through storage operations, storage providers can create backups much 
>>>> faster than hypervisors and, over time, their snapshots are more efficient 
>>>> than the snapshot chains that hypervisors create. It is true that a VM 
>>>> snapshot taken at the storage level is slightly different, as it would be 
>>>> pseudo-quiesced and would not have its memory snapshotted. The 
>>>> pseudo-quiescing is accomplished through hypervisor snapshots:
>>>> 
>>>> 1) VM snapshot request (let's say for VM 'A')
>>>> 2) Create hypervisor snapshot (optional)
>>>> -VM 'A' is snapshotted, creating active VM 'A*'
>>>> -All disk traffic now goes to VM 'A*' and 'A' is a snapshot of 'A*'
>>>> 3) Storage driver(s) take snapshots of each volume
>>>> 4) Undo hypervisor snapshot (optional)
>>>> -VM snapshot 'A' is rolled back into VM 'A*' so the hypervisor snapshot no 
>>>> longer exists
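
To make the four steps above concrete, here is a minimal Java sketch of the 
sequence. Everything in it (HypervisorOps, VolumeSnapshotter, 
StorageVmSnapshotFlow and their methods) is a hypothetical stand-in rather 
than a CloudStack API, and running step 4 in a finally block is my own 
assumption about how a failed storage snapshot should be handled.

```java
import java.util.List;

// Hypothetical stand-in for the hypervisor side (steps 2 and 4).
interface HypervisorOps {
    void createSnapshot(String vm);    // 'A' becomes a snapshot, 'A*' becomes active
    void collapseSnapshot(String vm);  // 'A' is rolled back into 'A*'
}

// Hypothetical stand-in for a storage driver (step 3).
interface VolumeSnapshotter {
    void snapshot(String volume);
}

final class StorageVmSnapshotFlow {
    private StorageVmSnapshotFlow() {}

    /** Step 1: a VM snapshot request arrives for the given VM and its volumes. */
    static void run(String vm, List<String> volumes, boolean quiesce,
                    HypervisorOps hypervisor, VolumeSnapshotter storage) {
        if (quiesce) {
            hypervisor.createSnapshot(vm);        // step 2 (optional)
        }
        try {
            for (String volume : volumes) {
                storage.snapshot(volume);         // step 3, once per volume
            }
        } finally {
            if (quiesce) {
                hypervisor.collapseSnapshot(vm);  // step 4 (optional)
            }
        }
    }
}
```

When quiesce is false, only step 3 runs, which is the faster but 
non-consistent path.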
>>>> 
>>>> Now, a couple of notes:
>>>> -The reason this is optional is that not all users necessarily care about 
>>>> the memory or disk consistency of their VMs and would prefer faster 
>>>> snapshots to consistency.
>>>> -Preemptively, yes, we are actually taking hypervisor snapshots, which 
>>>> means there isn't actually a performance gain from taking storage 
>>>> snapshots when quiescing the VM. However, the performance gain will come 
>>>> both during restoring the VM and during normal operations as described 
>>>> above.
>>>> 
>>>> Although you can think of it as a poor man's VM snapshot, I would think of 
>>>> it more as a consistent multi-volume snapshot. Again, the difference being 
>>>> that this snapshot was not truly quiesced like a hypervisor snapshot would 
>>>> be.
>>>> 
>>>> On Oct 8, 2013, at 1:47 PM, Darren Shepherd <darren.s.sheph...@gmail.com> 
>>>> wrote:
>>>> 
>>>>> My only comment is that having the return type as boolean and using
>>>>> that to indicate quiesce behaviour seems obscure and will probably lead
>>>>> to a problem later.  You're basically saying the result of
>>>>> takeVMSnapshot will only ever need to communicate back whether
>>>>> unquiesce needs to happen.  Maybe some result object would be more
>>>>> extensible.
>>>>> 
>>>>> Actually, I think I have more comments.  This seems a bit odd to me.
>>>>> Why would a storage driver in ACS implement VM snapshot
>>>>> functionality?  VM snapshot is really a hypervisor-orchestrated
>>>>> operation.  So it seems like we're trying to implement a poor man's VM
>>>>> snapshot.  Maybe if I understood what NetApp was trying to do it would
>>>>> make more sense, but it's all odd.  To do a proper VM snapshot you need
>>>>> to snapshot memory and disk at the exact same time.  How are we going
>>>>> to do that if ACS is orchestrating the VM snapshot and delegating to
>>>>> storage providers?  It's not like you are going to pause the VM.... or
>>>>> are you?
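
One possible shape for such a result object, sketched in Java. The class 
name DriverSnapshotResult and its fields are hypothetical; nothing like it 
is defined in the proposal, and the details field is only an example of 
something a boolean could not carry.

```java
// Hypothetical result type, for illustration only; not part of the proposed interface.
final class DriverSnapshotResult {
    private final boolean unquiesceRequired;
    private final String details;

    DriverSnapshotResult(boolean unquiesceRequired, String details) {
        this.unquiesceRequired = unquiesceRequired;
        this.details = details;
    }

    /** What the boolean return value was meant to convey. */
    boolean isUnquiesceRequired() {
        return unquiesceRequired;
    }

    /** Room for extra information, so new fields can be added without changing the method signature. */
    String getDetails() {
        return details;
    }
}
```

A driver would then return new DriverSnapshotResult(true, "...") instead of 
true, and callers would read isUnquiesceRequired().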
>>>>> 
>>>>> Darren
>>>>> 
>>>>> On Mon, Oct 7, 2013 at 11:59 AM, Edison Su <edison...@citrix.com> wrote:
>>>>>> I created a design document page at 
>>>>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Pluggable+VM+snapshot+related+operations,
>>>>>>  feel free to add items on it.
>>>>>> And a new branch "pluggable_vm_snapshot" is created.
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: SuichII, Christopher [mailto:chris.su...@netapp.com]
>>>>>>> Sent: Monday, October 07, 2013 10:02 AM
>>>>>>> To: <dev@cloudstack.apache.org>
>>>>>>> Subject: Re: [DISCUSS] Pluggable VM snapshot related operations?
>>>>>>> 
>>>>>>> I'm a fan of option 2 - this gives us the most flexibility (as you
>>>>>>> stated). The option is given to completely override the way VM
>>>>>>> snapshots work AND storage providers are given the opportunity to work
>>>>>>> within the default VM snapshot workflow.
>>>>>>> 
>>>>>>> I believe this option should satisfy your concern, Mike. The snapshot
>>>>>>> and quiesce strategy would be in charge of communicating with the
>>>>>>> hypervisor. Storage providers should be able to leverage the default
>>>>>>> strategies and simply perform the storage operations.
>>>>>>> 
>>>>>>> I don't think it should be much of an issue that new methods on the
>>>>>>> storage driver interface may not apply to everyone. In fact, that is
>>>>>>> already the case. Some methods such as un/maintain(), attachToXXX() and
>>>>>>> takeSnapshot() are already not implemented by every driver - they just
>>>>>>> return false when asked if they can handle the operation.
>>>>>>> 
>>>>>>> On Oct 5, 2013, at 12:11 AM, Mike Tutkowski 
>>>>>>> <mike.tutkow...@solidfire.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Well, my first thought on this is that the storage driver should not
>>>>>>>> be telling the hypervisor to do anything. It should be responsible for
>>>>>>>> creating/deleting volumes, snapshots, etc. on its storage system only.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, Oct 4, 2013 at 5:57 PM, Edison Su <edison...@citrix.com> wrote:
>>>>>>>> 
>>>>>>>>> In 4.2, we added VM snapshots for VMware/XenServer. The current
>>>>>>>>> workflow is as follows:
>>>>>>>>> createVMSnapshot api -> VMSnapshotManagerImpl: creatVMSnapshot ->
>>>>>>>>> send CreateVMSnapshotCommand to hypervisor to create vm snapshot.
>>>>>>>>> 
>>>>>>>>> If anybody wants to change the workflow, they need to either change
>>>>>>>>> VMSnapshotManagerImpl directly or subclass VMSnapshotManagerImpl.
>>>>>>>>> Neither is an ideal choice, as VMSnapshotManagerImpl should be able
>>>>>>>>> to handle different ways of taking a VM snapshot instead of
>>>>>>>>> hard-coding one.
>>>>>>>>> 
>>>>>>>>> The requirements for the pluggable VM snapshot come from:
>>>>>>>>> Storage vendors may have their own optimizations, such as NetApp.
>>>>>>>>> A VM snapshot can be implemented in a totally different way (for
>>>>>>>>> example, I could just send a command to the guest VM to tell my
>>>>>>>>> application to flush the disk and hold disk writes, then come to the
>>>>>>>>> hypervisor to take a volume snapshot).
>>>>>>>>> 
>>>>>>>>> If we agree on enabling pluggable VM snapshots, then we can move on
>>>>>>>>> to discussing how to implement it.
>>>>>>>>> 
>>>>>>>>> The possible options:
>>>>>>>>> 1. Coarse-grained interface. Add a VMSnapshotStrategy interface,
>>>>>>>>> which has the following methods:
>>>>>>>>> VMSnapshot takeVMSnapshot(VMSnapshot vmSnapshot);
>>>>>>>>> Boolean revertVMSnapshot(VMSnapshot vmSnapshot);
>>>>>>>>> Boolean deleteVMSnapshot(VMSnapshot vmSnapshot);
>>>>>>>>> 
>>>>>>>>> The workflow will be: createVMSnapshot api ->
>>>>>>>>> VMSnapshotManagerImpl: creatVMSnapshot -> VMSnapshotStrategy:
>>>>>>>>> takeVMSnapshot
>>>>>>>>> VMSnapshotManagerImpl will manage the VM state and do the sanity
>>>>>>>>> checks, then hand over to VMSnapshotStrategy.
>>>>>>>>> A VMSnapshotStrategy implementation may just send a
>>>>>>>>> create/revert/delete VMSnapshotCommand to the hypervisor host, or
>>>>>>>>> perform any special operations it needs.
>>>>>>>>> 
>>>>>>>>> 2. Fine-grained interface. Not only add a VMSnapshotStrategy
>>>>>>>>> interface, but also add certain methods on the storage driver.
>>>>>>>>> The VMSnapshotStrategy interface will be the same as in option 1.
>>>>>>>>> The following methods will be added on the storage driver:
>>>>>>>>> /* volumesBelongToVM is the list of the VM's volumes that were
>>>>>>>>>    created on this storage. The storage vendor can either take one
>>>>>>>>>    snapshot of all these volumes in one shot, or take a snapshot of
>>>>>>>>>    each volume separately.
>>>>>>>>>    Pre-condition: the VM is quiesced.
>>>>>>>>>    Returns a boolean indicating whether the VM needs to be unquiesced.
>>>>>>>>>    The default storage driver returns false.
>>>>>>>>> */
>>>>>>>>> boolean takeVMSnapshot(List<VolumeInfo> volumesBelongToVM,
>>>>>>>>> VMSnapshot vmSnapshot);
>>>>>>>>> Boolean revertVMSnapshot(List<VolumeInfo> volumesBelongToVM,
>>>>>>>>> VMSnapshot vmSnapshot);
>>>>>>>>> Boolean deleteVMSnapshot(List<VolumeInfo> volumesBelongToVM,
>>>>>>>>> VMSnapshot vmSnapshot);
>>>>>>>>> 
>>>>>>>>> The workflow will be: createVMSnapshot api ->
>>>>>>>>> VMSnapshotManagerImpl: creatVMSnapshot -> VMSnapshotStrategy:
>>>>>>>>> takeVMSnapshot -> storage driver: takeVMSnapshot
>>>>>>>>> In the implementation of VMSnapshotStrategy's takeVMSnapshot, the
>>>>>>>>> pseudo code looks like:
>>>>>>>>>    HypervisorHelper.quiesceVM(vm)
>>>>>>>>>    // Group the VM's volumes by the storage driver that owns them.
>>>>>>>>>    val volumesByDriver = vm.getVolumes().groupBy(volume => volume.getDriver())
>>>>>>>>>    // Must be a var, since it is reassigned for each driver.
>>>>>>>>>    var needUnquiesce = true
>>>>>>>>>    // Call the driver first so that && never short-circuits past it:
>>>>>>>>>    // every driver gets to snapshot its volumes.
>>>>>>>>>    volumesByDriver.foreach { case (driver, volumes) =>
>>>>>>>>>      needUnquiesce = driver.takeVMSnapshot(volumes, vmSnapshot) && needUnquiesce
>>>>>>>>>    }
>>>>>>>>>    if (needUnquiesce) {
>>>>>>>>>      HypervisorHelper.unquiesce(vm)
>>>>>>>>>    }
>>>>>>>>> 
>>>>>>>>> By default, quiesceVM in HypervisorHelper will actually take a VM
>>>>>>>>> snapshot through the hypervisor.
>>>>>>>>> Does the above logic make sense?
>>>>>>>>> 
>>>>>>>>> The pro of option 1 is that it's simple: there is no need to change
>>>>>>>>> the storage driver interfaces. The con is that each storage vendor
>>>>>>>>> needs to implement a strategy, and they may all end up doing the same
>>>>>>>>> thing.
>>>>>>>>> The pro of option 2 is that the storage driver won't need to worry
>>>>>>>>> about how to quiesce/unquiesce the VM. The con is that it adds these
>>>>>>>>> methods to every storage driver, so it assumes that this workflow
>>>>>>>>> will work for everybody.
>>>>>>>>> 
>>>>>>>>> So which option should we take? Or if you have other options, please
>>>>>>>>> let us know.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> *Mike Tutkowski*
>>>>>>>> *Senior CloudStack Developer, SolidFire Inc.*
>>>>>>>> e: mike.tutkow...@solidfire.com
>>>>>>>> o: 303.746.7302
>>>>>>>> Advancing the way the world uses the
>>>>>>>> cloud<http://solidfire.com/solution/overview/?video=play>
>>>>>>>> *(tm)*
>>>>>> 
>>>> 
>> 
