Yeah, I'm not really clear either on how the snapshot strategy works if you
have multiple vendors that implement that interface.


On Wed, Oct 9, 2013 at 10:12 PM, Darren Shepherd <
darren.s.sheph...@gmail.com> wrote:

> Edison,
>
> I would lean toward doing the coarse-grained interface only.  I'm having
> a hard time seeing how the whole flow is generic and makes sense for
> everyone.  By starting with the coarse-grained interface you avoid
> possible upfront over-engineering/over-design that could wreak havoc
> down the line.  If you implement the VMSnapshotStrategy and find that
> it really is useful to other implementations, you can then implement
> the fine-grained interface later to allow others to benefit from it.
>
> Darren
>
> On Wed, Oct 9, 2013 at 8:54 PM, Mike Tutkowski
> <mike.tutkow...@solidfire.com> wrote:
> > Hey guys,
> >
> > I haven't been giving this thread much attention, but am reviewing it
> > somewhat now.
> >
> > I'm not really clear how this would work if, say, a VM has two data disks
> > and they are not being provided by the same vendor.
> >
> > Can someone clarify that for me?
> >
> > My understanding of how this works today is that it doesn't matter. For
> > XenServer, a VDI is on an SR, which could be supported by storage vendor X.
> > Another VDI could be on another SR, supported by storage vendor Y.
> >
> > In this case, a new VDI appears on each SR after a hypervisor snapshot.
> >
> > Same idea for VMware.
> >
> > I don't really know how (or if) this works for KVM.
> >
> > I'm not clear how this multi-vendor situation would play out in this
> > pluggable approach.
> >
> > Thanks!
> >
> >
> > On Tue, Oct 8, 2013 at 4:43 PM, Edison Su <edison...@citrix.com> wrote:
> >
> >>
> >>
> >> > -----Original Message-----
> >> > From: Darren Shepherd [mailto:darren.s.sheph...@gmail.com]
> >> > Sent: Tuesday, October 08, 2013 2:54 PM
> >> > To: dev@cloudstack.apache.org
> >> > Subject: Re: [DISCUSS] Pluggable VM snapshot related operations?
> >> >
> >> > A hypervisor snapshot will snapshot memory also.  So determining
> whether
> >> Memory is optional for a hypervisor VM snapshot, a.k.a. a "disk-only
> >> snapshot":
> >> http://support.citrix.com/proddocs/topic/xencenter-61/xs-xc-vms-snapshots-about.html
> >> It's supported by XenServer, KVM, and VMware.
> >>
> >> > to do the hypervisor snapshot based on the quiesce option does not
> >> > seem proper.
> >> >
> >> > Sorry for all the questions; I'm trying to understand whether this
> >> > functionality makes sense at this point in the code or if maybe there
> >> > is a different approach.  This is what I'm seeing; what if we state it
> >> > this way:
> >> >
> >> > 1) VM snapshots, AFAIK, are not backed up today and exist solely on
> >> > primary storage.  What if we added a backup phase to VM snapshots that
> >> > storage providers can optionally support in order to back up the VM
> >> > snapshot volumes?
> >> It's not about backing up the VM snapshot, it's about how to take the VM
> >> snapshot.  Usually, taking/reverting a VM snapshot is handled by the
> >> hypervisor itself, but in the NetApp (or other storage vendor) case,
> >> they want to change the default behavior of hypervisor-based VM snapshots.
> >>
> >> Some examples:
> >> 1. Take hypervisor-based VM snapshots: on primary storage, the hypervisor
> >> will maintain the snapshot chain.
> >> 2. Take a VM snapshot through NetApp:
> >>      a. First, quiesce the VM if the user specified it. There is no
> >> separate API to quiesce a VM on the hypervisor, so here we take a VM
> >> snapshot through a hypervisor API call; the hypervisor will take a
> >> volume snapshot of each volume of the VM. Let's say, on the primary
> >> storage, the disk chain then looks like:
> >>
> >>              base-image
> >>                   |
> >>                   v
> >>              parent disk
> >>               /        \
> >>              v          v
> >>      current disk    snapshot-a
> >>
> >>      b. From snapshot-a, find its parent disk, then take a snapshot of it
> >> through NetApp.
> >>      c. Un-quiesce the VM: go to the hypervisor and delete snapshot
> >> "snapshot-a"; the hypervisor should be able to consolidate the current
> >> disk and the "parent disk" into one disk, so from the hypervisor's point
> >> of view there is always, at most, one snapshot for the VM.
> >>     To revert a VM snapshot, as long as the VM is stopped, NetApp can
> >> revert the snapshot created on NetApp storage easily and efficiently.
> >>     The benefit of this whole process, as Chris pointed out, is that if
> >> the snapshot chain is quite long, a hypervisor-based VM snapshot will
> >> take a performance hit.
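To make the flow described above a bit more concrete, here is a rough sketch of what a NetApp-style VMSnapshotStrategy might do. All class and method names (HypervisorHelper, NetAppClient, takeDiskOnlySnapshot, getParentDisk, etc.) are illustrative assumptions, not existing CloudStack or NetApp APIs:

    // Rough sketch (illustrative names only) of the NetApp-style strategy:
    // quiesce via a temporary hypervisor snapshot, snapshot each parent disk
    // on the array, then drop the hypervisor snapshot to keep the chain short.
    public VMSnapshot takeVMSnapshot(VirtualMachine vm, VMSnapshot vmSnapshot) {
        // a. Quiesce: take a disk-only hypervisor snapshot ("snapshot-a"),
        //    which freezes a read-only parent disk for every volume.
        String snapshotA = HypervisorHelper.takeDiskOnlySnapshot(vm);

        // b. For each volume, snapshot the parent disk on the NetApp array.
        for (VolumeInfo volume : vm.getVolumes()) {
            String parentDisk = HypervisorHelper.getParentDisk(snapshotA, volume);
            NetAppClient.snapshotFile(volume.getDataStore(), parentDisk, vmSnapshot.getName());
        }

        // c. Un-quiesce: delete "snapshot-a" so the hypervisor consolidates
        //    the current disk and the parent disk back into one.
        HypervisorHelper.deleteSnapshot(vm, snapshotA);
        return vmSnapshot;
    }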
> >>
> >> >
> >> > 2) Additionally, you want to be able to back up multiple disks at once,
> >> > regardless of VM snapshots.  Why don't we add the ability to put
> >> > volumeIds in the snapshot cmd, so that if the storage provider supports
> >> > it, it will get a batch of volumeIds?
> >> >
> >> > Now I know we talked about 2 and there were some concerns about it
> >> > (mostly from me), but I think we could work through those concerns
> >> > (I forget what they were...).  Right now I just get the feeling we are
> >> > shoehorning some functionality into VM snapshot that isn't quite the
> >> > right fit.  The "no quiesce" flow just doesn't seem to make sense to me.
> >>
> >>
> >> Not sure whether the NetApp-proposed workflow above makes sense to you or
> >> to anybody else. If this workflow is specific only to NetApp, then we
> >> don't need to enforce the whole process for everybody.
> >>
> >> >
> >> > Darren
> >> >
> >> > On Tue, Oct 8, 2013 at 2:05 PM, SuichII, Christopher
> >> > <chris.su...@netapp.com> wrote:
> >> > > Whether the hypervisor snapshot happens depends on whether the
> >> > > 'quiesce' option is specified with the snapshot request. If a user
> >> > > doesn't care about the consistency of their backup, then the
> >> > > hypervisor snapshot/quiesce step can be skipped altogether. This of
> >> > > course is not the case if the default provider is being used, in
> >> > > which case a hypervisor snapshot is the only way of creating a
> >> > > backup since it can't be offloaded to the storage driver.
> >> > >
> >> > > --
> >> > > Chris Suich
> >> > > chris.su...@netapp.com
> >> > > NetApp Software Engineer
> >> > > Data Center Platforms - Cloud Solutions Citrix, Cisco & Red Hat
> >> > >
> >> > > On Oct 8, 2013, at 4:57 PM, Darren Shepherd
> >> > > <darren.s.sheph...@gmail.com>
> >> > >  wrote:
> >> > >
> >> > >> Who is going to decide whether the hypervisor snapshot should
> >> > >> actually happen or not? Or how?
> >> > >>
> >> > >> Darren
> >> > >>
> >> > >> On Tue, Oct 8, 2013 at 12:38 PM, SuichII, Christopher
> >> > >> <chris.su...@netapp.com> wrote:
> >> > >>>
> >> > >>> --
> >> > >>> Chris Suich
> >> > >>> chris.su...@netapp.com
> >> > >>> NetApp Software Engineer
> >> > >>> Data Center Platforms - Cloud Solutions Citrix, Cisco & Red Hat
> >> > >>>
> >> > >>> On Oct 8, 2013, at 2:24 PM, Darren Shepherd
> >> > <darren.s.sheph...@gmail.com> wrote:
> >> > >>>
> >> > >>>> So in the implementation, when we say "quiesce," is that actually
> >> > >>>> being implemented as a VM snapshot (memory and disk)?  And then
> >> > >>>> when you say "unquiesce," are you talking about deleting the VM
> >> > >>>> snapshot?
> >> > >>>
> >> > >>> If the VM snapshot is not going to the hypervisor, then yes, the
> >> > >>> quiesce will actually be implemented as a hypervisor snapshot. Just
> >> > >>> to be clear, the unquiesce is not quite a delete - it is a collapse
> >> > >>> of the VM snapshot and the active VM back into one file.
> >> > >>>
> >> > >>>>
> >> > >>>> In NetApp, what are you snapshotting?  The whole NetApp volume (I
> >> > >>>> don't know the correct term), a file on NFS, an iSCSI volume?  I
> >> > >>>> don't know a whole heck of a lot about the NetApp snapshot
> >> > >>>> capabilities.
> >> > >>>
> >> > >>> Essentially we are using internal APIs to create file-level
> >> > >>> backups - don't worry too much about the terminology.
> >> > >>>
> >> > >>>>
> >> > >>>> I know storage solutions can snapshot better and faster than
> >> > >>>> hypervisors can with COW files.  I've personally just always been
> >> > >>>> perplexed about what's the best way to implement it.  For storage
> >> > >>>> solutions that are block based, it's really easy to have the
> >> > >>>> storage do the snapshot.  For shared file systems, like NFS, it
> >> > >>>> seems way more complicated, as you don't want to snapshot the
> >> > >>>> entire filesystem in order to snapshot one file.
> >> > >>>
> >> > >>> With filesystems like NFS, things are certainly more complicated,
> >> > >>> but that is taken care of by our controller's operating system,
> >> > >>> Data ONTAP, and we simply use APIs to communicate with it.
> >> > >>>
> >> > >>>>
> >> > >>>> Darren
> >> > >>>>
> >> > >>>> On Tue, Oct 8, 2013 at 11:10 AM, SuichII, Christopher
> >> > >>>> <chris.su...@netapp.com> wrote:
> >> > >>>>> I can comment on the second half.
> >> > >>>>>
> >> > >>>>> Through storage operations, storage providers can create backups
> >> > >>>>> much faster than hypervisors, and over time their snapshots are
> >> > >>>>> more efficient than the snapshot chains that hypervisors create.
> >> > >>>>> It is true that a VM snapshot taken at the storage level is
> >> > >>>>> slightly different, as it would be pseudo-quiesced and would not
> >> > >>>>> have its memory snapshotted. This is accomplished through
> >> > >>>>> hypervisor snapshots:
> >> > >>>>>
> >> > >>>>> 1) VM snapshot request (let's say for VM 'A')
> >> > >>>>> 2) Create hypervisor snapshot (optional)
> >> > >>>>>    - VM 'A' is snapshotted, creating active VM 'A*'
> >> > >>>>>    - All disk traffic now goes to VM 'A*' and 'A' is a snapshot of 'A*'
> >> > >>>>> 3) Storage driver(s) take snapshots of each volume
> >> > >>>>> 4) Undo hypervisor snapshot (optional)
> >> > >>>>>    - VM snapshot 'A' is rolled back into VM 'A*' so the hypervisor
> >> > >>>>>      snapshot no longer exists
> >> > >>>>>
> >> > >>>>> Now, a couple of notes:
> >> > >>>>> - The reason this is optional is that not all users necessarily
> >> > >>>>>   care about the memory or disk consistency of their VMs and
> >> > >>>>>   would prefer faster snapshots to consistency.
> >> > >>>>> - Preemptively, yes, we are actually taking hypervisor snapshots,
> >> > >>>>>   which means there isn't actually a performance gain from taking
> >> > >>>>>   storage snapshots when quiescing the VM. However, the
> >> > >>>>>   performance gain will come both during restoring the VM and
> >> > >>>>>   during normal operations, as described above.
> >> > >>>>>
> >> > >>>>> Although you can think of it as a poor man's VM snapshot, I
> >> > >>>>> would think of it more as a consistent multi-volume snapshot.
> >> > >>>>> Again, the difference being that this snapshot was not truly
> >> > >>>>> quiesced like a hypervisor snapshot would be.
> >> > >>>>>
> >> > >>>>> --
> >> > >>>>> Chris Suich
> >> > >>>>> chris.su...@netapp.com
> >> > >>>>> NetApp Software Engineer
> >> > >>>>> Data Center Platforms - Cloud Solutions Citrix, Cisco & Red Hat
> >> > >>>>>
> >> > >>>>> On Oct 8, 2013, at 1:47 PM, Darren Shepherd
> >> > <darren.s.sheph...@gmail.com> wrote:
> >> > >>>>>
> >> > >>>>>> My only comment is that having the return type as boolean and
> >> > >>>>>> using that to indicate quiesce behaviour seems obscure and will
> >> > >>>>>> probably lead to a problem later.  You're basically saying the
> >> > >>>>>> result of takeVMSnapshot will only ever need to communicate
> >> > >>>>>> back whether unquiesce needs to happen.  Maybe some result
> >> > >>>>>> object would be more extensible.
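For illustration, a minimal sketch of what such a result object could look like; the class and field names here are hypothetical, not an existing CloudStack type:

    // Hypothetical result type: richer than a bare boolean, and extensible
    // if strategies later need to report more than the unquiesce decision.
    public class VMSnapshotDriverResult {
        private final boolean needUnquiesce;  // should the caller undo the hypervisor snapshot?
        private final String details;         // optional vendor-specific message

        public VMSnapshotDriverResult(boolean needUnquiesce, String details) {
            this.needUnquiesce = needUnquiesce;
            this.details = details;
        }

        public boolean needUnquiesce() { return needUnquiesce; }
        public String getDetails() { return details; }
    }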
> >> > >>>>>>
> >> > >>>>>> Actually, I think I have more comments.  This seems a bit odd
> >> > >>>>>> to me.  Why would a storage driver in ACS implement VM snapshot
> >> > >>>>>> functionality?  A VM snapshot is really a hypervisor-orchestrated
> >> > >>>>>> operation.  So it seems like we're trying to implement a poor
> >> > >>>>>> man's VM snapshot.  Maybe if I understood what NetApp was trying
> >> > >>>>>> to do it would make more sense, but it's all odd.  To do a proper
> >> > >>>>>> VM snapshot you need to snapshot memory and disk at the exact
> >> > >>>>>> same time.  How are we going to do that if ACS is orchestrating
> >> > >>>>>> the VM snapshot and delegating to storage providers?  It's not
> >> > >>>>>> like you are going to pause the VM... or are you?
> >> > >>>>>>
> >> > >>>>>> Darren
> >> > >>>>>>
> >> > >>>>>> On Mon, Oct 7, 2013 at 11:59 AM, Edison Su <
> edison...@citrix.com>
> >> > wrote:
> >> > >>>>>>> I created a design document page at
> >> > >>>>>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Pluggable+VM+snapshot+related+operations
> >> > >>>>>>> - feel free to add items to it.
> >> > >>>>>>> And a new branch "pluggable_vm_snapshot" has been created.
> >> > >>>>>>>
> >> > >>>>>>>> -----Original Message-----
> >> > >>>>>>>> From: SuichII, Christopher [mailto:chris.su...@netapp.com]
> >> > >>>>>>>> Sent: Monday, October 07, 2013 10:02 AM
> >> > >>>>>>>> To: <dev@cloudstack.apache.org>
> >> > >>>>>>>> Subject: Re: [DISCUSS] Pluggable VM snapshot related
> operations?
> >> > >>>>>>>>
> >> > >>>>>>>> I'm a fan of option 2 - this gives us the most flexibility
> >> > >>>>>>>> (as you stated). The option is given to completely override
> >> > >>>>>>>> the way VM snapshots work, AND storage providers are given the
> >> > >>>>>>>> opportunity to work within the default VM snapshot workflow.
> >> > >>>>>>>>
> >> > >>>>>>>> I believe this option should satisfy your concern, Mike. The
> >> > >>>>>>>> snapshot and quiesce strategy would be in charge of
> >> > >>>>>>>> communicating with the hypervisor.  Storage providers should
> >> > >>>>>>>> be able to leverage the default strategies and simply perform
> >> > >>>>>>>> the storage operations.
> >> > >>>>>>>>
> >> > >>>>>>>> I don't think it should be much of an issue that new methods
> >> > >>>>>>>> on the storage driver interface may not apply to everyone. In
> >> > >>>>>>>> fact, that is already the case.  Some methods such as
> >> > >>>>>>>> un/maintain(), attachToXXX() and takeSnapshot() are already
> >> > >>>>>>>> not implemented by every driver - they just return false when
> >> > >>>>>>>> asked if they can handle the operation.
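A rough illustration of that capability-check pattern, using stand-in names rather than the actual CloudStack driver interface (VolumeInfo is the CloudStack volume abstraction; everything else here is hypothetical):

    import java.util.List;

    // Hypothetical illustration of the existing pattern: a driver that does
    // not support an operation simply says so, and the caller falls back to
    // the default flow.
    interface ExampleStorageDriver {
        boolean canTakeVMSnapshot();                        // capability check
        boolean takeVMSnapshot(List<VolumeInfo> volumes);   // only called when supported
    }

    class BasicDriver implements ExampleStorageDriver {
        public boolean canTakeVMSnapshot() { return false; }   // not supported here
        public boolean takeVMSnapshot(List<VolumeInfo> volumes) {
            throw new UnsupportedOperationException("not supported");
        }
    }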
> >> > >>>>>>>>
> >> > >>>>>>>> --
> >> > >>>>>>>> Chris Suich
> >> > >>>>>>>> chris.su...@netapp.com
> >> > >>>>>>>> NetApp Software Engineer
> >> > >>>>>>>> Data Center Platforms - Cloud Solutions Citrix, Cisco & Red
> Hat
> >> > >>>>>>>>
> >> > >>>>>>>> On Oct 5, 2013, at 12:11 AM, Mike Tutkowski
> >> > >>>>>>>> <mike.tutkow...@solidfire.com>
> >> > >>>>>>>> wrote:
> >> > >>>>>>>>
> >> > >>>>>>>>> Well, my first thought on this is that the storage driver
> >> > >>>>>>>>> should not be telling the hypervisor to do anything. It
> should
> >> > >>>>>>>>> be responsible for creating/deleting volumes, snapshots,
> etc.
> >> on
> >> > its storage system only.
> >> > >>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>>>>>> On Fri, Oct 4, 2013 at 5:57 PM, Edison Su <
> >> edison...@citrix.com>
> >> > wrote:
> >> > >>>>>>>>>
> >> > >>>>>>>>>> In 4.2, we added VM snapshots for VMware/XenServer. The
> >> > >>>>>>>>>> current workflow is like the following:
> >> > >>>>>>>>>> createVMSnapshot api -> VMSnapshotManagerImpl:
> >> > >>>>>>>>>> creatVMSnapshot -> send CreateVMSnapshotCommand to the
> >> > >>>>>>>>>> hypervisor to create the VM snapshot.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> If anybody wants to change the workflow, they need to either
> >> > >>>>>>>>>> change VMSnapshotManagerImpl directly or subclass
> >> > >>>>>>>>>> VMSnapshotManagerImpl.  Neither is an ideal choice, as
> >> > >>>>>>>>>> VMSnapshotManagerImpl should be able to handle different
> >> > >>>>>>>>>> ways of taking a VM snapshot, instead of hard-coding one.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> The requirements for pluggable VM snapshots come from:
> >> > >>>>>>>>>> Storage vendors may have their own optimizations, such as
> >> > >>>>>>>>>> NetApp.  VM snapshots can be implemented in a totally
> >> > >>>>>>>>>> different way (for example, I could just send a command to
> >> > >>>>>>>>>> the guest VM to tell my application to flush disks and hold
> >> > >>>>>>>>>> disk writes, then go to the hypervisor to take a volume
> >> > >>>>>>>>>> snapshot).
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> If we agree on enabling pluggable VM snapshots, then we can
> >> > >>>>>>>>>> move on to discuss how to implement it.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> The possible options:
> >> > >>>>>>>>>> 1. Coarse-grained interface. Add a VMSnapshotStrategy
> >> > >>>>>>>>>> interface, which has the following methods:
> >> > >>>>>>>>>>     VMSnapshot takeVMSnapshot(VMSnapshot vmSnapshot);
> >> > >>>>>>>>>>     Boolean revertVMSnapshot(VMSnapshot vmSnapshot);
> >> > >>>>>>>>>>     Boolean deleteVMSnapshot(VMSnapshot vmSnapshot);
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> The workflow will be: createVMSnapshot api ->
> >> > >>>>>>>>>> VMSnapshotManagerImpl: creatVMSnapshot ->
> >> > >>>>>>>>>> VMSnapshotStrategy: takeVMSnapshot.
> >> > >>>>>>>>>> VMSnapshotManagerImpl will manage VM state and do the sanity
> >> > >>>>>>>>>> checks, then hand over to VMSnapshotStrategy.
> >> > >>>>>>>>>> A VMSnapshotStrategy implementation may just send a
> >> > >>>>>>>>>> Create/Revert/DeleteVMSnapshotCommand to the hypervisor
> >> > >>>>>>>>>> host, or do any special operations it needs.
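A rough sketch of the option 1 hand-off, assuming a hypothetical injected strategy and hypothetical lookup/validation helpers (not the final design):

    // Sketch only: VMSnapshotManagerImpl validates and manages VM state,
    // then hands the actual work to whichever strategy is plugged in.
    public VMSnapshot creatVMSnapshot(Long vmId, Long vmSnapshotId) {
        UserVmVO vm = validateVm(vmId);                        // hypothetical sanity/state checks
        VMSnapshot vmSnapshot = findVMSnapshot(vmSnapshotId);  // hypothetical lookup

        // Delegate: the default strategy sends CreateVMSnapshotCommand to the
        // hypervisor; a vendor strategy (e.g. NetApp) can do something else.
        return vmSnapshotStrategy.takeVMSnapshot(vmSnapshot);
    }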
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> 2. Fine-grained interface. Not only add a VMSnapshotStrategy
> >> > >>>>>>>>>> interface, but also add certain methods to the storage
> >> > >>>>>>>>>> driver.  The VMSnapshotStrategy interface will be the same
> >> > >>>>>>>>>> as in option 1.  We will add the following methods to the
> >> > >>>>>>>>>> storage driver:
> >> > >>>>>>>>>> /* volumesBelongToVM is the list of volumes of the VM that
> >> > >>>>>>>>>>    are created on this storage; the storage vendor can either
> >> > >>>>>>>>>>    take one snapshot of these volumes in one shot, or take a
> >> > >>>>>>>>>>    snapshot of each volume separately.
> >> > >>>>>>>>>>    Pre-condition: the VM has been quiesced.
> >> > >>>>>>>>>>    It returns a Boolean to indicate whether the VM needs to
> >> > >>>>>>>>>>    be unquiesced or not.
> >> > >>>>>>>>>>    The default storage driver will return false.
> >> > >>>>>>>>>> */
> >> > >>>>>>>>>> boolean takeVMSnapshot(List<VolumeInfo> volumesBelongToVM,
> >> > >>>>>>>>>>     VMSnapshot vmSnapshot);
> >> > >>>>>>>>>> Boolean revertVMSnapshot(List<VolumeInfo> volumesBelongToVM,
> >> > >>>>>>>>>>     VMSnapshot vmSnapshot);
> >> > >>>>>>>>>> Boolean deleteVMSnapshot(List<VolumeInfo> volumesBelongToVM,
> >> > >>>>>>>>>>     VMSnapshot vmSnapshot);
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> The workflow will be: createVMSnapshot api ->
> >> > >>>>>>>>>> VMSnapshotManagerImpl: creatVMSnapshot ->
> >> > >>>>>>>>>> VMSnapshotStrategy: takeVMSnapshot ->
> >> > >>>>>>>>>> storage driver: takeVMSnapshot.  In the implementation of
> >> > >>>>>>>>>> VMSnapshotStrategy's takeVMSnapshot, the pseudo code looks
> >> > >>>>>>>>>> like:
> >> > >>>>>>>>>>    HypervisorHelper.quiesceVM(vm);
> >> > >>>>>>>>>>    // group the VM's volumes by the storage driver that owns them
> >> > >>>>>>>>>>    Map<Driver, List<VolumeInfo>> volumesByDriver = new HashMap<>();
> >> > >>>>>>>>>>    for (VolumeInfo volume : vm.getVolumes()) {
> >> > >>>>>>>>>>        List<VolumeInfo> list = volumesByDriver.get(volume.getDriver());
> >> > >>>>>>>>>>        if (list == null) {
> >> > >>>>>>>>>>            list = new ArrayList<>();
> >> > >>>>>>>>>>            volumesByDriver.put(volume.getDriver(), list);
> >> > >>>>>>>>>>        }
> >> > >>>>>>>>>>        list.add(volume);
> >> > >>>>>>>>>>    }
> >> > >>>>>>>>>>    // unquiesce only if every driver says it is safe to do so
> >> > >>>>>>>>>>    boolean needUnquiesce = true;
> >> > >>>>>>>>>>    for (Map.Entry<Driver, List<VolumeInfo>> entry : volumesByDriver.entrySet()) {
> >> > >>>>>>>>>>        needUnquiesce = needUnquiesce
> >> > >>>>>>>>>>            && entry.getKey().takeVMSnapshot(entry.getValue(), vmSnapshot);
> >> > >>>>>>>>>>    }
> >> > >>>>>>>>>>    if (needUnquiesce) {
> >> > >>>>>>>>>>        HypervisorHelper.unquiesceVM(vm);
> >> > >>>>>>>>>>    }
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> By default, quiesceVM in HypervisorHelper will actually
> >> > >>>>>>>>>> take a VM snapshot through the hypervisor.
> >> > >>>>>>>>>> Does the above logic make sense?
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> The pro of option 1 is that it's simple: no need to change
> >> > >>>>>>>>>> the storage driver interfaces. The con is that each storage
> >> > >>>>>>>>>> vendor needs to implement a strategy, and they may end up
> >> > >>>>>>>>>> doing the same thing.
> >> > >>>>>>>>>> The pro of option 2 is that the storage driver won't need to
> >> > >>>>>>>>>> worry about how to quiesce/unquiesce the VM. The con is that
> >> > >>>>>>>>>> it adds these methods to every storage driver, so it assumes
> >> > >>>>>>>>>> this workflow will work for everybody.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> So which option should we take? Or if you have other
> >> > >>>>>>>>>> options, please let us know.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>>
> >> > >>>>>>>>>>
> >> > >>>>>>>>>>
> >> > >>>>>>>>>>
> >> > >>>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>>>>>> --
> >> > >>>>>>>>> *Mike Tutkowski*
> >> > >>>>>>>>> *Senior CloudStack Developer, SolidFire Inc.*
> >> > >>>>>>>>> e: mike.tutkow...@solidfire.com
> >> > >>>>>>>>> o: 303.746.7302
> >> > >>>>>>>>> Advancing the way the world uses the
> >> > >>>>>>>>> cloud<http://solidfire.com/solution/overview/?video=play>
> >> > >>>>>>>>> *(tm)*
> >> > >>>>>>>
> >> > >>>>>
> >> > >>>
> >> > >
> >>
> >
> >
> >
> > --
> > *Mike Tutkowski*
> > *Senior CloudStack Developer, SolidFire Inc.*
> > e: mike.tutkow...@solidfire.com
> > o: 303.746.7302
> > Advancing the way the world uses the
> > cloud<http://solidfire.com/solution/overview/?video=play>
> > *™*
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkow...@solidfire.com
o: 303.746.7302
Advancing the way the world uses the
cloud<http://solidfire.com/solution/overview/?video=play>
*™*
