Hey Ian,

Since the intent of this particular thread seems to be to discuss
RBD and snapshots (which I don't think your business uses), you might be
more interested in the "Query on snapshot and cloning for managed storage"
thread, as that one discusses this issue at a more general level.

Talk to you later!
Mike

On Mon, Feb 16, 2015 at 10:42 AM, Mike Tutkowski <
mike.tutkow...@solidfire.com> wrote:

> For example, Punith from CloudByte sent out an e-mail yesterday that was
> very similar to this thread, but he was wondering how to implement such a
> concept on his company's SAN technology.
>
> On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> mike.tutkow...@solidfire.com> wrote:
>
>> Yeah, I think it's a similar concept, though.
>>
>> You would want to take snapshots on Ceph (or some other backend system
>> that acts as primary storage) instead of copying data to secondary storage
>> and calling it a snapshot.
>>
>> For Ceph or any other backend system like that, the idea is to speed up
>> snapshots by not requiring CPU cycles on the front end or network bandwidth
>> to transfer the data.
>>
>> In that sense, this is a general-purpose CloudStack problem, and it
>> appears you intend to discuss only the Ceph implementation here,
>> which is fine.
>>
>> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <lbarfi...@tqhosting.com
>> > wrote:
>>
>>> Hi Mike,
>>>
>>> I think the interest in this issue is primarily for Ceph RBD, which
>>> doesn't use iSCSI or SAN concepts in general.  Also, I believe RBD is
>>> currently only supported in KVM (and VMware?).  QEMU has native RBD
>>> support, so it attaches the devices directly to the VMs in question.
>>> It also natively supports snapshotting, which is what this discussion
>>> is about.
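>>>
>>> (As a rough illustration only, not a claim about how CloudStack wires
>>> this up: with native RBD support the guest disk ends up on the QEMU
>>> command line as an RBD image, along the lines of
>>>
>>>   -drive file=rbd:rbd-pool/vm-volume-1,format=raw,if=virtio
>>>
>>> where the pool and image names are placeholders. Since QEMU talks to
>>> the cluster directly, snapshots can then be taken inside Ceph rather
>>> than on the hypervisor.)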
>>>
>>> Thank You,
>>>
>>> Logan Barfield
>>> Tranquil Hosting
>>>
>>>
>>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
>>> <mike.tutkow...@solidfire.com> wrote:
>>> > I should have also commented on KVM (since that was the hypervisor
>>> > called out in the initial e-mail).
>>> >
>>> > In my situation, most of my customers use XenServer and/or ESXi, so
>>> > KVM has received the fewest of my cycles with regard to those three
>>> > hypervisors.
>>> >
>>> > KVM, though, is actually the simplest hypervisor for which to
>>> > implement these changes (since I am using the iSCSI adapter of the
>>> > KVM agent and it just essentially passes my LUN to the VM in
>>> > question).
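>>> >
>>> > (Purely as an illustration of that passthrough, and assuming the LUN
>>> > shows up on the host as a by-path block device, attaching it to a
>>> > guest boils down to something roughly like:
>>> >
>>> >   virsh attach-disk <vm-name> \
>>> >     /dev/disk/by-path/ip-<san-ip>:3260-iscsi-<target-iqn>-lun-0 \
>>> >     vdb --live
>>> >
>>> > where the device path, target name, and device letter are all
>>> > placeholders, not what the agent literally runs.)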
>>> >
>>> > For KVM, there is no clustered file system applied to my backend
>>> > LUN, so I don't have to "worry" about that layer.
>>> >
>>> > I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such
>>> > is the case with XenServer) or having to re-signature anything (such
>>> > is the case with ESXi).
>>> >
>>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
>>> > mike.tutkow...@solidfire.com> wrote:
>>> >
>>> >> I have been working on this on and off for a while now (as time
>>> >> permits).
>>> >>
>>> >> Here is an e-mail I sent to a customer of ours that helps describe
>>> >> some of the issues:
>>> >>
>>> >> *** Beginning of e-mail ***
>>> >>
>>> >> The main requests were around the following features:
>>> >>
>>> >> * The ability to leverage SolidFire snapshots.
>>> >>
>>> >> * The ability to create CloudStack templates from SolidFire snapshots.
>>> >>
>>> >> I had these on my roadmap, but bumped the priority up and began
>>> >> work on them for the CS 4.6 release.
>>> >>
>>> >> During design, I realized there were issues with the way XenServer is
>>> >> architected that prevented me from directly using SolidFire snapshots.
>>> >>
>>> >> I could definitely take a SolidFire snapshot of a SolidFire volume,
>>> >> but this snapshot would not be usable from XenServer if the original
>>> >> volume was still in use.
>>> >>
>>> >> Here is the gist of the problem:
>>> >>
>>> >> When XenServer leverages an iSCSI target such as a SolidFire volume,
>>> >> it applies a clustered file system to it, which it calls a storage
>>> >> repository (SR). An SR has an *immutable* UUID associated with it.
>>> >>
>>> >> The virtual volume (which a VM sees as a disk) is represented by a
>>> >> virtual disk image (VDI) in the SR. A VDI also has an *immutable*
>>> >> UUID associated with it.
>>> >>
>>> >> If I take a snapshot (or a clone) of the SolidFire volume and then
>>> >> later try to use that snapshot from XenServer, XenServer complains
>>> >> that the SR on the snapshot has a UUID that conflicts with an
>>> >> existing UUID.
>>> >>
>>> >> In other words, it is not possible to use the original SR and the
>>> >> snapshot of this SR from XenServer at the same time, which is
>>> >> critical in a cloud environment (to enable creating templates from
>>> >> snapshots).
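>>> >>
>>> >> (As a rough illustration, assuming an iSCSI-backed SR, the UUIDs in
>>> >> question are the ones XenServer reports via commands such as:
>>> >>
>>> >>   xe sr-list params=uuid,name-label
>>> >>   xe vdi-list sr-uuid=<sr-uuid> params=uuid,name-label
>>> >>
>>> >> Because those UUIDs are stored in the SR's metadata on the LUN
>>> >> itself, a block-level snapshot of the LUN carries the very same
>>> >> UUIDs, which is where the conflict comes from.)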
>>> >>
>>> >> The way I have proposed circumventing this issue is not ideal, but
>>> >> technically works (this code is checked into the CS 4.6 branch):
>>> >>
>>> >> When the time comes to take a CloudStack snapshot of a CloudStack
>>> >> volume that is backed by SolidFire storage via the storage plug-in,
>>> >> the plug-in will create a new SolidFire volume with characteristics
>>> >> (size and IOPS) equal to those of the original volume.
>>> >>
>>> >> We then have XenServer attach to this new SolidFire volume, create a
>>> >> *new* SR on it, and then copy the VDI from the source SR to the
>>> >> destination SR (the new SR).
>>> >>
>>> >> This leads to us having a copy of the VDI (a "snapshot" of sorts),
>>> >> but it requires CPU cycles on the compute cluster as well as network
>>> >> bandwidth to write to the SAN (thus it is slower and more resource
>>> >> intensive than a SolidFire snapshot).
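>>> >>
>>> >> (At the xe CLI level this is conceptually similar to the following,
>>> >> where every device-config value and UUID is a placeholder rather
>>> >> than what the plug-in literally issues:
>>> >>
>>> >>   xe sr-create type=lvmoiscsi name-label=snapshot-sr shared=true \
>>> >>     device-config:target=<san-ip> device-config:targetIQN=<iqn> \
>>> >>     device-config:SCSIid=<scsi-id>
>>> >>   xe vdi-copy uuid=<source-vdi-uuid> sr-uuid=<new-sr-uuid>
>>> >>
>>> >> The vdi-copy step is what burns hypervisor CPU and network
>>> >> bandwidth, since the data is read back from the source LUN and
>>> >> written out to the new one.)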
>>> >>
>>> >> I spoke with Tim Mackey (who works on XenServer at Citrix)
>>> >> concerning this issue before and during the CloudStack Collaboration
>>> >> Conference in Budapest in November. He agreed that this is a
>>> >> legitimate issue with the way XenServer is designed and could not
>>> >> think of a way (other than what I was doing) to get around it in
>>> >> current versions of XenServer.
>>> >>
>>> >> One thought is to have a feature added to XenServer that enables
>>> >> you to change the UUID of an SR and of a VDI.
>>> >>
>>> >> If I could do that, then I could take a SolidFire snapshot of the
>>> >> SolidFire volume and issue commands to XenServer to have it change
>>> >> the UUIDs of the original SR and the original VDI. I could then
>>> >> record the necessary UUID info in the CS DB.
>>> >>
>>> >> *** End of e-mail ***
>>> >>
>>> >> I have since investigated this on ESXi.
>>> >>
>>> >> ESXi does have a way for us to "re-signature" a datastore, so backend
>>> >> snapshots can be taken and effectively used on this hypervisor.
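>>> >>
>>> >> (For reference, and not as the exact workflow the plug-in uses: a
>>> >> VMFS copy that ESXi detects as a snapshot can be listed and
>>> >> re-signatured from the ESXi shell roughly like this:
>>> >>
>>> >>   esxcli storage vmfs snapshot list
>>> >>   esxcli storage vmfs snapshot resignature -l <datastore-label>
>>> >>
>>> >> after which the copy mounts as a new datastore alongside the
>>> >> original.)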
>>> >>
>>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield
>>> >> <lbarfi...@tqhosting.com> wrote:
>>> >>
>>> >>> I'm just going to stick with the qemu-img option change for RBD for
>>> >>> now (which should cut snapshot time down drastically), and look
>>> >>> forward to this in the future.  I'd be happy to help get this moving,
>>> >>> but I'm not enough of a developer to lead the charge.
>>> >>>
>>> >>> As far as renaming goes, I agree that maybe backups isn't the right
>>> >>> word.  That being said, calling a full-sized copy of a volume a
>>> >>> "snapshot" isn't the right word either.  Maybe "image" would be
>>> >>> better?
>>> >>>
>>> >>> I've also got my reservations about "accounts" vs "users" (I think
>>> >>> "departments" and "accounts or users" respectively is less
>>> >>> confusing), but that's a different thread.
>>> >>>
>>> >>> Thank You,
>>> >>>
>>> >>> Logan Barfield
>>> >>> Tranquil Hosting
>>> >>>
>>> >>>
>>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander
>>> >>> <w...@widodh.nl> wrote:
>>> >>> >
>>> >>> >
>>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
>>> >>> >> I like this idea a lot for Ceph RBD.  I do think there should
>>> >>> >> still be support for copying snapshots to secondary storage as
>>> >>> >> needed (for transfers between zones, etc.).  I really think that
>>> >>> >> this could be part of a larger move to clarify the naming
>>> >>> >> conventions used for disk operations.  Currently "Volume
>>> >>> >> Snapshots" should probably really be called "Backups".  So having
>>> >>> >> "snapshot" functionality and a "convert snapshot to
>>> >>> >> backup/template" operation would be a good move.
>>> >>> >>
>>> >>> >
>>> >>> > I fully agree that this would be a great addition.
>>> >>> >
>>> >>> > I won't be able to work on this any time soon though.
>>> >>> >
>>> >>> > Wido
>>> >>> >
>>> >>> >> Thank You,
>>> >>> >>
>>> >>> >> Logan Barfield
>>> >>> >> Tranquil Hosting
>>> >>> >>
>>> >>> >>
>>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic
>>> >>> >> <andrija.pa...@gmail.com> wrote:
>>> >>> >>> BIG +1
>>> >>> >>>
>>> >>> >>> My team should be submitting a patch to ACS for better KVM
>>> >>> >>> snapshots, including whole-VM snapshots, etc., but it's too
>>> >>> >>> early to give details...
>>> >>> >>> best
>>> >>> >>>
>>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky
>>> >>> >>> <and...@arhont.com> wrote:
>>> >>> >>>
>>> >>> >>>> Hello guys,
>>> >>> >>>>
>>> >>> >>>> I was hoping to get some feedback from the community on the
>>> >>> >>>> subject of having the ability to keep snapshots on the primary
>>> >>> >>>> storage where that is supported by the storage backend.
>>> >>> >>>>
>>> >>> >>>> The idea behind this functionality is to improve how snapshots
>>> >>> >>>> are currently handled on KVM hypervisors with Ceph primary
>>> >>> >>>> storage. At the moment, snapshots are taken on the primary
>>> >>> >>>> storage and then copied to the secondary storage. This method
>>> >>> >>>> is very slow and inefficient even on a small infrastructure. On
>>> >>> >>>> medium deployments, using snapshots in KVM becomes nearly
>>> >>> >>>> impossible. If you have tens or hundreds of concurrent
>>> >>> >>>> snapshots taking place, you will see a bunch of timeouts and
>>> >>> >>>> errors, your network becomes clogged, etc. In addition, using
>>> >>> >>>> these snapshots to create new volumes or revert VMs is also
>>> >>> >>>> slow and inefficient. As above, when you have tens or hundreds
>>> >>> >>>> of concurrent operations, the majority of tasks will fail with
>>> >>> >>>> errors or timeouts.
>>> >>> >>>>
>>> >>> >>>> At the moment, taking a single snapshot of a relatively small
>>> >>> >>>> volume (200GB or 500GB, for instance) takes tens if not
>>> >>> >>>> hundreds of minutes. Taking a snapshot of the same volume on
>>> >>> >>>> Ceph primary storage takes a few seconds at most! Similarly,
>>> >>> >>>> converting a snapshot to a volume takes tens if not hundreds of
>>> >>> >>>> minutes when secondary storage is involved, compared with
>>> >>> >>>> seconds if done directly on the primary storage.
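>>> >>> >>>>
>>> >>> >>>> (To make that concrete, a minimal sketch of the equivalent
>>> >>> >>>> operations on the Ceph side, with placeholder pool and image
>>> >>> >>>> names:
>>> >>> >>>>
>>> >>> >>>>   rbd snap create cloudstack/volume-1234@snap-1
>>> >>> >>>>   rbd snap protect cloudstack/volume-1234@snap-1
>>> >>> >>>>   rbd clone cloudstack/volume-1234@snap-1 cloudstack/volume-5678
>>> >>> >>>>
>>> >>> >>>> Both the snapshot and the clone are copy-on-write within the
>>> >>> >>>> cluster, which is why they complete in seconds instead of
>>> >>> >>>> streaming the whole volume to secondary storage.)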
>>> >>> >>>>
>>> >>> >>>> I suggest that CloudStack should have the ability to keep
>>> >>> >>>> volume snapshots on the primary storage where this is supported
>>> >>> >>>> by the storage. Perhaps a per-primary-storage setting could
>>> >>> >>>> enable this functionality. This will be beneficial for Ceph
>>> >>> >>>> primary storage on KVM hypervisors, and perhaps on XenServer
>>> >>> >>>> once Ceph is supported there in the near future.
>>> >>> >>>>
>>> >>> >>>> This will greatly speed up the process of using snapshots on
>>> >>> >>>> KVM, and users will actually start using snapshotting rather
>>> >>> >>>> than giving up in frustration.
>>> >>> >>>>
>>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your
>>> >>> >>>> vote if you are in agreement.
>>> >>> >>>>
>>> >>> >>>> Thanks for your input
>>> >>> >>>>
>>> >>> >>>> Andrei
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>
>>> >>> >>>
>>> >>> >>> --
>>> >>> >>>
>>> >>> >>> Andrija Panić
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> *Mike Tutkowski*
>>> >> *Senior CloudStack Developer, SolidFire Inc.*
>>> >> e: mike.tutkow...@solidfire.com
>>> >> o: 303.746.7302
>>> >> Advancing the way the world uses the cloud
>>> >> <http://solidfire.com/solution/overview/?video=play>*™*
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > *Mike Tutkowski*
>>> > *Senior CloudStack Developer, SolidFire Inc.*
>>> > e: mike.tutkow...@solidfire.com
>>> > o: 303.746.7302
>>> > Advancing the way the world uses the cloud
>>> > <http://solidfire.com/solution/overview/?video=play>*™*
>>>
>>
>>
>>
>> --
>> *Mike Tutkowski*
>> *Senior CloudStack Developer, SolidFire Inc.*
>> e: mike.tutkow...@solidfire.com
>> o: 303.746.7302
>> Advancing the way the world uses the cloud
>> <http://solidfire.com/solution/overview/?video=play>*™*
>>
>
>
>
> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: mike.tutkow...@solidfire.com
> o: 303.746.7302
> Advancing the way the world uses the cloud
> <http://solidfire.com/solution/overview/?video=play>*™*
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkow...@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*
