Hey Ian,

Since the intent of this particular thread is just to discuss RBD and snapshots (which I don't think your business uses), you may be more interested in the "Query on snapshot and cloning for managed storage" thread, which discusses this issue at a more general level.
Talk to you later!

Mike

On Mon, Feb 16, 2015 at 10:42 AM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:

For example, Punith from CloudByte sent out an e-mail yesterday that was very similar to this thread, but he was wondering how to implement such a concept on his company's SAN technology.

On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:

Yeah, I think it's a similar concept, though.

You would want to take snapshots on Ceph (or some other backend system that acts as primary storage) instead of copying data to secondary storage and calling it a snapshot.

For Ceph or any other backend system like that, the idea is to speed up snapshots by not requiring CPU cycles on the front end or network bandwidth to transfer the data.

In that sense, this is a general-purpose CloudStack problem, and it appears you are intending to discuss only the Ceph implementation here, which is fine.

On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <lbarfi...@tqhosting.com> wrote:

Hi Mike,

I think the interest in this issue is primarily for Ceph RBD, which doesn't use iSCSI or SAN concepts in general. As well, I believe RBD is currently only supported in KVM (and VMware?). QEMU has native RBD support, so it attaches the devices directly to the VMs in question. It also natively supports snapshotting, which is what this discussion is about.

Thank You,

Logan Barfield
Tranquil Hosting

On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:

I should have also commented on KVM (since that was the hypervisor called out in the initial e-mail).

In my situation, most of my customers use XenServer and/or ESXi, so KVM has received the fewest of my cycles among those three hypervisors.

KVM, though, is actually the simplest hypervisor for which to implement these changes, since I am using the iSCSI adapter of the KVM agent and it essentially just passes my LUN to the VM in question.

For KVM, there is no clustered file system applied to my backend LUN, so I don't have to "worry" about that layer.

I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such is the case with XenServer) or having to re-signature anything (such is the case with ESXi).
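As a rough illustration of what that direct attach looks like on a KVM host (the portal, IQN, device path and VM name below are made-up examples, not what the agent literally runs):

    # Discover and log in to the iSCSI target on the KVM host (placeholders only)
    iscsiadm -m discovery -t sendtargets -p 10.0.0.5:3260
    iscsiadm -m node -T iqn.2010-01.com.solidfire:lun-1234 -p 10.0.0.5:3260 --login

    # Hand the raw LUN straight to the guest; no clustered file system in between
    virsh attach-disk i-2-34-VM \
        /dev/disk/by-path/ip-10.0.0.5:3260-iscsi-iqn.2010-01.com.solidfire:lun-1234-lun-0 \
        vdb --sourcetype block --cache none --persistent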
On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:

I have been working on this on and off for a while now (as time permits).

Here is an e-mail I sent to a customer of ours that helps describe some of the issues:

*** Beginning of e-mail ***

The main requests were around the following features:

* The ability to leverage SolidFire snapshots.

* The ability to create CloudStack templates from SolidFire snapshots.

I had these on my roadmap, but bumped the priority up and began work on them for the CS 4.6 release.

During design, I realized there were issues with the way XenServer is architected that prevented me from directly using SolidFire snapshots.

I could definitely take a SolidFire snapshot of a SolidFire volume, but this snapshot would not be usable from XenServer if the original volume was still in use.

Here is the gist of the problem:

When XenServer leverages an iSCSI target such as a SolidFire volume, it applies a clustered file system to it, which they call a storage repository (SR). An SR has an *immutable* UUID associated with it.

The virtual volume (which a VM sees as a disk) is represented by a virtual disk image (VDI) in the SR. A VDI also has an *immutable* UUID associated with it.

If I take a snapshot (or a clone) of the SolidFire volume and then later try to use that snapshot from XenServer, XenServer complains that the SR on the snapshot has a UUID that conflicts with an existing UUID.

In other words, it is not possible to use the original SR and the snapshot of this SR from XenServer at the same time, which is critical in a cloud environment (to enable creating templates from snapshots).

The way I have proposed circumventing this issue is not ideal, but it technically works (this code is checked into the CS 4.6 branch):

When the time comes to take a CloudStack snapshot of a CloudStack volume that is backed by SolidFire storage via the storage plug-in, the plug-in will create a new SolidFire volume with characteristics (size and IOPS) equal to those of the original volume.

We then have XenServer attach to this new SolidFire volume, create a *new* SR on it, and then copy the VDI from the source SR to the destination SR (the new SR).

This leads to us having a copy of the VDI (a "snapshot" of sorts), but it requires CPU cycles on the compute cluster as well as network bandwidth to write to the SAN (thus it is slower and more resource intensive than a SolidFire snapshot).

I spoke with Tim Mackey (who works on XenServer at Citrix) concerning this issue before and during the CloudStack Collaboration Conference in Budapest in November. He agreed that this is a legitimate issue with the way XenServer is designed and could not think of a way (other than what I was doing) to get around it in current versions of XenServer.

One thought is to have a feature added to XenServer that enables you to change the UUID of an SR and of a VDI.

If I could do that, then I could take a SolidFire snapshot of the SolidFire volume and issue commands to XenServer to have it change the UUIDs of the original SR and the original VDI. I could then record the necessary UUID info in the CS DB.

*** End of e-mail ***

I have since investigated this on ESXi.

ESXi does have a way for us to "re-signature" a datastore, so backend snapshots can be taken and effectively used on this hypervisor.
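Roughly speaking, the XenServer-side workaround above boils down to something like the following (every UUID, IQN, address and label here is a made-up placeholder rather than what the plug-in literally issues), with the ESXi re-signature path shown for comparison:

    # 1. Attach the newly created SolidFire volume to XenServer as a brand-new SR
    xe sr-create host-uuid=<host-uuid> shared=true type=lvmoiscsi \
        name-label="temp-snapshot-sr" content-type=user \
        device-config:target=10.0.0.5 \
        device-config:targetIQN=iqn.2010-01.com.solidfire:new-volume \
        device-config:SCSIid=<scsi-id>

    # 2. Copy the source VDI into the new SR -- this full copy is the step that
    #    burns hypervisor CPU and SAN bandwidth
    xe vdi-copy uuid=<source-vdi-uuid> sr-uuid=<new-sr-uuid>

    # ESXi, by contrast, can re-signature a copied VMFS datastore so a backend
    # snapshot can be mounted alongside the original volume
    esxcli storage vmfs snapshot list
    esxcli storage vmfs snapshot resignature -l <original-datastore-label>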
On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <lbarfi...@tqhosting.com> wrote:

I'm just going to stick with the qemu-img option change for RBD for now (which should cut snapshot time down drastically) and look forward to this in the future. I'd be happy to help get this moving, but I'm not enough of a developer to lead the charge.

As far as renaming goes, I agree that maybe "backups" isn't the right word. That being said, calling a full-sized copy of a volume a "snapshot" isn't the right word either. Maybe "image" would be better?

I've also got my reservations about "accounts" vs. "users" (I think "departments" and "accounts or users", respectively, would be less confusing), but that's a different thread.

Thank You,

Logan Barfield
Tranquil Hosting

On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <w...@widodh.nl> wrote:

On 16-02-15 15:38, Logan Barfield wrote:

I like this idea a lot for Ceph RBD. I do think there should still be support for copying snapshots to secondary storage as needed (for transfers between zones, etc.). I really think this could be part of a larger move to clarify the naming conventions used for disk operations. Currently, "Volume Snapshots" should probably really be called "Backups". So having "snapshot" functionality, plus a "convert snapshot to backup/template" operation, would be a good move.

Thank You,

Logan Barfield
Tranquil Hosting

I fully agree that this would be a great addition.

I won't be able to work on this any time soon, though.

Wido

On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <andrija.pa...@gmail.com> wrote:

BIG +1

My team should submit some patches to ACS for better KVM snapshots, including whole-VM snapshots, etc., but it's too early to give details...

best
--
Andrija Panić

On 16 February 2015 at 13:01, Andrei Mikhailovsky <and...@arhont.com> wrote:

Hello guys,

I was hoping to get some feedback from the community on the subject of having the ability to keep snapshots on the primary storage where this is supported by the storage backend.

The idea behind this functionality is to improve how snapshots are currently handled on KVM hypervisors with Ceph primary storage. At the moment, snapshots are taken on the primary storage and then copied to the secondary storage. This method is very slow and inefficient even on a small infrastructure, and on medium deployments using snapshots in KVM becomes nearly impossible. If you have tens or hundreds of concurrent snapshots taking place, you get a bunch of timeouts and errors, your network becomes clogged, etc. In addition, using these snapshots to create new volumes or revert VMs is also slow and inefficient; as above, with tens or hundreds of concurrent operations the majority of tasks end in errors or timeouts.

At the moment, taking a single snapshot of a relatively small volume (200GB or 500GB, for instance) takes tens if not hundreds of minutes. Taking a snapshot of the same volume on Ceph primary storage takes a few seconds at most!
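For illustration, the equivalent operations issued directly against the Ceph cluster look roughly like this (pool, image and snapshot names are made up, and this assumes format-2 RBD images, which support layering):

    # Take an instantaneous, copy-on-write snapshot on the primary storage
    rbd snap create cloudstack/volume-1234@manual-snap
    rbd snap ls cloudstack/volume-1234

    # Turning that snapshot into a usable volume is a cheap clone, not a full copy
    rbd snap protect cloudstack/volume-1234@manual-snap
    rbd clone cloudstack/volume-1234@manual-snap cloudstack/volume-from-snap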
Similarly, converting a snapshot into a volume takes tens if not hundreds of minutes when secondary storage is involved, compared with seconds if done directly on the primary storage.

I suggest that CloudStack should have the ability to keep volume snapshots on the primary storage where this is supported by the storage backend, perhaps via a per-primary-storage setting that enables this functionality. This would be beneficial for Ceph primary storage on KVM hypervisors, and perhaps on XenServer when Ceph is supported there in the near future.

This would greatly speed up the process of using snapshots on KVM, and users would actually start using snapshotting rather than giving up in frustration.

I have opened ticket CLOUDSTACK-8256, so please cast your vote if you are in agreement.

Thanks for your input

Andrei

--
Mike Tutkowski
Senior CloudStack Developer, SolidFire Inc.
e: mike.tutkow...@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud <http://solidfire.com/solution/overview/?video=play>™