Well...count me in on the general-purpose part (I'm already working on that and have much of it working).
If someone is interested in implementing the RBD part, he/she can sync with me and see if there is any overlap with work that I've already implemented from a general-purpose standpoint.

On Mon, Feb 16, 2015 at 1:39 PM, Ian Rae <i...@cloudops.com> wrote:

> Agree with Logan. As fans of Ceph as well as SolidFire, we are interested
> in seeing this particular use case (RBD/KVM) being well implemented,
> however the concept of volume snapshots residing only on primary storage vs
> being transferred to secondary storage is a more generally useful one that
> is worth solving with the same terminology and interfaces, even if the
> mechanisms may be specific to the storage type and hypervisor.
>
> If it's not practical then it's not practical, but it seems like it would be
> worth trying.
>
> On Mon, Feb 16, 2015 at 1:02 PM, Logan Barfield <lbarfi...@tqhosting.com>
> wrote:
>
> > Hi Mike,
> >
> > I agree it is a general CloudStack issue that can be addressed across
> > multiple primary storage options. It's a two-stage issue since some
> > changes will need to be implemented to support these features across
> > the board, and others will need to be made to each storage option.
> >
> > It would be nice to see a single issue opened to cover this across all
> > available storage options. Maybe have a community vote on what
> > support they want to see, and not consider the feature complete until
> > all of the desired options are implemented? That would slow down
> > development for sure, but it would ensure that it was supported where
> > it needs to be.
> >
> > Thank You,
> >
> > Logan Barfield
> > Tranquil Hosting
> >
> >
> > On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski
> > <mike.tutkow...@solidfire.com> wrote:
> > > For example, Punith from CloudByte sent out an e-mail yesterday that was
> > > very similar to this thread, but he was wondering how to implement such a
> > > concept on his company's SAN technology.
> > >
> > > On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> > > mike.tutkow...@solidfire.com> wrote:
> > >
> > >> Yeah, I think it's a similar concept, though.
> > >>
> > >> You would want to take snapshots on Ceph (or some other backend system
> > >> that acts as primary storage) instead of copying data to secondary storage
> > >> and calling it a snapshot.
> > >>
> > >> For Ceph or any other backend system like that, the idea is to speed up
> > >> snapshots by not requiring CPU cycles on the front end or network bandwidth
> > >> to transfer the data.
> > >>
> > >> In that sense, this is a general-purpose CloudStack problem and it appears
> > >> you intend to discuss only the Ceph implementation here, which is
> > >> fine.
> > >>
> > >> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <lbarfi...@tqhosting.com>
> > >> wrote:
> > >>
> > >>> Hi Mike,
> > >>>
> > >>> I think the interest in this issue is primarily for Ceph RBD, which
> > >>> doesn't use iSCSI or SAN concepts in general. As well I believe RBD
> > >>> is only currently supported in KVM (and VMware?). QEMU has native RBD
> > >>> support, so it attaches the devices directly to the VMs in question.
> > >>> It also natively supports snapshotting, which is what this discussion
> > >>> is about.
> > >>>
> > >>> Thank You,
> > >>>
> > >>> Logan Barfield
> > >>> Tranquil Hosting
> > >>>
> > >>>
> > >>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
> > >>> <mike.tutkow...@solidfire.com> wrote:
> > >>> > I should have also commented on KVM (since that was the hypervisor called
> > >>> > out in the initial e-mail).
> > >>> >
> > >>> > In my situation, most of my customers use XenServer and/or ESXi, so KVM has
> > >>> > received the fewest of my cycles with regard to those three hypervisors.
> > >>> >
> > >>> > KVM, though, is actually the simplest hypervisor for which to implement
> > >>> > these changes (since I am using the iSCSI adapter of the KVM agent and it
> > >>> > just essentially passes my LUN to the VM in question).
> > >>> >
> > >>> > For KVM, there is no clustered file system applied to my backend LUN, so I
> > >>> > don't have to "worry" about that layer.
> > >>> >
> > >>> > I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such is the
> > >>> > case with XenServer) or having to re-signature anything (such is the case
> > >>> > with ESXi).
> > >>> >
> > >>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
> > >>> > mike.tutkow...@solidfire.com> wrote:
> > >>> >
> > >>> >> I have been working on this on and off for a while now (as time permits).
> > >>> >>
> > >>> >> Here is an e-mail I sent to a customer of ours that helps describe some of
> > >>> >> the issues:
> > >>> >>
> > >>> >> *** Beginning of e-mail ***
> > >>> >>
> > >>> >> The main requests were around the following features:
> > >>> >>
> > >>> >> * The ability to leverage SolidFire snapshots.
> > >>> >>
> > >>> >> * The ability to create CloudStack templates from SolidFire snapshots.
> > >>> >>
> > >>> >> I had these on my roadmap, but bumped the priority up and began work on
> > >>> >> them for the CS 4.6 release.
> > >>> >>
> > >>> >> During design, I realized there were issues with the way XenServer is
> > >>> >> architected that prevented me from directly using SolidFire snapshots.
> > >>> >>
> > >>> >> I could definitely take a SolidFire snapshot of a SolidFire volume, but
> > >>> >> this snapshot would not be usable from XenServer if the original volume was
> > >>> >> still in use.
> > >>> >>
> > >>> >> Here is the gist of the problem:
> > >>> >>
> > >>> >> When XenServer leverages an iSCSI target such as a SolidFire volume, it
> > >>> >> applies a clustered file system to it, which they call a storage
> > >>> >> repository (SR). An SR has an *immutable* UUID associated with it.
> > >>> >>
> > >>> >> The virtual volume (which a VM sees as a disk) is represented by a virtual
> > >>> >> disk image (VDI) in the SR. A VDI also has an *immutable* UUID associated
> > >>> >> with it.
> > >>> >>
> > >>> >> If I take a snapshot (or a clone) of the SolidFire volume and then later
> > >>> >> try to use that snapshot from XenServer, XenServer complains that the SR on
> > >>> >> the snapshot has a UUID that conflicts with an existing UUID.
> > >>> >>
> > >>> >> In other words, it is not possible to use the original SR and the snapshot
> > >>> >> of this SR from XenServer at the same time, which is critical in a cloud
> > >>> >> environment (to enable creating templates from snapshots).
> > >>> >>
> > >>> >> The way I have proposed circumventing this issue is not ideal, but
> > >>> >> technically works (this code is checked into the CS 4.6 branch):
> > >>> >>
> > >>> >> When the time comes to take a CloudStack snapshot of a CloudStack volume
> > >>> >> that is backed by SolidFire storage via the storage plug-in, the plug-in
> > >>> >> will create a new SolidFire volume with characteristics (size and IOPS)
> > >>> >> equal to those of the original volume.
> > >>> >>
> > >>> >> We then have XenServer attach to this new SolidFire volume, create a *new*
> > >>> >> SR on it, and then copy the VDI from the source SR to the destination SR
> > >>> >> (the new SR).
> > >>> >>
> > >>> >> This leads to us having a copy of the VDI (a "snapshot" of sorts), but it
> > >>> >> requires CPU cycles on the compute cluster as well as network bandwidth to
> > >>> >> write to the SAN (thus it is slower and more resource intensive than a
> > >>> >> SolidFire snapshot).
> > >>> >>
> > >>> >> I spoke with Tim Mackey (who works on XenServer at Citrix) concerning this
> > >>> >> issue before and during the CloudStack Collaboration Conference in Budapest
> > >>> >> in November. He agreed that this is a legitimate issue with the way
> > >>> >> XenServer is designed and could not think of a way (other than what I was
> > >>> >> doing) to get around it in current versions of XenServer.
> > >>> >>
> > >>> >> One thought is to have a feature added to XenServer that enables you to
> > >>> >> change the UUID of an SR and of a VDI.
> > >>> >>
> > >>> >> If I could do that, then I could take a SolidFire snapshot of the
> > >>> >> SolidFire volume and issue commands to XenServer to have it change the
> > >>> >> UUIDs of the original SR and the original VDI. I could then record the
> > >>> >> necessary UUID info in the CS DB.
> > >>> >>
> > >>> >> *** End of e-mail ***
> > >>> >>
> > >>> >> I have since investigated this on ESXi.
> > >>> >>
> > >>> >> ESXi does have a way for us to "re-signature" a datastore, so backend
> > >>> >> snapshots can be taken and effectively used on this hypervisor.
> > >>> >>
> > >>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <lbarfi...@tqhosting.com>
> > >>> >> wrote:
> > >>> >>
> > >>> >>> I'm just going to stick with the qemu-img option change for RBD for
> > >>> >>> now (which should cut snapshot time down drastically), and look
> > >>> >>> forward to this in the future. I'd be happy to help get this moving,
> > >>> >>> but I'm not enough of a developer to lead the charge.
> > >>> >>>
> > >>> >>> As far as renaming goes, I agree that maybe backups isn't the right
> > >>> >>> word. That being said calling a full-sized copy of a volume a
> > >>> >>> "snapshot" also isn't the right word. Maybe "image" would be better?
> > >>> >>>
> > >>> >>> I've also got my reservations about "accounts" vs "users" (I think
> > >>> >>> "departments" and "accounts or users" respectively is less confusing),
> > >>> >>> but that's a different thread.
> > >>> >>>
> > >>> >>> Thank You,
> > >>> >>>
> > >>> >>> Logan Barfield
> > >>> >>> Tranquil Hosting
> > >>> >>>
> > >>> >>>
> > >>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <w...@widodh.nl>
> > >>> >>> wrote:
> > >>> >>> >
> > >>> >>> >
> > >>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
> > >>> >>> >> I like this idea a lot for Ceph RBD. I do think there should still be
> > >>> >>> >> support for copying snapshots to secondary storage as needed (for
> > >>> >>> >> transfers between zones, etc.). I really think that this could be
> > >>> >>> >> part of a larger move to clarify the naming conventions used for disk
> > >>> >>> >> operations. Currently "Volume Snapshots" should probably really be
> > >>> >>> >> called "Backups". So having "snapshot" functionality, and a "convert
> > >>> >>> >> snapshot to backup/template" would be a good move.
> > >>> >>> >>
> > >>> >>> >
> > >>> >>> > I fully agree that this would be a very great addition.
> > >>> >>> >
> > >>> >>> > I won't be able to work on this any time soon though.
> > >>> >>> >
> > >>> >>> > Wido
> > >>> >>> >
> > >>> >>> >> Thank You,
> > >>> >>> >>
> > >>> >>> >> Logan Barfield
> > >>> >>> >> Tranquil Hosting
> > >>> >>> >>
> > >>> >>> >>
> > >>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <andrija.pa...@gmail.com> wrote:
> > >>> >>> >>> BIG +1
> > >>> >>> >>>
> > >>> >>> >>> My team should submit some patch to ACS for better KVM snapshots, including
> > >>> >>> >>> whole VM snapshot etc...but it's too early to give details...
> > >>> >>> >>> best
> > >>> >>> >>>
> > >>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <and...@arhont.com> wrote:
> > >>> >>> >>>
> > >>> >>> >>>> Hello guys,
> > >>> >>> >>>>
> > >>> >>> >>>> I was hoping to have some feedback from the community on the subject of
> > >>> >>> >>>> having an ability to keep snapshots on the primary storage where it is
> > >>> >>> >>>> supported by the storage backend.
> > >>> >>> >>>>
> > >>> >>> >>>> The idea behind this functionality is to improve how snapshots are
> > >>> >>> >>>> currently handled on KVM hypervisors with Ceph primary storage. At the
> > >>> >>> >>>> moment, the snapshots are taken on the primary storage and then copied to
> > >>> >>> >>>> the secondary storage. This method is very slow and inefficient even on
> > >>> >>> >>>> small infrastructure. Even on medium deployments using snapshots in KVM
> > >>> >>> >>>> becomes nearly impossible. If you have tens or hundreds of concurrent
> > >>> >>> >>>> snapshots taking place you will have a bunch of timeouts and errors, your
> > >>> >>> >>>> network becomes clogged, etc. In addition, using these snapshots for
> > >>> >>> >>>> creating new volumes or reverting VMs is also slow and inefficient.
> > >>> >>> >>>> As above, when you have tens or hundreds of concurrent operations they will
> > >>> >>> >>>> not succeed and you will have a majority of tasks with errors or timeouts.
> > >>> >>> >>>>
> > >>> >>> >>>> At the moment, taking a single snapshot of relatively small volumes (200GB
> > >>> >>> >>>> or 500GB for instance) takes tens if not hundreds of minutes. Taking a
> > >>> >>> >>>> snapshot of the same volume on Ceph primary storage takes a few seconds at
> > >>> >>> >>>> most! Similarly, converting a snapshot to a volume takes tens if not
> > >>> >>> >>>> hundreds of minutes when secondary storage is involved, compared with
> > >>> >>> >>>> seconds if done directly on the primary storage.
> > >>> >>> >>>>
> > >>> >>> >>>> I suggest that CloudStack should have the ability to keep volume
> > >>> >>> >>>> snapshots on the primary storage where this is supported by the storage.
> > >>> >>> >>>> Perhaps there could be a per-primary-storage setting that enables this
> > >>> >>> >>>> functionality. This will be beneficial for Ceph primary storage on KVM
> > >>> >>> >>>> hypervisors and perhaps on XenServer when Ceph is supported in the near
> > >>> >>> >>>> future.
> > >>> >>> >>>>
> > >>> >>> >>>> This will greatly speed up the process of using snapshots on KVM and users
> > >>> >>> >>>> will actually start using snapshotting rather than giving up in
> > >>> >>> >>>> frustration.
> > >>> >>> >>>>
> > >>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your vote if you
> > >>> >>> >>>> are in agreement.
> > >>> >>> >>>>
> > >>> >>> >>>> Thanks for your input
> > >>> >>> >>>>
> > >>> >>> >>>> Andrei
> > >>> >>> >>>>
> > >>> >>> >>>
> > >>> >>> >>>
> > >>> >>> >>> --
> > >>> >>> >>>
> > >>> >>> >>> Andrija Panić
> > >>> >>>
> > >>> >>
> > >>> >>
> > >>> >> --
> > >>> >> *Mike Tutkowski*
> > >>> >> *Senior CloudStack Developer, SolidFire Inc.*
> > >>> >> e: mike.tutkow...@solidfire.com
> > >>> >> o: 303.746.7302
> > >>> >> Advancing the way the world uses the cloud
> > >>> >> <http://solidfire.com/solution/overview/?video=play>*™*
> > >>> >>
> > >>> >
> > >>> >
> > >>> > --
> > >>> > *Mike Tutkowski*
> > >>> > *Senior CloudStack Developer, SolidFire Inc.*
> > >>> > e: mike.tutkow...@solidfire.com
> > >>> > o: 303.746.7302
> > >>> > Advancing the way the world uses the cloud
> > >>> > <http://solidfire.com/solution/overview/?video=play>*™*
> > >>>
> > >>
> > >>
> > >> --
> > >> *Mike Tutkowski*
> > >> *Senior CloudStack Developer, SolidFire Inc.*
> > >> e: mike.tutkow...@solidfire.com
> > >> o: 303.746.7302
> > >> Advancing the way the world uses the cloud
> > >> <http://solidfire.com/solution/overview/?video=play>*™*
> > >>
> > >
> > >
> > > --
> > > *Mike Tutkowski*
> > > *Senior CloudStack Developer, SolidFire Inc.*
> > > e: mike.tutkow...@solidfire.com
> > > o: 303.746.7302
> > > Advancing the way the world uses the cloud
> > > <http://solidfire.com/solution/overview/?video=play>*™*
> >
>
>
> --
> *Ian Rae*
> PDG *| *CEO
> t *514.944.4008*
>
> *CloudOps* Votre partenaire infonuagique* | *Cloud Solutions Experts
> w cloudops.com <http://www.cloudops.com/> *|* 420 rue Guy *|* Montreal *|*
> Quebec *|* H3J 1S6
>
> <https://www.cloud.ca/>
> <http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/>
>

--
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkow...@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*
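
To make the general-purpose idea from this thread a bit more concrete, here is a rough sketch of the decision being discussed: keep the snapshot on primary storage when the backend supports it and a per-primary-storage setting allows it, and still allow a copy to secondary storage when one is needed (for transfers between zones, for example). This is only an illustration. `PrimaryStorage`, `keep_snapshots_on_primary`, and `plan_snapshot` are hypothetical names invented for the example, not CloudStack classes or settings.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PrimaryStorage:
    """Hypothetical description of a primary storage pool (not a CloudStack class)."""
    name: str
    backend_snapshots: bool          # backend can snapshot natively (e.g. Ceph RBD)
    keep_snapshots_on_primary: bool  # hypothetical per-primary-storage setting


def plan_snapshot(pool: PrimaryStorage, needs_secondary_copy: bool = False) -> List[str]:
    """Return the ordered steps a snapshot request would go through."""
    if pool.backend_snapshots and pool.keep_snapshots_on_primary:
        # Fast path: the snapshot stays on the backend, so no hypervisor CPU
        # cycles or network bandwidth are spent moving data.
        steps = ["take backend snapshot on primary storage"]
        if needs_secondary_copy:
            # Optional export, e.g. for transfers between zones.
            steps.append("copy snapshot to secondary storage")
        return steps
    # Current behaviour described in the thread: snapshot, then always copy.
    return [
        "take snapshot on primary storage",
        "copy snapshot to secondary storage",
    ]


if __name__ == "__main__":
    ceph = PrimaryStorage("ceph-rbd", backend_snapshots=True, keep_snapshots_on_primary=True)
    nfs = PrimaryStorage("nfs", backend_snapshots=False, keep_snapshots_on_primary=False)
    for pool in (ceph, nfs):
        print(pool.name, "->", plan_snapshot(pool))
```

Hypervisor-specific work, such as the new-SR copy on XenServer or re-signaturing a datastore on ESXi, would sit behind the backend snapshot step and is left out of the sketch.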