>>> My question is, would it make sense to add to the current mechanisms in
>>> Nova and Cinder rather than add the complexity of a new project?
>
> I think the answer is yes :)
I meant there is a clear need for the Raksha project. :)

Thanks,
Murali Balcha

On Aug 29, 2013, at 7:45 PM, "Murali Balcha" <murali.bal...@triliodata.com> wrote:
>
> ________________________________________
>>> From: Ronen Kat <ronen...@il.ibm.com>
>>> Sent: Thursday, August 29, 2013 2:55 PM
>>> To: openstack-dev@lists.openstack.org; openstack-...@lists.launchpad.net
>>> Subject: Re: [openstack-dev] Proposal for Raksha, a Data Protection As a
>>> Service project
>
>>> Hi Murali,
>
>>> I think the idea to provide enhanced data protection in OpenStack is a
>>> great idea, and I have been thinking about backup in OpenStack for a
>>> while now. I am just not sure a new project is the only way to do it.
>
>>> (As disclosure, I contributed code to enable IBM TSM as a Cinder backup
>>> driver.)
>
> Hi Kat,
> Consider the following use cases that Raksha will address. I will go from
> the simple to the complex use case and then address your specific questions
> with inline comments.
> 1. VM1, created on the local file system, with a Cinder volume attached
> 2. VM2, booted from a Cinder volume, with a couple of Cinder volumes
>    attached
> 3. VM1 and VM2, both booted from Cinder volumes, each with a couple of
>    volumes attached. They also share a private network for internal
>    communication.
> In all these cases Raksha will take a consistent snapshot of the VMs, walk
> through each VM's resources, and back up those resources to a Swift
> endpoint.
> In case 1, that means backing up the VM image and the Cinder volume image
> to Swift. Case 2 is an extension of case 1. In case 3, Raksha backs up not
> only VM1 and VM2 and their associated resources but also the network
> configuration.
>
> Now let's consider the restore case. The restore operation walks through
> the backed-up resources and calls into the respective OpenStack services to
> restore those objects.
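The backup walk described above (take a consistent snapshot, enumerate each VM's resources, push each one to a Swift endpoint) can be sketched in a few lines. Everything below, including the `SwiftEndpoint` class and the `backup_vm` helper, is a hypothetical stand-in for illustration only, not the Raksha or OpenStack client API:

```python
# Toy sketch of the per-VM backup walk: one operation covers the VM image,
# every attached volume, and the network configuration (case 3 above).

class SwiftEndpoint:
    """Hypothetical object store standing in for Swift."""
    def __init__(self):
        self.objects = {}

    def upload(self, container, name, data):
        self.objects[(container, name)] = data


def backup_vm(vm, swift, container):
    """Back up a VM and everything attached to it in one pass."""
    manifest = []
    # 1. VM image (in a real system: a consistent hypervisor snapshot)
    swift.upload(container, vm["id"] + "/image", vm["image"])
    manifest.append("image")
    # 2. Every attached Cinder volume, not just one
    for vol_id, data in vm["volumes"].items():
        swift.upload(container, vm["id"] + "/volume/" + vol_id, data)
        manifest.append("volume/" + vol_id)
    # 3. Network configuration, so restore can rebuild the private network
    for net in vm.get("networks", []):
        swift.upload(container, vm["id"] + "/net/" + net["name"], net)
        manifest.append("net/" + net["name"])
    return manifest


vm1 = {"id": "vm1", "image": b"qcow2-bytes",
       "volumes": {"vol-a": b"data-a"}, "networks": [{"name": "priv0"}]}
swift = SwiftEndpoint()
print(backup_vm(vm1, swift, "backups"))
# -> ['image', 'volume/vol-a', 'net/priv0']
```

The manifest is what makes the restore side tractable: it records, per VM, exactly which Swift objects belong together.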
> In case 1, it first calls the Nova API to restore the VM, then calls into
> Cinder to restore the volume and attach it to the newly restored VM
> instance. In case 3, it also calls into the Neutron API to restore the
> networking. Hence my argument is that no single OpenStack project has a
> global view of a VM and all its resources, which is what an effective
> backup and restore service requires.
>
>>> I wonder what is the added value of a project approach versus
>>> enhancements to the current Nova and Cinder implementations of backup.
>>> Let me elaborate.
>
>>> Nova has a "nova backup" feature that performs a backup of a VM to
>>> Glance; the backup is managed by tenants in the same way that you
>>> propose. While today it provides only point-in-time full backup, it seems
>>> reasonable that it can be extended to support incremental and consistent
>>> backup as well, as the actual work is done either by the storage or the
>>> hypervisor in any case.
>
> Though Nova has an API to upload a snapshot of the VM to Glance, it does
> not snapshot any volumes associated with the VM. When a snapshot is
> uploaded to Glance, Nova creates an image by collapsing the qemu image with
> its delta file and uploads the resulting larger file to Glance. For
> periodic backups of VMs, this is a very inefficient way to do backup.
> Having to manage two endpoints, one for Nova and one for Cinder, is also
> inefficient. These are the gaps I called out on the Raksha wiki page.
>
>>> Cinder has a "cinder backup" command that performs a volume backup to
>>> Swift, Ceph or TSM. The Ceph implementation also supports incremental
>>> backup (Ceph to Ceph).
>>> I envision that Cinder could be expanded to support incremental backup
>>> (for persistent storage) by adding drivers/plug-ins that leverage the
>>> incremental backup features of either the storage or the hypervisors.
>>> Independently, in Havana the ability to do consistent volume snapshots
>>> was added to GlusterFS.
>>> I assume that this consistency support could be generalized to support
>>> other volume drivers, and be utilized as part of backup code.
>
> I think we are talking about specific implementations here. Yes, I am aware
> of the Ceph blueprint to support incremental backup, but the Cinder backup
> APIs are volume specific. That means if a VM has multiple volumes mapped,
> as in case 2 I discussed, the tenant needs to call the backup API once per
> volume (three times in that case). Also, if you look at the Swift layout of
> Cinder backups, it is very difficult to tie the Swift objects back to a
> particular VM. Imagine a tenant restoring a VM and all its resources from a
> backup copy that was taken a week ago; the restore operation is not
> straightforward.
> It is my understanding that consistency should be maintained at the VM
> level, not at the individual volume level. It is very difficult to make
> assumptions about how the application data inside the VM is laid out.
>
>>> Looking at the key features in Raksha, it seems that the main features
>>> (2, 3, 4, 7) could be addressed by improving the current mechanisms in
>>> Nova and Cinder. I didn't include 1 as a feature as it is more a
>>> statement of intent (or a goal) than a feature.
>>> Features 5 (dedup) and 6 (scheduler) are indeed new in your proposal.
>
>>> Looking at the source de-duplication feature, and taking Swift as an
>>> example, it seems reasonable that if Swift implements de-duplication,
>>> then doing backup to Swift will give us de-duplication for free.
>>> In fact, it would make sense to do the de-duplication at the Swift level
>>> instead of just the backup layer, to gain more de-duplication
>>> opportunities.
>
> I agree; however, Swift is not the only object store that needs to support
> dedupe. Ceph is another popular object store, GlusterFS supports a Swift
> endpoint, and there are other commercially available object stores too. So
> your argument becomes very product specific. Moreover, source-level dedupe
> is different from dedupe at rest.
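The distinction being drawn here can be made concrete with a small sketch of source-level dedupe: the backup client hashes fixed-size chunks and uploads only chunks the store has not seen, so duplicate data never leaves the host. The chunk size and the in-memory store are toy assumptions for illustration, not Raksha's actual design:

```python
# Source-level dedupe sketch: content-addressed chunks, upload only new ones.
import hashlib

CHUNK = 4  # unrealistically small, just so repeated chunks show up


def dedup_upload(data, store):
    """Return (chunk-hash manifest, count of chunks actually uploaded)."""
    manifest, uploaded = [], 0
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:        # dedupe happens at the source:
            store[digest] = chunk      # duplicate chunks never hit the wire
            uploaded += 1
        manifest.append(digest)
    return manifest, uploaded


store = {}
m1, up1 = dedup_upload(b"aaaabbbbaaaacccc", store)   # "aaaa" repeats
m2, up2 = dedup_upload(b"aaaabbbbaaaacccc", store)   # second full backup
print(up1, up2)   # -> 3 0
```

The second "full" backup transfers nothing, which is exactly the backup-window and bandwidth saving being argued for; dedupe at rest inside the object store would still have to receive all sixteen bytes twice.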
> Source-level dedupe reduces the backup window and also reduces the amount
> of data that needs to be pumped to the backup endpoint, such as Swift.
>
>>> Following the above, and assuming it all comes true (at times I am known
>>> to be an optimist), we are left with backup job scheduling, and I wonder
>>> if that is enough for a new project.
>
> I hope I have convinced you that Raksha has more to offer than a simple
> cron job. Please take a look at the backup APIs, the database schema, and
> the use cases it addresses on its wiki page.
>
> Bottom line: irrespective of how OpenStack is deployed, here is what the
> Raksha workflow looks like:
> * Create-backupjob VM1, VM2
>   --> Returns backup job id, id1
> * Run-backupjob id1
>   --> Returns run id rid1
> * Run-backupjob id1
>   --> Returns run id rid2
> * Restore rid1
>   --> Restores a PiT copy of VM1 and VM2 and their associated volumes
>
>>> My question is, would it make sense to add to the current mechanisms in
>>> Nova and Cinder rather than add the complexity of a new project?
>
> I think the answer is yes :)
>
> Regards,
> Murali Balcha
>
>>> __________________________________________
>>> Ronen I. Kat
>>> Storage Research
> IBM Research - Haifa
> Phone: +972.3.7689493
> Email: ronen...@il.ibm.com
>
> From: Murali Balcha <murali.bal...@triliodata.com>
> To: "openstack-dev@lists.openstack.org" <openstack-dev@lists.openstack.org>,
> "openst...@list.openstack.org" <openst...@list.openstack.org>
> Date: 29/08/2013 01:18 AM
> Subject: [openstack-dev] Proposal for Raksha, a Data Protection As a
> Service project
>
> Hello Stackers,
> We would like to introduce a new project, Raksha, a Data Protection as a
> Service (DPaaS) for the OpenStack cloud.
> Raksha's primary goal is to provide comprehensive data protection for
> OpenStack by leveraging Nova, Swift, Glance and Cinder. Raksha has the
> following key features:
> 1. Enterprise-grade data protection for OpenStack-based clouds
> 2. Tenant-administered backups and restores
> 3.
Application-consistent backups
> 4. Point-in-time (PiT) full and incremental backups and restores
> 5. Dedupe at source for efficient backups
> 6. A job scheduler for periodic backups
> 7. A noninvasive backup solution that does not require service
>    interruption during the backup window
>
> You will find the rationale behind the need for Raksha in OpenStack on its
> wiki. The wiki also has the preliminary design and the API description.
> Some of the Raksha functionality may overlap with the Nova and Cinder
> projects; as a community, let's work together to coordinate the features
> among these projects. We would like to seek out early feedback so we can
> address as many issues as we can in the first code drop. We are hoping to
> enlist the OpenStack community's help in making Raksha a part of OpenStack.
> Raksha's project resources:
> Wiki: https://wiki.openstack.org/wiki/Raksha
> Launchpad: https://launchpad.net/raksha
> Github: https://github.com/DPaaS-Raksha/Raksha (we will upload prototype
> code in a few days)
> If you want to talk to us, send an email to
> openstack-...@lists.launchpad.net with "[raksha]" in the subject or use the
> #openstack-raksha IRC channel.
>
> Best Regards,
> Murali Balcha

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev