[RFC]Storage refactor

Edison Su Wed, 08 Aug 2012 15:24:49 -0700

Hi all,
        It seems a lot of people are interested in how to add new storage support 
into CloudStack. But frankly speaking, the current code is not well suited for 
this kind of task, in terms of maintainability and flexibility.  
        Now it's time to improve or rewrite (if necessary) the existing storage 
code. Here are the goals I want to achieve in the next few months:  
                1. Easy to add new primary storage with the help of storage 
plugin framework.   
                2. Separate backup and snapshot. Today, a snapshot in CloudStack 
is more like a backup; we need real snapshot functionality, e.g. VM-based/block-based 
snapshots, reverting a VM from a snapshot, etc.   
                3. Easy to add backup storage, e.g. S3/swift etc.   
                4. Configurability. Should be easy to add configuration for 
each storage: such as, the scope of the storage(zone-wide, cluster, or group of 
clusters), what hypervisors it can support, what's the storage allocation 
policy etc.  
        
        These are the main goals, but the list is not exhaustive. If you have other 
ideas/complaints/requirements about this topic, please speak up :)


        The following is the plan to actually implement the above goals (in pre-alpha 
stage though):  
     1. the diagram:  
                              |----->| storage service   |
        storage orchestration |----->| backup service    | ----> storage plugins
                              |----->| image service     |
                              |----->| snapshot service  | ----> snapshot plugins

     2. the functionality for each component:  
                a. storage orchestration: It provides the storage interface for 
other components in CloudStack, and coordinates between the 
storage/snapshot/backup/image services.  
 
                b. storage service or primary storage service: Its main purpose 
is to create VM volumes on storage. It will provide the following 
functionalities:  
                        ba1. storage life cycle management: how to add a storage 
to or delete it from CS, how to put a storage into or take it out of maintenance 
mode, how to disable/enable a storage pool  
                        ba2. volume life cycle management: how to create/delete 
volumes on a storage, how to cleanup volumes, how to migrate volumes  
                        ba3. the statistics of storage: the usage/IOPS etc.  
                        ba4. download template/iso into storage 
 
        c. snapshot service: provides an interface for snapshot-related operations: 
 
                        ca1. snapshot lifecycle management: 
create/delete/revert snapshot  
                        ca2. snapshot policy lifecycle management: 
create/delete/execute snapshot policy  
 
        d. backup service: provides an interface to back up snapshots/volumes to 
backup storage.   
             Backup storage here means zone-wide storage, like the NFS 
secondary storage we have today. The functionalities:  
                 da1. create/delete backup from a snapshot or volume  
                 da2. can have different backup strategies, automate? daily? 
etc.  
 
        e. image service: provides template/iso import/export, and copies 
template/iso/volumes between zones: 
                ea1. life cycle of template/iso management 
                ea2. share template/iso/volume between zones 
 
        All of the above services may need help from storage plugins, as the storage 
plugin actually works on the underlying storage itself. 
       
       The storage plugin will provide the following methods:  
           1. how to add/delete storage, how to enable/disable maintenance mode 
etc.  
           2. how to create/delete/migrate a volume on this storage. For storage 
migration, the plugin needs to tell whether it supports a vMotion-like capability 
or not; if not, the storage service will delegate the storage migration operation 
to the backup service (as we do today, using secondary storage as a temporary place)  
           3. the statistics  
           4. how to copy volumes between different storages, e.g. when 
downloading template from secondary storage to primary storage 
           5. the configurability of this storage:  
                      The scope of this storage, it can be zone-wide, or group 
of clusters(for object storages like ceph/sheepdog/gluster etc), or only for 
one cluster(NFS/VMFS/ISCSI etc)  
                      The hypervisors it can support  
                      what's the role? primary storage or secondary/backup 
storage? 
                      what's the protocol it supports(NFS/ISCSI/LVM/CIFS etc) 
                      Local or shared storage  
                      only for data storage? EBS like storage                   
  
                      The specific parameters that need to be configured for this 
storage type, e.g. ceph needs authentication information, or tuning parameters for 
VMFS/NFS. Every time this storage type is added, users need to input these 
configuration parameters from the UI/API.  
                      threshold, different storage may have different 
used/total percentage ratio  
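
To make item 2 above a bit more concrete, here is a minimal sketch of how the storage service could fall back to the backup service when a plugin does not report a vMotion-like capability. The interfaces are not designed yet, so every name here (StoragePluginSketch, BackupServiceSketch, StorageServiceSketch, and the Recording* stand-ins) is hypothetical and only illustrates the delegation idea:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; the real interfaces are not worked out yet.
interface StoragePluginSketch {
    // True if the storage can move a volume by itself (vMotion-like).
    boolean supportsLiveMigration();

    void migrateVolume(String volumeId, String targetPoolId);
}

interface BackupServiceSketch {
    // Moves the volume by staging it on backup (secondary) storage.
    void migrateViaBackupStorage(String volumeId, String targetPoolId);
}

final class StorageServiceSketch {
    private final StoragePluginSketch plugin;
    private final BackupServiceSketch backupService;

    StorageServiceSketch(StoragePluginSketch plugin, BackupServiceSketch backupService) {
        this.plugin = plugin;
        this.backupService = backupService;
    }

    // Returns which path handled the migration, for illustration only.
    String migrateVolume(String volumeId, String targetPoolId) {
        if (plugin.supportsLiveMigration()) {
            plugin.migrateVolume(volumeId, targetPoolId);
            return "plugin";
        }
        backupService.migrateViaBackupStorage(volumeId, targetPoolId);
        return "backup-service";
    }
}

// Trivial in-memory stand-ins so the sketch can be exercised.
final class RecordingPluginSketch implements StoragePluginSketch {
    final List<String> calls = new ArrayList<>();
    private final boolean liveMigration;

    RecordingPluginSketch(boolean liveMigration) {
        this.liveMigration = liveMigration;
    }

    @Override
    public boolean supportsLiveMigration() {
        return liveMigration;
    }

    @Override
    public void migrateVolume(String volumeId, String targetPoolId) {
        calls.add("migrate " + volumeId + " -> " + targetPoolId);
    }
}

final class RecordingBackupServiceSketch implements BackupServiceSketch {
    final List<String> calls = new ArrayList<>();

    @Override
    public void migrateViaBackupStorage(String volumeId, String targetPoolId) {
        calls.add("stage " + volumeId + " -> " + targetPoolId);
    }
}
```

With a plugin constructed as new RecordingPluginSketch(false), migrateVolume goes through the backup service; with true, the plugin handles it directly. Whether the capability ends up as a boolean method or as part of the plugin's metadata is an open design choice.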
                     
          The snapshot plugin has the following functionalities:  
           1. interface to create/delete/revert snapshot  
           2. backup snapshot to other storages (the backup service will use this 
method) 
           3. configurability:  
               full/delta snapshot. If it's a delta snapshot, we need to know 
how/when to coalesce snapshots  
               supports backup snapshot to other storages? 
 
        Let's take a look at what a storage plugin will look like. For example, 
Vendor A has storage that can be used as primary storage, supports snapshots, 
and only works on VMware, per cluster. Vendor A's storage plugin will 
implement both the storage plugin interface and the snapshot plugin interface. It 
has the following metadata in the plugin: 
       storage plugin vendor signature: vendor A 
       protocol: ISCSI LUN per VM 
       scope: per cluster 
       supported hypervisor: vmware 
       role: primary storage 
       support backup: no 
       support snapshot: yes 
       support data disk only: no 
       configuration names: server ip, username, password 
      After this plugin is registered with CloudStack, when a user adds primary 
storage for a VMware cluster, the UI will show the list of storage plugins 
available for VMware clusters. If the user clicks vendor A's plugin, the 
configuration names of this plugin will be shown in the UI, so the user can type 
in the server ip/username/password information, then add the primary storage. 
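
As a rough illustration of the registration and lookup described above, the metadata could be held in a small value class and filtered by hypervisor and role when the UI asks for the plugins to show. This is only a sketch under my own assumptions: PluginMetadataSketch, PluginRegistrySketch, the two enums, and the method primaryPluginsFor are all hypothetical names, not existing CloudStack classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical types; they only mirror the metadata fields listed in this mail.
enum StorageRoleSketch { PRIMARY, BACKUP }

enum StorageScopeSketch { ZONE, CLUSTER_GROUP, CLUSTER }

final class PluginMetadataSketch {
    final String vendor;
    final String protocol;
    final StorageScopeSketch scope;
    final String hypervisor;
    final StorageRoleSketch role;
    final boolean supportsBackup;
    final boolean supportsSnapshot;
    final boolean dataDiskOnly;
    final List<String> configurationNames;

    PluginMetadataSketch(String vendor, String protocol, StorageScopeSketch scope,
                         String hypervisor, StorageRoleSketch role,
                         boolean supportsBackup, boolean supportsSnapshot,
                         boolean dataDiskOnly, List<String> configurationNames) {
        this.vendor = vendor;
        this.protocol = protocol;
        this.scope = scope;
        this.hypervisor = hypervisor;
        this.role = role;
        this.supportsBackup = supportsBackup;
        this.supportsSnapshot = supportsSnapshot;
        this.dataDiskOnly = dataDiskOnly;
        this.configurationNames = configurationNames;
    }

    // The vendor A example from this mail, expressed as metadata.
    static PluginMetadataSketch vendorA() {
        return new PluginMetadataSketch("vendor A", "ISCSI LUN per VM",
                StorageScopeSketch.CLUSTER, "vmware", StorageRoleSketch.PRIMARY,
                false, true, false,
                Arrays.asList("server ip", "username", "password"));
    }
}

final class PluginRegistrySketch {
    private final List<PluginMetadataSketch> plugins = new ArrayList<>();

    void register(PluginMetadataSketch metadata) {
        plugins.add(metadata);
    }

    // Plugins the UI would list when adding primary storage for a hypervisor.
    List<PluginMetadataSketch> primaryPluginsFor(String hypervisor) {
        List<PluginMetadataSketch> matches = new ArrayList<>();
        for (PluginMetadataSketch p : plugins) {
            if (p.role == StorageRoleSketch.PRIMARY
                    && p.hypervisor.equalsIgnoreCase(hypervisor)) {
                matches.add(p);
            }
        }
        return matches;
    }
}
```

The UI would then render one input field per entry in configurationNames of the selected plugin. A real plugin would probably support more than one hypervisor, so the single hypervisor field is a simplification.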
      Once that is done, CloudStack can create VMs on this storage. E.g. during 
VM creation, CloudStack will call storage orchestration, which calls the storage 
service, which in turn calls this storage plugin to create a volume. In this 
plugin, it might connect to the storage, create a LUN, then send a command to the 
VMware hypervisor resource to create a datastore for this LUN, then create a 
volume on it and return to the storage service. It's just an example; different 
storage plugins may use different ways to create a volume. 
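
The call chain in this example can be sketched roughly as follows. Again, this is not a proposed interface: OrchestrationSketch, VolumeServiceSketch, VolumePluginSketch and VendorAPluginSketch are hypothetical names, and the plugin only records the steps it would take instead of talking to real storage or a hypervisor:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plugin-side interface for volume creation.
interface VolumePluginSketch {
    String createVolume(String vmName, long sizeGb);
}

// Vendor A's plugin from the example; it records steps instead of doing real work.
final class VendorAPluginSketch implements VolumePluginSketch {
    final List<String> steps = new ArrayList<>();

    @Override
    public String createVolume(String vmName, long sizeGb) {
        steps.add("connect to storage");
        steps.add("create LUN (" + sizeGb + " GB)");
        steps.add("ask vmware resource to create datastore on LUN");
        steps.add("create volume for " + vmName);
        return vmName + "-ROOT";
    }
}

// Storage service: picks the plugin and delegates volume creation to it.
final class VolumeServiceSketch {
    private final VolumePluginSketch plugin;

    VolumeServiceSketch(VolumePluginSketch plugin) {
        this.plugin = plugin;
    }

    String createVolume(String vmName, long sizeGb) {
        return plugin.createVolume(vmName, sizeGb);
    }
}

// Storage orchestration: the entry point other CloudStack components would call.
final class OrchestrationSketch {
    private final VolumeServiceSketch storageService;

    OrchestrationSketch(VolumeServiceSketch storageService) {
        this.storageService = storageService;
    }

    String allocateRootVolume(String vmName, long sizeGb) {
        return storageService.createVolume(vmName, sizeGb);
    }
}
```

Another vendor's plugin could implement the same createVolume method with completely different steps (e.g. just creating a file on NFS), and the orchestration and storage service layers would not need to change.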
 
        Does it make sense? Any comments? I haven't worked out the interfaces for 
each component yet, but I'd like to send the plan out first and gather as much 
input as possible from you guys.  Thanks!
