On 08/09/2012 12:24 AM, Edison Su wrote:
Hi all,
It seems a lot of people are interested in how to add new storage support
to CloudStack. But frankly speaking, the current code is not well suited to
this kind of task, in terms of maintainability and flexibility.
Now it's time to improve or rewrite (if necessary) the existing storage
code. Here are the goals I want to achieve in the next few months:
1. Easy to add new primary storage with the help of storage
plugin framework.
2. Separate backup and snapshot. Today, a snapshot in CloudStack
is more like a backup; we need real snapshot functionality, e.g. VM-based/
block-based snapshots, reverting a VM from a snapshot, etc.
3. Easy to add backup storage, e.g. S3/swift etc.
4. Configurability. It should be easy to add configuration for
each storage, such as the scope of the storage (zone-wide, cluster, or group
of clusters), which hypervisors it can support, what the storage allocation
policy is, etc.
The above are the main goals, but the list is not limited to them. If you
have other ideas/complaints/requirements about this topic, please speak up :)
Seems like a great idea! One thing we should keep in mind is 'layering'
or 'golden images'.
RBD and QCOW2 (and more?) support this.
You have a master/parent image from which you create a child. When the
instance first boots it only reads from the master volume, but the
writes are actually sent to a different image.
In QCOW2 this is done by using a base image and in RBD this is done with
a parent image.
This allows you to roll out instances from a template in an instant! No
more waiting for qemu-img convert to finish, just have your disk
available instantly.
I'm pretty sure this is not limited to RBD and QCOW2, so it would be
wise to have this in.
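
To make the layering idea concrete, here is a toy sketch of copy-on-write
reads and writes. The `Image` class, the `clone` helper and the
block-dictionary model are purely illustrative assumptions; this is not how
QCOW2 or RBD are implemented, only the read-fall-through behaviour described
above.

```python
from typing import Dict, Optional

BLOCK_SIZE = 4  # tiny block size, chosen only to keep the example readable


class Image:
    """Toy block image. Hypothetical model, not a real QCOW2/RBD API."""

    def __init__(self, parent: Optional["Image"] = None) -> None:
        self.parent = parent                # golden image, or None for a base image
        self.blocks: Dict[int, bytes] = {}  # only blocks written to *this* layer

    def write(self, index: int, data: bytes) -> None:
        # Writes always land in this layer; the parent is never modified.
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\0")[:BLOCK_SIZE]

    def read(self, index: int) -> bytes:
        # Serve from this layer if the block was written here,
        # otherwise fall through to the parent chain.
        if index in self.blocks:
            return self.blocks[index]
        if self.parent is not None:
            return self.parent.read(index)
        return b"\0" * BLOCK_SIZE           # unallocated blocks read as zeros


def clone(template: Image) -> Image:
    """Create a child instantly: no data is copied, only a reference is kept."""
    return Image(parent=template)


template = Image()
template.write(0, b"boot")

vm_disk = clone(template)      # instant, regardless of template size
vm_disk.write(1, b"data")      # goes to the child layer only
```

Creating the child costs the same no matter how large the template is, which
is why deploying from a template can be near-instant on storage that supports
this.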
The following is the plan to actually implement the above goals (in
pre-alpha stage though):
1. the diagram:
                        |----->| storage service  |
storage orchestration   |----->| backup service   | ----> storage plugins
                        |----->| image service    |
                        |----->| snapshot service | ----> snapshot plugins
2. the functionality for each component:
a. storage orchestration: It provides the storage interface for
other components in CloudStack, and coordinates between the
storage/snapshot/backup/image services.
b. storage service or primary storage service: Its main purpose
is to create VM volumes on storage. It will provide the following
functionality:
ba1. storage life cycle management: how to add/delete a
storage in CloudStack, how to put a storage into maintenance mode or cancel
it, how to disable/enable a storage pool
ba2. volume life cycle management: how to create/delete
volumes on a storage, how to clean up volumes, how to migrate volumes
ba3. the statistics of storage: the usage/IOPS etc.
ba4. download template/iso into storage
c. snapshot service: provides an interface to snapshot-related operations:
ca1. snapshot lifecycle management:
create/delete/revert snapshot
ca2. snapshot policy lifecycle management:
create/delete/execute snapshot policy
d. backup service: provides an interface to back up snapshots/volumes to
backup storage.
Backup storage here means zone-wide storage, like the NFS
secondary storage we have today. The functionality:
da1. create/delete backup from a snapshot or volume
da2. can have different backup strategies: automated? daily?
etc
e. image service: provides template/iso import/export, and copies
template/iso/volumes between zones:
ea1. life cycle of template/iso management
ea2. share template/iso/volume between zones
Storing ISOs and templates is just object storage. Stuff like S3, Swift
and RADOS (Ceph) is very suitable for these tasks.
I think that the current limitation to NFS should be taken care of,
since you don't always want NFS, for various reasons (that's for another
topic).
Both of the above services may need help from storage plugins, as the
storage plugin actually works on the underlying storage itself.
The storage plugin will provide the following methods:
1. how to add/delete storage, how to enable/disable maintenance
mode etc.
2. how to create/delete/migrate volumes on this storage. For storage
migration, the plugin needs to say whether or not it supports a vmotion-like
capability; if not, the storage service will delegate the storage migration
operation to the backup service (as we do today, using secondary storage as
a temporary place)
3. the statistics
4. how to copy volumes between different storages, e.g. when
downloading template from secondary storage to primary storage
5. the configurability of this storage:
The scope of this storage: it can be zone-wide, or a group
of clusters (for object storages like ceph/sheepdog/gluster etc), or only for
one cluster (NFS/VMFS/ISCSI etc)
The hypervisors it can support
what's the role? primary storage or secondary/backup
storage?
what's the protocol it supports (NFS/ISCSI/LVM/CIFS etc)
Local or shared storage
only for data storage? (EBS-like storage)
The specific parameters that need to be configured for this storage
type, e.g. ceph needs authentication information, or tuning parameters for
VMFS/NFS. Every time this storage type is added, users need to input these
configuration parameters from the UI/API.
threshold: different storages may have a different
used/total percentage ratio
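
The list above could be sketched roughly as the following interface. This is
only an illustration under my own assumptions: every class, method and field
name here (`StorageCapabilities`, `StoragePlugin`, `migrate`,
`InMemoryPlugin`, the 0.85 threshold) is hypothetical, and nothing in it is
existing CloudStack code. It covers items 1-3 and 5, plus the migration
fallback from item 2.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StorageCapabilities:
    """What a plugin declares about itself (item 5). All names are hypothetical."""
    scope: str                        # "zone", "cluster-group" or "cluster"
    hypervisors: List[str]
    role: str                         # "primary" or "backup"
    protocol: str                     # e.g. "NFS", "ISCSI"
    shared: bool = True               # False for local storage
    data_disk_only: bool = False
    live_migration: bool = False      # the "vmotion-like" capability from item 2
    config_names: List[str] = field(default_factory=list)
    capacity_threshold: float = 0.85  # used/total ratio; arbitrary example value


class StoragePlugin(ABC):
    """Items 1-3 as abstract methods; a vendor would subclass this."""
    capabilities: StorageCapabilities

    @abstractmethod
    def add_storage(self, config: Dict[str, str]) -> None: ...

    @abstractmethod
    def create_volume(self, name: str, size_gb: int) -> str: ...

    @abstractmethod
    def migrate_volume(self, name: str, target: "StoragePlugin") -> str: ...

    @abstractmethod
    def stats(self) -> Dict[str, float]: ...


MigrationFallback = Callable[[str, StoragePlugin, StoragePlugin], str]


def migrate(name: str, source: StoragePlugin, target: StoragePlugin,
            via_backup: MigrationFallback) -> str:
    """Storage-service side of item 2: use the plugin if it can migrate
    directly, otherwise delegate to the backup service."""
    if source.capabilities.live_migration:
        return source.migrate_volume(name, target)
    return via_backup(name, source, target)


class InMemoryPlugin(StoragePlugin):
    """Trivial stand-in, only here to exercise the sketch."""

    def __init__(self, capabilities: StorageCapabilities) -> None:
        self.capabilities = capabilities
        self.volumes: Dict[str, int] = {}

    def add_storage(self, config: Dict[str, str]) -> None:
        # Reject the storage if a declared configuration name was not supplied.
        missing = [n for n in self.capabilities.config_names if n not in config]
        if missing:
            raise ValueError(f"missing configuration: {missing}")

    def create_volume(self, name: str, size_gb: int) -> str:
        self.volumes[name] = size_gb
        return name

    def migrate_volume(self, name: str, target: "StoragePlugin") -> str:
        target.create_volume(name, self.volumes.pop(name))
        return "direct"

    def stats(self) -> Dict[str, float]:
        return {"used_gb": float(sum(self.volumes.values()))}
```

Keeping the capability flags in plain data, separate from the methods, would
let the storage service make decisions such as the migration fallback without
calling into vendor code.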
The snapshot plugin provides the following functionality:
1. interface to create/delete/revert snapshot
2. backup snapshot to other storages (the backup service will use this
method)
3. configurability:
full/delta snapshot. If it's a delta snapshot, need to know
how/when to coalesce snapshots
supports backup snapshot to other storages?
Let's take a look at what a storage plugin will look like. For example,
Vendor A has storage that can be used as primary storage, supports snapshots,
and only works on vmware, per cluster. Vendor A's storage plugin will
implement both the storage plugin interface and the snapshot plugin
interface. It has the following metadata in the plugin:
storage plugin vendor signature: vendor A
protocol: ISCSI LUN per VM
scope: per cluster
supported hypervisor: vmware
role: primary storage
support backup: no
support snapshot: yes
support data disk only: no
configuration names: server ip, username, password
After this plugin is registered in CloudStack, when a user adds primary
storage for a vmware cluster, the UI will show the list of storage plugins
available for vmware clusters. If the user clicks vendor A's plugin, the
configuration names of this plugin will be shown in the UI, so the user can
type in the server ip/username/password information and then add the primary
storage.
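
As a rough illustration of that flow, the metadata for vendor A and the
filtering the UI would need could look like the sketch below. The
`PluginMetadata` type, the `REGISTRY` list and both helper functions are
hypothetical names introduced only for this example; the field values are
taken from the metadata listed above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PluginMetadata:
    """Hypothetical container for the metadata a plugin registers."""
    vendor: str
    protocol: str
    scope: str
    hypervisors: Tuple[str, ...]
    role: str
    supports_backup: bool
    supports_snapshot: bool
    data_disk_only: bool
    config_names: Tuple[str, ...]


VENDOR_A = PluginMetadata(
    vendor="vendor A",
    protocol="ISCSI LUN per VM",
    scope="per cluster",
    hypervisors=("vmware",),
    role="primary storage",
    supports_backup=False,
    supports_snapshot=True,
    data_disk_only=False,
    config_names=("server ip", "username", "password"),
)

# Stand-in for wherever registered plugins would actually be kept.
REGISTRY: List[PluginMetadata] = [VENDOR_A]


def plugins_for(hypervisor: str, role: str) -> List[PluginMetadata]:
    """Plugins the UI would list for a given hypervisor type and storage role."""
    return [p for p in REGISTRY if hypervisor in p.hypervisors and p.role == role]


def missing_fields(plugin: PluginMetadata, user_input: Dict[str, str]) -> List[str]:
    """Configuration names the user has not filled in yet."""
    return [name for name in plugin.config_names if not user_input.get(name)]
```

For example, `plugins_for("vmware", "primary storage")` returns vendor A's
entry, while the same call for another hypervisor returns an empty list, and
`missing_fields` tells the UI which inputs are still required before the
storage can be added.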
Once that's done, CloudStack can create VMs on this storage. E.g. during
VM creation, CloudStack will call storage orchestration, which calls the
storage service, which calls this storage plugin to create a volume. In this
plugin, it might connect to the storage, create a LUN, then send a command to
the vmware hypervisor resource to create a datastore for this LUN, then
create a volume on it and return to the storage service. It's just an
example; different storage plugins may use different ways to create a volume.
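
The call chain in that example can be sketched as below. Everything here is
a made-up stand-in: the three classes, their method names and the volume
naming scheme are assumptions for illustration, and the plugin only records
the steps it would take instead of talking to real storage or a hypervisor.

```python
from typing import List


class VendorAPlugin:
    """Hypothetical plugin; each step is only logged, not performed."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    def create_volume(self, vm_name: str) -> str:
        self.log.append("plugin: connect to storage")
        self.log.append("plugin: create LUN")
        self.log.append("plugin: ask vmware resource to create datastore on LUN")
        self.log.append("plugin: create volume on datastore")
        return f"{vm_name}-ROOT"  # made-up naming scheme


class StorageService:
    """Picks the plugin and forwards the request to it."""

    def __init__(self, plugin: VendorAPlugin, log: List[str]) -> None:
        self.plugin = plugin
        self.log = log

    def create_volume(self, vm_name: str) -> str:
        self.log.append("storage service: create volume")
        return self.plugin.create_volume(vm_name)


class StorageOrchestration:
    """Entry point used by the rest of CloudStack in this sketch."""

    def __init__(self, service: StorageService, log: List[str]) -> None:
        self.service = service
        self.log = log

    def create_vm_volume(self, vm_name: str) -> str:
        self.log.append("orchestration: create volume for VM")
        return self.service.create_volume(vm_name)


log: List[str] = []
orchestration = StorageOrchestration(StorageService(VendorAPlugin(log), log), log)
volume = orchestration.create_vm_volume("vm-1")
```

Only the plugin knows about LUNs and datastores; the two layers above it see
nothing but "create a volume", which is what would let a different vendor
swap in a different sequence of steps.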
Does it make sense? Any comments? I haven't worked out the interfaces for
each component yet, but I'd like to send the plan out first and gather as
much input as possible from you guys. Thanks!
We should however be able to have more configurable parameters for a
storage pool. NFS users might want to use different mount options like
UDP or TCP, rsize and wsize, etc.
For RBD there are tons of settings as well, so it would be useful to be
able to configure these.
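
For the NFS case, a minimal sketch of what per-pool parameters could turn
into is shown below. The function name and the small whitelist are my own
assumptions; `proto`, `rsize` and `wsize` are standard NFS mount options, but
how CloudStack would store or pass them is not decided here.

```python
from typing import Dict, Union

# Deliberately small, illustrative whitelist; a real one would be longer.
ALLOWED_NFS_OPTIONS = {"proto", "rsize", "wsize"}


def nfs_mount_options(params: Dict[str, Union[str, int]]) -> str:
    """Turn per-pool parameters into a string usable with `mount -o`."""
    unknown = set(params) - ALLOWED_NFS_OPTIONS
    if unknown:
        raise ValueError(f"unsupported NFS options: {sorted(unknown)}")
    if "proto" in params and params["proto"] not in ("tcp", "udp"):
        raise ValueError("proto must be 'tcp' or 'udp'")
    # Sort the keys so the same parameters always give the same string.
    return ",".join(f"{key}={params[key]}" for key in sorted(params))


options = nfs_mount_options({"proto": "tcp", "rsize": 32768, "wsize": 32768})
```

The resulting string could be passed to `mount -o` on the hypervisor host. An
RBD plugin would need its own, different set of allowed keys.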
Wido