> -----Original Message-----
> From: Wido den Hollander [mailto:w...@widodh.nl]
> Sent: Friday, January 18, 2013 12:51 AM
> To: cloudstack-dev@incubator.apache.org
> Subject: Re: new storage framework update
>
> Hi,
>
> On 01/16/2013 02:35 AM, Edison Su wrote:
> > After a lengthy discussion (more than two hours) with John on Skype, I think we figured out the difference between us. The API proposed by John is more at the execution level, which is where the input/output streams come from; it assumes that both the source and destination object will be operated on at the same place (either inside the ssvm, or on the hypervisor host). The API I proposed is more about how to hook a vendor's own storage into cloudstack's mgt server, and thus can replace the process of how and where to operate on the storage.
> >
> > Let's talk about the execution model first, which has a huge impact on the design. The execution model is about where to execute operations issued by the mgt server. Currently, there is no universal execution model; it's quite different for each hypervisor.
> >
> > E.g. for KVM, the mgt server sends commands to the KVM host; there is a java agent running on the kvm host which can execute commands sent by the mgt server.
> >
> > For xenserver, most commands are executed on the mgt server, which calls xapi, which then talks to the xenserver host. But we do put some python code on the xenserver host for operations not supported by xapi.
> >
> > For vmware, most commands are executed on the mgt server, which talks to the vcenter API, while some of them are executed inside the SSVM.
> >
> > Due to the different execution models, we run into a problem about how and where to access a storage device. For example, there is a storage box which has its own management API. Now I want to create a volume on the storage box: where should I call the storage box's create volume api? If we follow the above execution models, we need to call the api in different places and, even worse, write the API call in different languages. For kvm, you may need to write java code in the kvm agent; for xenserver, you may need to write a xapi python plugin; for vmware, you may need to put the java code inside the ssvm, etc.
> >
> > But if the storage box already has a management api, why not just call it inside the cloudstack mgt server, so the device vendor only writes java code once, for all the different hypervisors? If we don't enforce the execution model, then the storage framework should have a hook in the management server, and the device vendor can decide where to execute commands sent by the mgt server.
>
> With this you are assuming that the management server always has access to the API of the storage box?
>
> What if the management server is in network X (say Amsterdam) and I have a zone in London where my storage box X is on a private network?
>
> The only one that can access the API then is the hypervisor, so the calls have to go through there.
>
> I don't want to encourage people to write "stupid code" where they assume that the management server is this thing which is tied into every network.
I think we will change the current mgt server deployment model to a cluster of mgt servers per zone, instead of one cluster of mgt servers managing all zones: https://cwiki.apache.org/confluence/display/CLOUDSTACK/AWS-Style+Regions
If the above works, then the mgt server can assume it can access the storage box's API. BTW, the mgt server already needs to access some private mgt APIs, such as F5/netscaler etc.

>
> Wido
>
> > That's what my datastoredriver layer is used for. Take the take-snapshot sequence diagram as an example: https://cwiki.apache.org/confluence/download/attachments/30741569/take+snapshot+sequence.png?version=1&modificationDate=1358189965000
> > The datastoredriver runs inside the mgt server, while the datastoredriver itself can decide where to execute the "takeSnapshot" API: the driver can send a command to the hypervisor host, directly call the storage box's API, directly call the hypervisor's own API, or call another service running outside of the cloudstack mgt server. It's all up to the implementation of the driver.
> > Does it make sense? If so, the device driver should not take an input/output stream as a parameter, as that enforces the execution model, which I don't think is necessary.
> > BTW, John and I will discuss the matter tomorrow on Skype; if you want to join, please let me know.
> >
> >> -----Original Message-----
> >> From: Edison Su [mailto:edison...@citrix.com]
> >> Sent: Monday, January 14, 2013 3:19 PM
> >> To: cloudstack-dev@incubator.apache.org
> >> Subject: RE: new storage framework update
> >>
> >>> -----Original Message-----
> >>> From: John Burwell [mailto:jburw...@basho.com]
> >>> Sent: Friday, January 11, 2013 12:30 PM
> >>> To: cloudstack-dev@incubator.apache.org
> >>> Subject: Re: new storage framework update
> >>>
> >>> Edison,
> >>>
> >>> I think we are speaking past each other a bit. My intention is to separate logical and physical storage operations in order to simplify the implementation of new storage providers. Also, in order to support the widest range of storage mechanisms, I want to eliminate all interface assumptions (implied and explicit) that a storage device supports a file system.
> >>
> >> I think if the nfs secondary storage is optional, then all the inefficiency related to object storage will go away?
> >>
> >>> These two issues make the implementation of efficient storage drivers extremely difficult. For example, for object stores, we have to create polling synchronization threads that add complexity, overhead, and latency to the system. If we could connect the OutputStream of a source (such as an HTTP upload) to the InputStream of the object store, transfer operations would be far simpler and more efficient. The conflation of logical and physical operations also makes it difficult to reliably and maintainably implement cross-cutting storage features such as at-rest encryption. In my opinion, the current design in Javelin makes progress on the first point, but does not address the second point. Therefore, I propose that we refine the design to explicitly separate logical and physical operations and utilize the higher-level I/O abstractions provided by the JDK to remove any interface requirements for file-based operations.
> >>>
> >>> Based on these goals, I propose keeping the logical Image, ImageMotion, Volume, Template, and Snapshot services.
> >>> These services would be responsible for logical storage operations (e.g. createVolumeFromTemplate, downloadTemplate, createSnapshot, deleteSnapshot, etc). To perform physical operations, the StorageDevice concept would be added with the following operations:
> >>>
> >>> * void read(URI aURI, OutputStream anOutputStream) throws IOException
> >>> * void write(URI aURI, InputStream anInputStream) throws IOException
> >>> * Set<URI> list(URI aURI) throws IOException
> >>> * boolean delete(URI aURI) throws IOException
> >>> * StorageDeviceType getType()
> >>
> >> I agree with your simplified interface, but I am still cautious that a simple URI may not be enough. For example, at the driver level, what if the driver developer wants to know extra information about the object being operated on? I ended up with new APIs like: https://cwiki.apache.org/confluence/download/attachments/30741569/provider.jpg?version=1&modificationDate=1358168083079
> >> At the driver level, it works on two interfaces:
> >> DataObject, which is the interface of volume/snapshot/template.
> >> DataStore, which is the interface of all the primary storages or image storages.
> >> The API looks pretty much like what you proposed:
> >> grantAccess(DataObject, EndPoint ep): makes the object accessible to an endpoint, and returns a URI representing the object. This is used when moving the object between different storages. For example, in the create-volume-from-template sequence diagram, https://cwiki.apache.org/confluence/download/attachments/30741569/createvolumeFromtemplate.png?version=1&modificationDate=1358172931767, the datamotionstrategy calls grantAccess on both the source and destination datastore to get two URIs representing the source and destination objects, then sends the URIs to an endpoint (it can be the agent running inside the ssvm, or a hypervisor host) to conduct the actual copy operation.
> >> revokeAccess: the opposite of the above API.
> >> listObjects(DataStore): list the objects on a datastore.
> >> createAsync(DataObject): create an object on a datastore. The driver shouldn't care what kind of object it is; it should only care about the size of the object and the data store of the object, all of which can be directly inferred from the DataObject. If the driver needs more information about the object, the driver developer can get the id of the object, query the database, and find out more. This interface makes no assumption about the underlying storage; it can be primary storage, s3/swift, an ftp server, or whatever writable storage.
> >> deleteAsync(DataObject): delete an object on a datastore, the opposite of createAsync.
> >> copyAsync(DataObject, DataObject): copy the src object to the dest object. It's for storage migration. Some storage vendors or hypervisors have their own efficient way to migrate storage from one place to another. Most of the time, migration across different vendors or different storage types (primary <=> image storage) needs to go through the datamotionservice, which will be covered later.
> >> canCopy(DataObject, DataObject): helps the datamotionservice make the decision on storage migration.
> >>
> >> For the primary storage driver, there are two extra APIs:
> >> takeSnapshot(SnapshotInfo snapshot): take a snapshot.
> >> revertSnapshot(SnapshotInfo snapshot): revert a snapshot.
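Pulling the driver-level proposal above together, here is a minimal Java sketch of what those interfaces could look like. The Future-based async return type, the exact getter set on DataObject/DataStore, and the package-free layout are assumptions made for illustration; they are not the actual javelin-branch code.

    // Illustrative sketch only -- method names come from the thread, signatures are assumed.
    import java.net.URI;
    import java.util.List;
    import java.util.concurrent.Future;

    interface DataObject {
        long getId();              // id to look up extra details in the database if needed
        long getSize();            // the only intrinsic property the driver must care about
        DataStore getDataStore();
        URI getURI();              // translate the object into a URI for the execution layer
    }

    interface DataStore {
        long getId();
    }

    interface EndPoint { }         // an agent inside the ssvm, or a hypervisor host

    interface SnapshotInfo extends DataObject { }

    /** Driver-level API: simple, and neutral to the kind of object being operated on. */
    interface DataStoreDriver {
        URI grantAccess(DataObject obj, EndPoint ep);    // expose obj to ep, return a URI for it
        void revokeAccess(DataObject obj, EndPoint ep);
        List<DataObject> listObjects(DataStore store);
        Future<Void> createAsync(DataObject obj);
        Future<Void> deleteAsync(DataObject obj);
        Future<Void> copyAsync(DataObject src, DataObject dest);  // vendor-optimized migration
        boolean canCopy(DataObject src, DataObject dest);         // hint for the data motion service
    }

    /** Extra operations only a primary-storage driver needs. */
    interface PrimaryDataStoreDriver extends DataStoreDriver {
        Future<Void> takeSnapshot(SnapshotInfo snapshot);
        Future<Void> revertSnapshot(SnapshotInfo snapshot);
    }

Because the driver only sees DataObject and DataStore, the same implementation can serve a storage box used as both primary and image storage, which matches the "write the vendor code once" goal stated earlier in the thread.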
> >> > >> > >>> > >>> This interface does not mirror any that I am aware of the current JDK. > >>> Instead, it leverages the facilities it provides to abstract I/O > >>> operations between different types of devices (e.g. reading data > >>> from a socket and writing to a file or reading data from a socket > >>> and writing it to > >> another socket). > >>> Specifying the input or output stream allows the URI to remain > >>> logical and device agnostic because the device is being a physical > >>> stream from which to read or write with it. Therefore, specifying a > >>> logical URI without the associated stream would require implicit > >>> assumptions to be made by the StorageDevice and clients regarding > >>> data acquisition. To perform physical operations, one or more > >>> instances of StorageDevice would be passed into to the logical > >>> service methods to compose into a set of physical operations to > >>> perform logical operation (e.g. copying a template from secondary > storage to a volume). > >> > >> > >> I think our difference is only about the parameter of the API is an > >> URI or an Object. > >> Using an Object instead of a plain URI, using an object maybe more > >> flexible, and the DataObject itself has an API called: getURI, which > >> can translate the Object into an URI. See the interface of DataObject: > >> > https://cwiki.apache.org/confluence/download/attachments/30741569/dat > >> a > >> +model.jpg?version=1&modificationDate=1358171015660 > >> > >> > >>> > >>> StorageDevices are not intended to be content aware. They simply > >>> map logical URIs to the physical context they represent (a path on a > >>> filesystem, a bucket and key in an object store, a range of blocks > >>> in a block store, etc) and perform the requested operation on the > >>> physical context (i.e. read a byte stream from the physical location > >>> representing "/template/2/200", delete data represented by > >>> "/snapshot/3/300", list the contents of the physical location > >>> represented by "/volume/4/400", etc). In my opinion, it would be a > >>> misuse of a URI to infer an operation from their content. Instead, > >>> the VolumeService would expose a method such as the following to > >> perform the creation of a volume from a template: > >>> > >>> createVolumeFromTemplate(Template aTemplate, StorageDevice > >>> aTemplateDevice, Volume aVolume, StorageDevice aVolumeDevice, > >>> Hypervisor aHypervisor) > >>> > >>> The VolumeService would coordinate the creation of the volume with > >>> the passed hypervisor and, using the InputStream and OutputStreams > >>> provided by the devices, coordinate the transfer of data between the > >>> template storage device and the volume storage device. Ideally, the > >>> Template and Volume classes would encapsulate the rules for logical > >>> URI creation in a method. Similarly, the SnapshotService would > >>> expose the a method such as the following to take a snapshot of a > volume: > >>> > >>> createSnapshot(Volume aVolume, StorageDevice aSnapshotDevice) > >>> > >>> The SnapshotService would request the creation of a snapshot for the > >>> volume and then request a write of the snapshot data to the > >>> StorageDevice through the write method. > >> > >> I agree, the service has rich apis, while at the driver level, the > >> api should be as simple and neutral to the object operated on. 
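To make the Object-versus-URI point concrete, the following sketch shows roughly how a data motion strategy could turn the rich, object-level service call into plain URIs handed to an endpoint, as in the create-volume-from-template sequence diagram linked above. The local stand-in interfaces and the sendCopyCommand method are assumptions added so the sketch is self-contained; they are not the actual Javelin classes.

    import java.net.URI;

    // Minimal stand-ins corresponding to the thread's DataObject, DataStoreDriver and EndPoint;
    // the exact signatures here are assumptions.
    interface DataObject { }
    interface EndPoint { void sendCopyCommand(URI src, URI dest); }   // hypothetical command dispatch
    interface DataStoreDriver {
        URI grantAccess(DataObject obj, EndPoint ep);
        void revokeAccess(DataObject obj, EndPoint ep);
    }

    class DefaultDataMotionStrategy {
        /** "Create volume from template": grant access on both sides, hand the URIs to an
         *  endpoint (ssvm agent or hypervisor host), which performs the actual copy. */
        void copy(DataObject src, DataStoreDriver srcDriver,
                  DataObject dest, DataStoreDriver destDriver, EndPoint ep) {
            URI srcUri  = srcDriver.grantAccess(src, ep);
            URI destUri = destDriver.grantAccess(dest, ep);
            try {
                ep.sendCopyCommand(srcUri, destUri);
            } finally {
                srcDriver.revokeAccess(src, ep);
                destDriver.revokeAccess(dest, ep);
            }
        }
    }

The service layer stays object-aware, while the endpoint only ever sees URIs, which is the split being argued for above.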
> >> I updated the sequence diagrams:
> >> create volume from template: https://cwiki.apache.org/confluence/download/attachments/30741569/createvolumeFromtemplate.png?version=1&modificationDate=1358172931767
> >> add template into image storage: https://cwiki.apache.org/confluence/download/attachments/30741569/register+template+on+image+store.png?version=1&modificationDate=1358189565551
> >> take snapshot: https://cwiki.apache.org/confluence/download/attachments/30741569/take+snapshot+sequence.png?version=1&modificationDate=1358189965438
> >> backup snapshot into image storage: https://cwiki.apache.org/confluence/download/attachments/30741569/backup+snapshot+sequence.png?version=1&modificationDate=1358192407152
> >>
> >> Could you help to review them?
> >>
> >>> I hope these explanations clarify both the design and motivation of my proposal. I believe it is critical for the project's future development that the storage layer operate efficiently with storage devices that do not support traditional filesystems (e.g. object stores, raw block devices, etc). There are a fair number of these types of devices which CloudStack will likely need to support in the future. I believe that CloudStack will be well positioned to maintainably and efficiently support them if it carefully separates logical and physical storage operations.
> >>
> >> Thanks for your feedback. I rewrote the API last weekend based on your suggestions and updated the wiki: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+subsystem+2.0
> >> The code is in progress, but not checked into the javelin branch yet.
> >>
> >>> Thanks,
> >>> -John
> >>>
> >>> On Jan 9, 2013, at 8:10 PM, Edison Su <edison...@citrix.com> wrote:
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: John Burwell [mailto:jburw...@basho.com]
> >>>>> Sent: Tuesday, January 08, 2013 8:51 PM
> >>>>> To: cloudstack-dev@incubator.apache.org
> >>>>> Subject: Re: new storage framework update
> >>>>>
> >>>>> Edison,
> >>>>>
> >>>>> Please see my thoughts in-line below. I apologize in advance for the S3-centric nature of my examples -- it happens to be top of mind for obvious reasons ...
> >>>>>
> >>>>> Thanks,
> >>>>> -John
> >>>>>
> >>>>> On Jan 8, 2013, at 5:59 PM, Edison Su <edison...@citrix.com> wrote:
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: John Burwell [mailto:jburw...@basho.com]
> >>>>>>> Sent: Tuesday, January 08, 2013 10:59 AM
> >>>>>>> To: cloudstack-dev@incubator.apache.org
> >>>>>>> Subject: Re: new storage framework update
> >>>>>>>
> >>>>>>> Edison,
> >>>>>>>
> >>>>>>> In reviewing the javelin, I feel that there is a missing abstraction.
> >>>>>>> At the lowest level, storage operations are the storage, retrieval, deletion, and listing of byte arrays stored at a particular URI.
> >>>>>>> In order to implement this concept in the current Javelin > >>>>>>> branch, > >>>>>>> 3-5 strategy classes must implemented to perform the following > >>>>>>> low-level > >>>>> operations: > >>>>>>> > >>>>>>> * open(URI aDestinationURI): OutputStream throws IOException > >>>>>>> * write(URI aDestinationURI, OutputStream anOutputStream) > >> throws > >>>>>>> IOException > >>>>>>> * list(URI aDestinationURI) : Set<URI> throws IOException > >>>>>>> * delete(URI aDestinationURI) : boolean throws IOException > >>>>>>> > >>>>>>> The logic for each of these strategies will be identical which > >>>>>>> will lead to to the creation of a support class + glue code (i.e. > >>>>>>> either individual adapter classes > >>>>> > >>>>> I realize that I omitted a couple of definitions in my original > >>>>> email. First, the StorageDevice most likely would be implemented > >>>>> on a domain object that also contained configuration information > >>>>> for a resource. For example, the S3Impl class would also > >>>>> implement StorageDevice. On reflection (and a little pseudo > >>>>> coding), I would also like to refine my original proposed StorageDevice > interface: > >>>>> > >>>>> * void read(URI aURI, OutputStream anOutputStream) throws > >>> IOException > >>>>> * void write(URI aURI, InputStream anInputStream) throws > >> IOException > >>>>> * Set<URI> list(URI aURI) throws IOException > >>>>> * boolean delete(URI aURI) throws IOException > >>>>> * StorageDeviceType getType() > >>>>> > >>>>>> > >>>>>> If the lowest api is too opaque, like one URI as parameter, I am > >>>>>> wondering > >>>>> it may make the implementation more complicated than it sounds. > >>>>>> For example, there are at least 3 APIs for primary storage driver: > >>>>> createVolumeFromTemplate, createDataDisk, deleteVolume, and > two > >>>>> snapshot related APIs: createSnapshot, deleteSnapshot. > >>>>>> How to encode above operations into simple write/delete APIs? If > >>>>>> one URI > >>>>> contains too much information, then at the end of day, the > >>>>> receiver side(the code in hypervisor resource), who is responsible > >>>>> to decode the URI, is becoming complicated. That's the main > >>>>> reason, I decide to use more specific APIs instead of one opaque URI. > >>>>>> That's true, if the API is too specific, people needs to > >>>>>> implement ton of > >>>>> APIs(mainly imagedatastoredirver, primarydatastoredriver, > >>>>> backupdatastoredriver), and all over the place. > >>>>>> Which one is better? People can jump into discuss. > >>>>>> > >>>>> > >>>>> The URI scheme should be a logical, unique, and reversal values > >>>>> associated with the type of resource being stored. For example, > >>>>> the general form of template URIs would > >>>>> "/template/<account_id>/<template_id>/template.properties" and > >>>>> "/template/<account_id>/<template_id>/<uuid>.vhd" . Therefore, > >>>>> for account id 2, template id 200, the template.properties > >>>>> resource would be assigned a URI of > "/template/2/200/template.properties. > >>>>> The StorageDevice implementation translates the logical URI to a > >>>>> physical representation. Using > >>>>> S3 as an example, the StorageDevice is configured to use bucket > >>>>> jsb- cloudstack at endpoint s3.amazonaws.com. The S3 storage > >>>>> device would translate the URI to s3://jsb- > >>>>> cloudstack/templates/2/200/template.properties. 
> >>>>> For an NFS storage device mounted on nfs://localhost/cloudstack, the StorageDevice would translate the logical URI to nfs://localhost/cloudstack/template/<account_id>/<template_id>/template.properties. In short, I believe that we can devise a simple scheme that allows the StorageDevice to treat the URI path as relative to its root.
> >>>>>
> >>>>> To my mind, createVolumeFromTemplate is decomposable into a series of StorageDevice#read and StorageDevice#write operations which would be issued by the VolumeManager service, such as the following:
> >>>>>
> >>>>> public void createVolumeFromTemplate(Template aTemplate, StorageDevice aTemplateDevice, Volume aVolume, StorageDevice aVolumeDevice) {
> >>>>>
> >>>>>   try {
> >>>>>
> >>>>>     if (aVolumeDevice.getType() != StorageDeviceType.BLOCK && aVolumeDevice.getType() != StorageDeviceType.FILE_SYSTEM) {
> >>>>>       throw new UnsupportedStorageDeviceException(...);
> >>>>>     }
> >>>>>
> >>>>>     // Pull the template from the template device into a temporary directory
> >>>>>     final File aTemplateDirectory = new File(<template temp path>);
> >>>>>
> >>>>>     // Non-DRY -- likely a candidate for a TemplateService#downloadTemplate method
> >>>>>     aTemplateDevice.read(new URI("/templates/<account_id>/<template_id>/template.properties"), new FileOutputStream(aTemplateDirectory.createFile("template.properties")));
> >>>>>     aTemplateDevice.read(new URI("/templates/<account_id>/<template_id>/<template_uuid>.vhd"), new FileOutputStream(aTemplateDirectory.createFile("<template_uuid>.vhd")));
> >>>>>
> >>>>>     // Perform operations with the hypervisor as necessary to register the storage, which yields
> >>>>>     // anInputStream (possibly a List<InputStream>)
> >>>>>
> >>>>>     aVolumeDevice.write(new URI("/volume/<account_id>/<volume_id>"), anInputStream);
> >>>>
> >>>> Not sure we really need the API to look like java IO, but I can see the value of using a URI to encode objects (volume/snapshot/template etc): the driver-layer API will be very simple, and can be shared by multiple components (volume/image services etc).
> >>>> Currently, there is one datastore object for each storage. The datastore object is mainly used by the cloudstack mgt server to read/write the database and to maintain the state of each object (volume/snapshot/template) in the datastore. The datastore object also provides an interface for lifecycle management, and a transformer (which can transform a db object into a *TO, or a URI). The purpose of the datastore object is to offload a lot of logic from the volume/template manager into each object, as the manager is a singleton, which is not easy to extend.
> >>>> The relationships between these classes are:
> >>>> For the volume service: Volumeserviceimpl -> primarydatastore -> primarydatastoredriver
> >>>> For the image service: imageServiceImpl -> imagedataStore -> imagedataStoredriver
> >>>> For the snapshot service: snapshotServiceImpl -> {primarydataStore/imagedataStore} -> {primarydatastoredriver/imagedatastoredriver}; the snapshot can be on both the primarydatastore and the imagedatastore.
> >>>>
> >>>> The current driver API is not good enough; it's too specific to each object. For example, there will be an API called createSnapshot in the primarydatastoredriver, and an API called moveSnapshot in the imagedataStoredriver (in order to implement moving a snapshot from primary storage to the image store); there may also be an API called createVolume in the primarydatastoredriver, and an API called moveVolume in the imagedatastoredriver (in order to implement moving a volume from primary to the image store). The more objects we add, the more bloated the driver API will become.
> >>>>
> >>>> If the driver API uses the model you suggested, simple read/write/delete with a URI, for example:
> >>>> void create(URI uri) throws IOException
> >>>> void copy(URI destUri, URI srcUri) throws IOException
> >>>> boolean delete(URI uri) throws IOException
> >>>> Set<URI> list(URI uri) throws IOException
> >>>>
> >>>> then the create API has multiple meanings under different contexts: if the URI has "*/volume/*", it means creating a volume; if the URI has "*/template", it means creating a template; and so on.
> >>>> The same goes for the copy API:
> >>>> If both destUri and srcUri are volumes, it can have different meanings: if both volumes are in the same storage, it means creating a volume from a base volume; if they are in different storages, it means volume migration.
> >>>> If destUri is a volume while srcUri is a template, it means creating a volume from a template.
> >>>> If destUri is a volume and srcUri is a snapshot on the same storage, it means reverting a snapshot.
> >>>> If destUri is a volume and srcUri is a snapshot on a different storage, it means creating a volume from a snapshot.
> >>>> If destUri is a snapshot and srcUri is a volume, it means creating a snapshot from a volume.
> >>>> If destUri is a snapshot and srcUri is a snapshot in a different place, it means snapshot backup.
> >>>> If destUri is a template and srcUri is a snapshot, it means creating a template from a snapshot.
> >>>> As you can see, the API is too opaque and needs complicated logic to encode and decode the URIs.
> >>>> Are you OK with the above API?
> >>>>
> >>>>>   } catch (IOException e) {
> >>>>>
> >>>>>     // Log and handle the error ...
> >>>>>
> >>>>>   } finally {
> >>>>>
> >>>>>     // Close resources ...
> >>>>>
> >>>>>   }
> >>>>> }
> >>>>>
> >>>>> Dependent on the capabilities of the hypervisor's Java API, the temporary files may not be required, and an OutputStream could be copied directly to an InputStream.
> >>>>>
> >>>>>>> or a class that implements a ton of interfaces). In addition to this added complexity, this segmented approach prevents the implementation of common, logical storage features such as ACL enforcement and asset encryption.
> >>>>>>
> >>>>>> This is a good question: how to share code across multiple components. For example, one storage can be used as both primary storage and backup storage. In the current code, the developer needs to implement both the primarydataStoredriver and the backupdatastoredriver; to share code between these two drivers if needed, I think the developer can write one driver which implements both interfaces.
> >>>>>
> >>>>> In my opinion, having storage drivers classify their usage limits functionality and composability. Hence, my thought is that the StorageDevice should describe its capabilities -- allowing the various services (e.g. Image, Template, Volume, etc.) to determine whether or not the passed storage devices can support the requested operation.
> >>>>>>> With a common representation of a StorageDevice that operates on the standard Java I/O model, we can layer in cross-cutting storage operations in a consistent manner.
> >>>>>>
> >>>>>> I agree that it would be nice to have a standard device model, like the POSIX file system API in the Unix world. But I haven't figured out how to generalize all the operations on the storage, as I mentioned above.
> >>>>>> I can see that createvolumefromtemplate can be generalized as a link api, but how about taking a snapshot? And who will handle the difference between deleting a volume and deleting a snapshot, if they are using the same delete API?
> >>>>>
> >>>>> The following is a snippet that would be part of the SnapshotService to take a snapshot:
> >>>>>
> >>>>> // Ask the hypervisor to take a snapshot, which yields anInputStream (e.g. FileInputStream)
> >>>>>
> >>>>> aSnapshotDevice.write(new URI("/snapshots/<account_id>/<snapshot_id>"), anInputStream)
> >>>>>
> >>>>> Ultimately, a snapshot can be exported to a single file or OutputStream which can be written back out to a StorageDevice. For deleting a snapshot, the following snippet would perform the deletion in the SnapshotService:
> >>>>>
> >>>>> // Ask the hypervisor to delete the snapshot ...
> >>>>>
> >>>>> aSnapshotDevice.delete(new URI("/snapshots/<account_id>/<snapshot_id>"))
> >>>>>
> >>>>> Finally, the following snippet would delete a volume from the VolumeService:
> >>>>>
> >>>>> // Ask the hypervisor to delete the volume
> >>>>>
> >>>>> aVolumeDevice.delete(new URI("/volumes/<account_id>/<volume_id>"))
> >>>>>
> >>>>> In summary, I believe that the opaque operations specified in the StorageDevice interface can accomplish these goals if the following approaches are employed:
> >>>>>
> >>>>> * Logical, reversible URIs are constructed by the storage services. These URIs are translated by the StorageDevice implementation to the semantics of the underlying device.
> >>>>> * The storage service methods break their logic down into a series of operations against one or more StorageDevices. These operations should conform to common Java idioms because StorageDevice is built on the standard Java I/O model (i.e. InputStream, OutputStream, URI).
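A compact sketch of the StorageDevice idea summarized above may help. The interface methods are the ones listed in this thread; the NfsStorageDevice class, its constructor, and the resolve helper are assumptions added for illustration rather than existing CloudStack code.

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.URI;
    import java.util.Set;

    enum StorageDeviceType { BLOCK, FILE_SYSTEM, OBJECT }

    interface StorageDevice {
        void read(URI aURI, OutputStream anOutputStream) throws IOException;
        void write(URI aURI, InputStream anInputStream) throws IOException;
        Set<URI> list(URI aURI) throws IOException;
        boolean delete(URI aURI) throws IOException;
        StorageDeviceType getType();
    }

    /** Hypothetical device that maps a logical URI such as
     *  /template/2/200/template.properties onto a path under an NFS mount point. */
    abstract class NfsStorageDevice implements StorageDevice {
        private final File mountPoint;   // e.g. /mnt/cloudstack for nfs://localhost/cloudstack

        NfsStorageDevice(File mountPoint) { this.mountPoint = mountPoint; }

        /** The device treats the logical URI path as relative to its root. */
        protected File resolve(URI logicalUri) {
            return new File(mountPoint, logicalUri.getPath());
        }

        @Override
        public void read(URI aURI, OutputStream out) throws IOException {
            try (InputStream in = new FileInputStream(resolve(aURI))) {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);   // only bytes flow across the interface
                }
            }
        }

        @Override
        public StorageDeviceType getType() { return StorageDeviceType.FILE_SYSTEM; }
    }

An object-store implementation would do the same translation to a bucket and key (e.g. bucket jsb-cloudstack, key templates/2/200/template.properties) and stream the bytes through its client library, which is exactly the point of keeping the URI logical.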
> >>>>>
> >>>>> Thanks,
> >>>>> -John
> >>>>>
> >>>>>>> Based on this line of thought, I propose the addition of the following notions to the storage framework:
> >>>>>>>
> >>>>>>> * StorageType (Enumeration)
> >>>>>>>   * BLOCK (raw block devices such as iSCSI, NBD, etc)
> >>>>>>>   * FILE_SYSTEM (devices addressable through the filesystem such as local disks, NFS, etc)
> >>>>>>>   * OBJECT (object stores such as S3 and Swift)
> >>>>>>> * StorageDevice (interface)
> >>>>>>>   * open(URI aDestinationURI): OutputStream throws IOException
> >>>>>>>   * write(URI aDestinationURI, OutputStream anOutputStream) throws IOException
> >>>>>>>   * list(URI aDestinationURI) : Set<URI> throws IOException
> >>>>>>>   * delete(URI aDestinationURI) : boolean throws IOException
> >>>>>>>   * getType() : StorageType
> >>>>>>> * UnsupportedStorageDevice (unchecked exception): thrown when an unsuitable device type is provided to a storage service.
> >>>>>>>
> >>>>>>> All operations on the higher-level storage services (e.g. ImageService) would accept a StorageDevice parameter on their operations. Using the type property, services can determine whether or not the passed device is suitable (e.g. guarding against the use of an object store such as S3 as a VM disk) -- throwing an UnsupportedStorageDevice exception when a device is unsuitable for the requested operation. The services would then perform all storage operations through the passed StorageDevice.
> >>>>>>>
> >>>>>>> One potential gap is security. I do not know whether or not authorization decisions are assumed to occur up the stack from the storage engine or as part of it.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> -John
> >>>>>>>
> >>>>>>> P.S. I apologize for taking so long to push my feedback. I am just getting back on station from the birth of our second child.
> >>>>>>
> >>>>>> Congratulations! Thanks for your great feedback.
> >>>>>>
> >>>>>>> On Dec 28, 2012, at 8:09 PM, Edison Su <edison...@citrix.com> wrote:
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Marcus Sorensen [mailto:shadow...@gmail.com]
> >>>>>>>>> Sent: Friday, December 28, 2012 2:56 PM
> >>>>>>>>> To: cloudstack-dev@incubator.apache.org
> >>>>>>>>> Subject: Re: new storage framework update
> >>>>>>>>>
> >>>>>>>>> Thanks. I'm trying to picture how this will change the existing code.
> >>>>>>>>> I think it is something I will need a real example to understand.
> >>>>>>>>> Currently we pass a
> >>>>>>>>
> >>>>>>>> Yah, the example code is in these files:
> >>>>>>>> XenNfsConfigurator
> >>>>>>>> DefaultPrimaryDataStoreDriverImpl
> >>>>>>>> DefaultPrimaryDatastoreProviderImpl
> >>>>>>>> VolumeServiceImpl
> >>>>>>>> DefaultPrimaryDataStore
> >>>>>>>> XenServerStorageResource
> >>>>>>>>
> >>>>>>>> You can start from the volumeServiceTest -> createVolumeFromTemplate test case.
> >>>>>>>>
> >>>>>>>>> storageFilerTO and/or volumeTO from the server to the agent, and the agent
> >>>>>>>>
> >>>>>>>> The model is not changed; what changed are the commands sent to the resource. Right now, each storage protocol can send its own commands to the resource.
> >>>>>>>> All the storage-related commands are put under the org.apache.cloudstack.storage.command package.
Take > >>>>>>> CopyTemplateToPrimaryStorageCmd as an example, > >>>>>>>> It has a field called ImageOnPrimayDataStoreTO, which contains > >>>>>>>> a > >>>>>>> PrimaryDataStoreTO. PrimaryDataStoreTO contains the basic > >>>>>>> information about a primary storage. If needs to send extra > >>>>>>> information to resource, one can subclass PrimaryDataStoreTO, e.g. > >>>>>>> NfsPrimaryDataStoreTO, which contains nfs server ip, and nfs path. > >>>>>>> In this way, one can write a CLVMPrimaryDataStoreTO, which > >>>>>>> contains clvm's > >>>>> own special information if > >>>>>>> needed. Different protocol uses different TO can simply the code, > >> and > >>>>>>> easier to add new storage. > >>>>>>>> > >>>>>>>>> does all of the work. Do we still need things like > >>>>>>>>> LibvirtStorageAdaptor to do the work on the agent side of > >>>>>>>>> actually managing the volumes/pools and implementing them, > >>>>>>>>> connecting > >>>>> them > >>>>>>> to > >>>>>>>>> vms? So in implementing new storage we will need to write both > >>>>>>>>> a configurator and potentially a storage adaptor? > >>>>>>>> > >>>>>>>> Yes, that's minimal requirements. > >>>>>>>> > >>>>>>>>> On Dec 27, 2012 6:41 PM, "Edison Su" <edison...@citrix.com> > >> wrote: > >>>>>>>>> > >>>>>>>>>> Hi All, > >>>>>>>>>> Before heading into holiday, I'd like to update the > >>>>>>>>>> current status of the new storage framework since last collab12. > >>>>>>>>>> 1. Class diagram of primary storage is evolved: > >>>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>>> > >>> > >> > https://cwiki.apache.org/confluence/download/attachments/30741569/sto > >>>>>>>>> r > >>>>>>>>> age.jpg?version=1&modificationDate=1356640617613 > >>>>>>>>>> Highlight the current design: > >>>>>>>>>> a. One storage provider can cover multiple storage > >>>>>>>>>> protocols for multiple hypervisors. The default storage > >>>>>>>>>> provider can almost cover all the current primary storage > >>>>>>>>>> protocols. In most of cases, you don't need to write a new > >>>>>>>>>> storage provider, what you need to do is to write a new > >>>>>>>>>> storage > >> configurator. > >>>>>>>>>> Write a new storage provider needs to write a lot of code, > >>>>>>>>>> which we should avoid it as much as > >>>>>>>>> possible. > >>>>>>>>>> b. A new type hierarchy, primaryDataStoreConfigurator, > >>>>>>>>>> is > >>> added. > >>>>>>>>>> The configurator is a factory for primaryDataStore, which > >>>>>>>>>> assemble StorageProtocolTransformer, > >> PrimaryDataStoreLifeCycle > >>>>>>>>>> and PrimaryDataStoreDriver for PrimaryDataStore object, > based > >>>>>>>>>> on the hypervisor type and the storage protocol. For > >>>>>>>>>> example, for nfs primary storage on xenserver, there is a > >>>>>>>>>> class called XenNfsConfigurator, which put > >>>>>>>>>> DefaultXenPrimaryDataStoreLifeCycle, > >>>>>>>>>> NfsProtocolTransformer and > DefaultPrimaryDataStoreDriverImpl > >>>>>>>>>> into DefaultPrimaryDataStore. One provider can only have one > >>>>>>>>>> configurator for a pair of hypervisor type and storage protocol. > >>>>>>>>>> For example, if you want to add a new nfs protocol > >>>>>>>>>> configurator for xenserver hypervisor, you need to write a > >>>>>>>>>> new > >> storage provider. > >>>>>>>>>> c. A new interface, StorageProtocolTransformer, is added. > >>>>>>>>>> The main purpose of this interface is to handle the > >>>>>>>>>> difference between different storage protocols. 
> >>>>>>>>>> It has four methods:
> >>>>>>>>>>    getInputParamNames: returns a list of the parameter names for a particular protocol. E.g. the NFS protocol has ["server", "path"], ISCSI has ["iqn", "lun"], etc. The UI shouldn't hardcode these parameters any more.
> >>>>>>>>>>    normalizeUserInput: given user input from the UI/API, validates the input, breaks it apart, and stores the pieces into the database.
> >>>>>>>>>>    getDataStoreTO/getVolumeTO: each protocol can have its own volumeTO and primaryStorageTO. A TO is the object that will be passed down to the resource; if your storage has extra information you want to pass to the resource, these two methods are the place you can override.
> >>>>>>>>>> d. All the time-consuming API calls related to storage are async.
> >>>>>>>>>>
> >>>>>>>>>> 2. Minimal functionalities are implemented:
> >>>>>>>>>>    a. Can register an http template, without the SSVM
> >>>>>>>>>>    b. Can register an NFS primary storage for xenserver
> >>>>>>>>>>    c. Can download a template into primary storage directly
> >>>>>>>>>>    d. Can create a volume from a template
> >>>>>>>>>>
> >>>>>>>>>> 3. All about test:
> >>>>>>>>>>    a. The TestNG test framework is used, as it can provide parameters for each test case. For integration tests, we need to know the ip address of the hypervisor host, the host uuid (if it's xenserver), the primary storage url, the template url, etc. These configurations are better parameterized, so for each test run we don't need to modify the test case itself; instead, we provide a test configuration file for each test run. The TestNG framework already has this functionality, I just reuse it.
> >>>>>>>>>>    b. Every piece of code can be unit tested, which means:
> >>>>>>>>>>       b.1 the xcp plugin can be unit tested. I wrote a small piece of python code, called mockxcpplugin.py, which can directly call the xcp plugin.
> >>>>>>>>>>       b.2 the direct agent hypervisor resource can be tested. I wrote a mock agent manager, which can load and initialize a hypervisor resource, and can also send commands to the resource.
> >>>>>>>>>>       b.3 a storage integration test maven project is created, which can test the whole storage subsystem, such as creating a volume from a template, which includes both the image and volume components.
> >>>>>>>>>> A new section, called "how to test", has been added to https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+subsystem+2.0, please check it out.
> >>>>>>>>>>
> >>>>>>>>>> The code is on the javelin branch; the maven projects whose names start with cloud-engine-storage-* are the code related to the storage subsystem. Most of the primary storage code is in the cloud-engine-storage-volume project.
> >>>>>>>>>> Any feedback/comment is appreciated.
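For readers who skimmed the quoted announcement, here is a minimal sketch of the StorageProtocolTransformer idea described above. Only the interface name, the four method names, and the NfsProtocolTransformer name come from the thread; the signatures and the use of a Map in place of the real TO classes are assumptions for illustration.

    import java.net.URI;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    /** Sketch of the per-protocol transformer; signatures are assumed, not the javelin code. */
    interface StorageProtocolTransformer {
        List<String> getInputParamNames();                  // what the UI should prompt for
        Map<String, String> normalizeUserInput(String url); // validate the input and break it apart
        Map<String, String> getDataStoreTO(long storeId);   // protocol-specific info for the resource
        Map<String, String> getVolumeTO(long volumeId);
    }

    class NfsProtocolTransformer implements StorageProtocolTransformer {
        public List<String> getInputParamNames() {
            return Arrays.asList("server", "path");         // NFS needs a server and an export path
        }
        public Map<String, String> normalizeUserInput(String url) {
            URI uri = URI.create(url);                      // e.g. nfs://10.0.0.5/export/primary
            Map<String, String> params = new HashMap<>();
            params.put("server", uri.getHost());
            params.put("path", uri.getPath());
            return params;                                  // the caller persists these to the database
        }
        public Map<String, String> getDataStoreTO(long storeId) {
            return new HashMap<>();                         // would carry the nfs server ip and path
        }
        public Map<String, String> getVolumeTO(long volumeId) {
            return new HashMap<>();
        }
    }

A configurator such as XenNfsConfigurator would then pair a transformer like this with a lifecycle and a driver when it assembles a DefaultPrimaryDataStore for the (xenserver, nfs) combination, as the quoted announcement describes.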