Edison,

Please see my thoughts in-line below. I apologize in advance for the S3-centric nature of my examples -- S3 happens to be top of mind for obvious reasons ...
Thanks,
-John

On Jan 8, 2013, at 5:59 PM, Edison Su <edison...@citrix.com> wrote:

>
>
>> -----Original Message-----
>> From: John Burwell [mailto:jburw...@basho.com]
>> Sent: Tuesday, January 08, 2013 10:59 AM
>> To: cloudstack-dev@incubator.apache.org
>> Subject: Re: new storage framework update
>>
>> Edison,
>>
>> In reviewing the javelin branch, I feel that there is a missing abstraction. At the
>> lowest level, storage operations are the storage, retrieval, deletion, and
>> listing of byte arrays stored at a particular URI. In order to implement this
>> concept in the current Javelin branch, 3-5 strategy classes must be implemented
>> to perform the following low-level operations:
>>
>> * open(URI aDestinationURI): OutputStream throws IOException
>> * write(URI aDestinationURI, OutputStream anOutputStream) throws IOException
>> * list(URI aDestinationURI) : Set<URI> throws IOException
>> * delete(URI aDestinationURI) : boolean throws IOException
>>
>> The logic for each of these strategies will be identical, which will lead to the
>> creation of a support class + glue code (i.e. either individual adapter classes

I realize that I omitted a couple of definitions in my original email. First, the StorageDevice would most likely be implemented on a domain object that also contains configuration information for a resource. For example, the S3Impl class would also implement StorageDevice. On reflection (and a little pseudo coding), I would also like to refine my originally proposed StorageDevice interface:

* void read(URI aURI, OutputStream anOutputStream) throws IOException
* void write(URI aURI, InputStream anInputStream) throws IOException
* Set<URI> list(URI aURI) throws IOException
* boolean delete(URI aURI) throws IOException
* StorageDeviceType getType()

> If the lowest api is too opaque, like one URI as parameter, I am wondering if
> it may make the implementation more complicated than it sounds.
> For example, there are at least 3 APIs for the primary storage driver:
> createVolumeFromTemplate, createDataDisk, deleteVolume, and two snapshot
> related APIs: createSnapshot, deleteSnapshot.
> How to encode the above operations into simple write/delete APIs? If one URI
> contains too much information, then at the end of the day, the receiver side (the
> code in the hypervisor resource), which is responsible for decoding the URI,
> becomes complicated. That's the main reason I decided to use more specific
> APIs instead of one opaque URI.
> That's true: if the API is too specific, people need to implement a ton of
> APIs (mainly imagedatastoredriver, primarydatastoredriver,
> backupdatastoredriver), all over the place.
> Which one is better? People can jump in to discuss.
>

The URI scheme should be logical, unique, and reversible, associated with the type of resource being stored. For example, the general form of template URIs would be "/template/<account_id>/<template_id>/template.properties" and "/template/<account_id>/<template_id>/<uuid>.vhd". Therefore, for account id 2 and template id 200, the template.properties resource would be assigned a URI of "/template/2/200/template.properties". The StorageDevice implementation translates the logical URI to a physical representation. Using S3 as an example, the StorageDevice is configured to use bucket jsb-cloudstack at endpoint s3.amazonaws.com. The S3 storage device would translate the URI to s3://jsb-cloudstack/template/2/200/template.properties.
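To make the refined interface and the URI translation concrete, here is a rough sketch of what I have in mind. Everything below is illustrative only -- S3StorageDevice, toKey(), and the constructor arguments are hypothetical names, and I have stubbed out the actual S3 transfer calls rather than guess at SDK details:

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.Set;

enum StorageDeviceType { BLOCK, FILE_SYSTEM, OBJECT }

interface StorageDevice {
    void read(URI aURI, OutputStream anOutputStream) throws IOException;
    void write(URI aURI, InputStream anInputStream) throws IOException;
    Set<URI> list(URI aURI) throws IOException;
    boolean delete(URI aURI) throws IOException;
    StorageDeviceType getType();
}

// Hypothetical S3-backed device: the only S3-specific work is translating the
// logical URI into a key relative to the configured bucket.
class S3StorageDevice implements StorageDevice {

    private final String bucket;    // e.g. "jsb-cloudstack"
    private final String endpoint;  // e.g. "s3.amazonaws.com", used when building the S3 client

    S3StorageDevice(final String aBucket, final String anEndpoint) {
        bucket = aBucket;
        endpoint = anEndpoint;
    }

    // "/template/2/200/template.properties" -> "template/2/200/template.properties"
    private String toKey(final URI aURI) {
        final String path = aURI.getPath();
        return path.startsWith("/") ? path.substring(1) : path;
    }

    public void read(final URI aURI, final OutputStream anOutputStream) throws IOException {
        // A real implementation would GET s3://<bucket>/<toKey(aURI)> and copy the
        // object contents into anOutputStream ...
        throw new UnsupportedOperationException("transfer elided in this sketch");
    }

    public void write(final URI aURI, final InputStream anInputStream) throws IOException {
        // ... and PUT anInputStream to s3://<bucket>/<toKey(aURI)> here.
        throw new UnsupportedOperationException("transfer elided in this sketch");
    }

    public Set<URI> list(final URI aURI) throws IOException {
        // List keys under the prefix toKey(aURI) and map them back to logical URIs.
        throw new UnsupportedOperationException("listing elided in this sketch");
    }

    public boolean delete(final URI aURI) throws IOException {
        // Delete the object at toKey(aURI).
        throw new UnsupportedOperationException("delete elided in this sketch");
    }

    public StorageDeviceType getType() {
        return StorageDeviceType.OBJECT;
    }
}

The services only ever construct and pass logical URIs; everything after toKey() is the device's private business, which is what would let us swap S3, NFS, etc. underneath without touching the service code.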
For an NFS storage device mounted at nfs://localhost/cloudstack, the StorageDevice would translate the logical URI to nfs://localhost/cloudstack/template/<account_id>/<template_id>/template.properties. In short, I believe that we can devise a simple scheme that allows the StorageDevice to treat the URI path as relative to its root.

To my mind, createVolumeFromTemplate is decomposable into a series of StorageDevice#read and StorageDevice#write operations which would be issued by the VolumeManager service, such as the following:

public void createVolumeFromTemplate(Template aTemplate, StorageDevice aTemplateDevice,
        Volume aVolume, StorageDevice aVolumeDevice) {
    try {
        if (aVolumeDevice.getType() != StorageDeviceType.BLOCK
                && aVolumeDevice.getType() != StorageDeviceType.FILE_SYSTEM) {
            throw new UnsupportedStorageDeviceException(…);
        }

        // Pull the template from the template device into a temporary directory
        final File aTemplateDirectory = new File(<template temp path>);

        // Non-DRY -- likely a candidate for a TemplateService#downloadTemplate method (rough sketch below)
        aTemplateDevice.read(new URI("/templates/<account_id>/<template_id>/template.properties"),
                new FileOutputStream(new File(aTemplateDirectory, "template.properties")));
        aTemplateDevice.read(new URI("/templates/<account_id>/<template_id>/<template_uuid>.vhd"),
                new FileOutputStream(new File(aTemplateDirectory, "<template_uuid>.vhd")));

        // Perform operations with the hypervisor as necessary to register storage, which yields
        // anInputStream (possibly a List<InputStream>)
        aVolumeDevice.write(new URI("/volume/<account_id>/<volume_id>"), anInputStream);
    } catch (IOException e) {
        // Log and handle the error ...
    } finally {
        // Close resources ...
    }
}

Dependent on the capabilities of the hypervisor's Java API, the temporary files may not be required, and an OutputStream could be copied directly to an InputStream.

>
>> or a class that implements a ton of interfaces). In addition to this added
>> complexity, this segmented approach prevents the implementation of
>> common, logical storage features such as ACL enforcement and asset
>
> This is a good question: how to share code across multiple components.
> For example, one storage can be used as both primary storage and backup
> storage. In the current code, a developer needs to implement both
> primarydataStoredriver and backupdatastoredriver; in order to share code
> between these two drivers if needed, I think the developer can write one driver
> which implements both interfaces.

In my opinion, having storage drivers classify their usage limits functionality and composability. Hence, my thought is that the StorageDevice should describe its capabilities -- allowing the various services (e.g. Image, Template, Volume, etc.) to determine whether or not the passed storage device can support the requested operation.

>
>> encryption. With a common representation of a StorageDevice that operates
>> on the standard Java I/O model, we can layer in cross-cutting storage
>> operations in a consistent manner.
>
> I agree it would be nice to have a standard device model, like the POSIX file system
> API in the Unix world. But I haven't figured out how to generalize all the
> operations on the storage, as I mentioned above.
> I can see that createVolumeFromTemplate can be generalized as a link api,
> but how about taking a snapshot? And who will handle the difference
> between delete volume and delete snapshot, if they are using the same delete
> API?
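(An aside before I get to the snapshot question: below is a rough sketch of the hypothetical TemplateService#downloadTemplate helper flagged in the Non-DRY comment above. The method, its parameters, and the file layout are assumptions for illustration, not existing code, and it builds on the StorageDevice sketch earlier in this mail.)

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper -- TemplateService#downloadTemplate does not exist today; sketch only.
class TemplateService {

    // Pulls both template artifacts from the passed device into aWorkDirectory and
    // returns the directory so the caller can hand the files to the hypervisor.
    public File downloadTemplate(final StorageDevice aTemplateDevice, final long anAccountId,
            final long aTemplateId, final String aTemplateUuid, final File aWorkDirectory)
            throws IOException {
        copy(aTemplateDevice, templateUri(anAccountId, aTemplateId, "template.properties"),
                new File(aWorkDirectory, "template.properties"));
        copy(aTemplateDevice, templateUri(anAccountId, aTemplateId, aTemplateUuid + ".vhd"),
                new File(aWorkDirectory, aTemplateUuid + ".vhd"));
        return aWorkDirectory;
    }

    // Builds the logical URI, e.g. /templates/2/200/template.properties
    private URI templateUri(final long anAccountId, final long aTemplateId, final String aFileName) {
        try {
            return new URI("/templates/" + anAccountId + "/" + aTemplateId + "/" + aFileName);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Streams one logical URI from the device into a local file.
    private void copy(final StorageDevice aDevice, final URI aURI, final File aTarget) throws IOException {
        final OutputStream out = new FileOutputStream(aTarget);
        try {
            aDevice.read(aURI, out);
        } finally {
            out.close();
        }
    }
}

With a helper like this, createVolumeFromTemplate collapses to a downloadTemplate call plus the hypervisor registration and the final aVolumeDevice.write.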
Coming back to the snapshot question: the following is a snippet that would be part of the SnapshotService to take a snapshot:

// Ask the hypervisor to take a snapshot, which yields anInputStream (e.g. FileInputStream)
aSnapshotDevice.write(new URI("/snapshots/<account_id>/<snapshot_id>"), anInputStream);

Ultimately, a snapshot can be exported to a single file or stream which can be written back out to a StorageDevice. For deleting a snapshot, the following snippet would perform the deletion in the SnapshotService:

// Ask the hypervisor to delete the snapshot ...
aSnapshotDevice.delete(new URI("/snapshots/<account_id>/<snapshot_id>"));

Finally, for deleting a volume, the following snippet would delete a volume from the VolumeService:

// Ask the hypervisor to delete the volume
aVolumeDevice.delete(new URI("/volumes/<account_id>/<volume_id>"));

In summary, I believe that the opaque operations specified in the StorageDevice interface can accomplish these goals if the following approaches are employed:

* Logical, reversible URIs are constructed by the storage services. These URIs are translated by the StorageDevice implementation to the semantics of the underlying device.
* The storage service methods break their logic down into a series of operations against one or more StorageDevices. These operations should conform to common Java idioms because StorageDevice is built on the standard Java I/O model (i.e. InputStream, OutputStream, URI).

Thanks,
-John

>
>>
>> Based on this line of thought, I propose the addition of the following notions to
>> the storage framework:
>>
>> * StorageType (Enumeration)
>>   * BLOCK (raw block devices such as iSCSI, NBD, etc)
>>   * FILE_SYSTEM (devices addressable through the filesystem such as local disks, NFS, etc)
>>   * OBJECT (object stores such as S3 and Swift)
>> * StorageDevice (interface)
>>   * open(URI aDestinationURI): OutputStream throws IOException
>>   * write(URI aDestinationURI, OutputStream anOutputStream) throws IOException
>>   * list(URI aDestinationURI) : Set<URI> throws IOException
>>   * delete(URI aDestinationURI) : boolean throws IOException
>>   * getType() : StorageType
>> * UnsupportedStorageDevice (unchecked exception): Thrown when an
>>   unsuitable device type is provided to a storage service.
>>
>> All operations on the higher level storage services (e.g. ImageService) would
>> accept a StorageDevice parameter on their operations. Using the type
>> property, services can determine whether or not the passed device is
>> suitable (e.g. guarding against the use of an object store such as S3 as a VM disk) --
>> throwing an UnsupportedStorageDevice exception when a device is unsuitable
>> for the requested operation. The services would then perform all storage
>> operations through the passed StorageDevice.
>>
>> One potential gap is security. I do not know whether or not authorization
>> decisions are assumed to occur up the stack from the storage engine or as
>> part of it.
>>
>> Thanks,
>> -John
>>
>> P.S. I apologize for taking so long to push my feedback. I am just getting back
>> on station from the birth of our second child.
>
>
> Congratulations! Thanks for your great feedback.
>
>>
>> On Dec 28, 2012, at 8:09 PM, Edison Su <edison...@citrix.com> wrote:
>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Marcus Sorensen [mailto:shadow...@gmail.com]
>>>> Sent: Friday, December 28, 2012 2:56 PM
>>>> To: cloudstack-dev@incubator.apache.org
>>>> Subject: Re: new storage framework update
>>>>
>>>> Thanks. I'm trying to picture how this will change the existing code.
>>>> I think it is something I will need a real example to understand.
>>>> Currently we pass a

>>> Yah, the example code is in these files:
>>> XenNfsConfigurator
>>> DefaultPrimaryDataStoreDriverImpl
>>> DefaultPrimaryDatastoreProviderImpl
>>> VolumeServiceImpl
>>> DefaultPrimaryDataStore
>>> XenServerStorageResource
>>>
>>> You can start from volumeServiceTest -> the createVolumeFromTemplate test case.
>>>
>>>> storageFilerTO and/or volumeTO from the server to the agent, and the agent
>>> These models are not changed; what changed are the commands sent to the resource.
>>> Right now, each storage protocol can send its own command to the resource.
>>> All the storage related commands are put under the
>>> org.apache.cloudstack.storage.command package. Take
>>> CopyTemplateToPrimaryStorageCmd as an example: it has a field called
>>> ImageOnPrimayDataStoreTO, which contains a PrimaryDataStoreTO. PrimaryDataStoreTO
>>> contains the basic information about a primary storage. If one needs to send extra
>>> information to the resource, one can subclass PrimaryDataStoreTO, e.g.
>>> NfsPrimaryDataStoreTO, which contains the nfs server ip and nfs path. In this way,
>>> one can write a CLVMPrimaryDataStoreTO, which contains clvm's own special
>>> information if needed. Having each protocol use its own TO can simplify the code
>>> and make it easier to add new storage.
>>>
>>>> does all of the work. Do we still need things like
>>>> LibvirtStorageAdaptor to do the work on the agent side of actually
>>>> managing the volumes/pools and implementing them, connecting them to
>>>> vms? So in implementing new storage we will need to write both a
>>>> configurator and potentially a storage adaptor?
>>>
>>> Yes, those are the minimal requirements.
>>>
>>>> On Dec 27, 2012 6:41 PM, "Edison Su" <edison...@citrix.com> wrote:
>>>>
>>>>> Hi All,
>>>>> Before heading into the holiday, I'd like to update the current
>>>>> status of the new storage framework since last collab12.
>>>>> 1. Class diagram of primary storage is evolved:
>>>>> https://cwiki.apache.org/confluence/download/attachments/30741569/storage.jpg?version=1&modificationDate=1356640617613
>>>>> Highlights of the current design:
>>>>> a. One storage provider can cover multiple storage
>>>>> protocols for multiple hypervisors. The default storage provider can
>>>>> almost cover all the current primary storage protocols. In most
>>>>> cases, you don't need to write a new storage provider; what you need
>>>>> to do is write a new storage configurator. Writing a new storage
>>>>> provider requires a lot of code, which we should avoid as much as possible.
>>>>> b. A new type hierarchy, primaryDataStoreConfigurator, is added.
>>>>> The configurator is a factory for primaryDataStore, which assembles the
>>>>> StorageProtocolTransformer, PrimaryDataStoreLifeCycle and
>>>>> PrimaryDataStoreDriver for the PrimaryDataStore object, based on the
>>>>> hypervisor type and the storage protocol. For example, for nfs
>>>>> primary storage on xenserver, there is a class called
>>>>> XenNfsConfigurator, which puts DefaultXenPrimaryDataStoreLifeCycle,
>>>>> NfsProtocolTransformer and DefaultPrimaryDataStoreDriverImpl into
>>>>> DefaultPrimaryDataStore. One provider can only have one configurator
>>>>> for a pair of hypervisor type and storage protocol. For example, if
>>>>> you want to add a new nfs protocol configurator for the xenserver
>>>>> hypervisor, you need to write a new storage provider.
>>>>> c. A new interface, StorageProtocolTransformer, is added.
The >>>>> main purpose of this interface is to handle the difference between >>>>> different storage protocols. It has four methods: >>>>> getInputParamNames: return a list of name of parameters >>>>> for a particular protocol. E.g. NFS protocol has ["server", "path"], >>>>> ISCSI has ["iqn", "lun"] etc. UI shouldn't hardcode these parameters >>>>> any >>>> more. >>>>> normalizeUserInput: given a user input from UI/API, need >>>>> to validate the input, and break it apart, then store them into database >>>>> getDataStoreTO/ getVolumeTO: each protocol can have its >>>>> own volumeTO and primaryStorageTO. TO is the object will be passed >>>>> down to resource, if your storage has extra information you want to >>>>> pass to resource, these two methods are the place you can override. >>>>> d. All the time-consuming API calls related to storage is async. >>>>> >>>>> 2. Minimal functionalities are implemented: >>>>> a. Can register a http template, without SSVM >>>>> b. Can register a NFS primary storage for xenserver >>>>> c. Can download a template into primary storage directly >>>>> d. Can create a volume from a template >>>>> >>>>> 3. All about test: >>>>> a. TestNG test framework is used, as it can provide >>>>> parameter for each test case. For integration test, we need to know >>>>> ip address of hypervisor host, the host uuid(if it's xenserver), the >>>>> primary storage url, the template url etc. These configurations are >>>>> better to be parameterized, so for each test run, we don't need to >>>>> modify test case itself, instead, we provide a test configuration >>>>> file for each test run. TestNG framework already has this >>>>> functionality, I just >>>> reuse it. >>>>> b. Every pieces of code can be unit tested, which means: >>>>> b.1 the xcp plugin can be unit tested. I wrote a >>>>> small python code, called mockxcpplugin.py, which can directly call >>>>> xcp >>>> plugin. >>>>> b.2 direct agent hypervisor resource can be tested. >>>>> I wrote a mock agent manger, which can load and initialize >>>>> hypervisor resource, and also can send command to resource. >>>>> b.3 a storage integration test maven project is >>>>> created, which can test the whole storage subsystem, such as create >>>>> volume from template, which including both image and volume >>>> components. >>>>> A new section, called "how to test", is added into >>>>> >>>> >> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+subsys >>>> t >>>>> em+2.0, >>>>> please check it out. >>>>> >>>>> The code is on the javelin branch, the maven projects whose >>>>> name starting from cloud-engine-storage-* are the code related to >>>>> storage subsystem. Most of the primary storage code is in >>>>> cloud-engine-storage-volume project. >>>>> Any feedback/comment is appreciated. >>>>> >