Edison,

Please see my thoughts in-line below. I apologize in advance for the S3-centric nature of my examples -- S3 happens to be top of mind for obvious reasons ...
Thanks,
-John

On Jan 8, 2013, at 5:59 PM, Edison Su <edison...@citrix.com> wrote:

>
>
>> -----Original Message-----
>> From: John Burwell [mailto:jburw...@basho.com]
>> Sent: Tuesday, January 08, 2013 10:59 AM
>> To: cloudstack-dev@incubator.apache.org
>> Subject: Re: new storage framework update
>>
>> Edison,
>>
>> In reviewing the javelin branch, I feel that there is a missing abstraction. At the
>> lowest level, storage operations are the storage, retrieval, deletion, and
>> listing of byte arrays stored at a particular URI. In order to implement this
>> concept in the current Javelin branch, 3-5 strategy classes must be implemented
>> to perform the following low-level operations:
>>
>> * open(URI aDestinationURI): OutputStream throws IOException
>> * write(URI aDestinationURI, OutputStream anOutputStream) throws IOException
>> * list(URI aDestinationURI) : Set<URI> throws IOException
>> * delete(URI aDestinationURI) : boolean throws IOException
>>
>> The logic for each of these strategies will be identical, which will lead to the
>> creation of a support class + glue code (i.e. either individual adapter classes

I realize that I omitted a couple of definitions in my original email. First, the StorageDevice would most likely be implemented on a domain object that also contains configuration information for a resource. For example, the S3Impl class would also implement StorageDevice. On reflection (and a little pseudo coding), I would also like to refine my originally proposed StorageDevice interface:

* void read(URI aURI, OutputStream anOutputStream) throws IOException
* void write(URI aURI, InputStream anInputStream) throws IOException
* Set<URI> list(URI aURI) throws IOException
* boolean delete(URI aURI) throws IOException
* StorageDeviceType getType()

> If the lowest api is too opaque, like one URI as parameter, I am wondering if
> it may make the implementation more complicated than it sounds.
> For example, there are at least 3 APIs for the primary storage driver:
> createVolumeFromTemplate, createDataDisk, deleteVolume, and two snapshot
> related APIs: createSnapshot, deleteSnapshot.
> How to encode the above operations into simple write/delete APIs? If one URI
> contains too much information, then at the end of the day, the receiver side (the
> code in the hypervisor resource), which is responsible for decoding the URI,
> becomes complicated. That's the main reason I decided to use more specific
> APIs instead of one opaque URI.
> That's true: if the API is too specific, people need to implement a ton of
> APIs (mainly imagedatastoredriver, primarydatastoredriver,
> backupdatastoredriver), all over the place.
> Which one is better? People can jump in to discuss.
>

The URI scheme should be logical, unique, and reversible, associated with the type of resource being stored. For example, the general form of template URIs would be "/template/<account_id>/<template_id>/template.properties" and "/template/<account_id>/<template_id>/<uuid>.vhd". Therefore, for account id 2 and template id 200, the template.properties resource would be assigned a URI of "/template/2/200/template.properties". The StorageDevice implementation translates the logical URI to a physical representation. Using S3 as an example, the StorageDevice is configured to use bucket jsb-cloudstack at endpoint s3.amazonaws.com. The S3 storage device would translate the URI to s3://jsb-cloudstack/template/2/200/template.properties.
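To make the refined interface and the URI translation concrete, here is a rough sketch of what I have in mind. Everything below is illustrative only -- S3StorageDevice, toKey(), and the constructor arguments are hypothetical names, and I have stubbed out the actual S3 transfer calls rather than guess at SDK details:

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.Set;

enum StorageDeviceType { BLOCK, FILE_SYSTEM, OBJECT }

interface StorageDevice {
    void read(URI aURI, OutputStream anOutputStream) throws IOException;
    void write(URI aURI, InputStream anInputStream) throws IOException;
    Set<URI> list(URI aURI) throws IOException;
    boolean delete(URI aURI) throws IOException;
    StorageDeviceType getType();
}

// Hypothetical S3-backed device: the only S3-specific work is translating the
// logical URI into a key relative to the configured bucket.
class S3StorageDevice implements StorageDevice {

    private final String bucket;    // e.g. "jsb-cloudstack"
    private final String endpoint;  // e.g. "s3.amazonaws.com", used when building the S3 client

    S3StorageDevice(final String aBucket, final String anEndpoint) {
        bucket = aBucket;
        endpoint = anEndpoint;
    }

    // "/template/2/200/template.properties" -> "template/2/200/template.properties"
    private String toKey(final URI aURI) {
        final String path = aURI.getPath();
        return path.startsWith("/") ? path.substring(1) : path;
    }

    public void read(final URI aURI, final OutputStream anOutputStream) throws IOException {
        // A real implementation would GET s3://<bucket>/<toKey(aURI)> and copy the
        // object contents into anOutputStream ...
        throw new UnsupportedOperationException("transfer elided in this sketch");
    }

    public void write(final URI aURI, final InputStream anInputStream) throws IOException {
        // ... and PUT anInputStream to s3://<bucket>/<toKey(aURI)> here.
        throw new UnsupportedOperationException("transfer elided in this sketch");
    }

    public Set<URI> list(final URI aURI) throws IOException {
        // List keys under the prefix toKey(aURI) and map them back to logical URIs.
        throw new UnsupportedOperationException("listing elided in this sketch");
    }

    public boolean delete(final URI aURI) throws IOException {
        // Delete the object at toKey(aURI).
        throw new UnsupportedOperationException("delete elided in this sketch");
    }

    public StorageDeviceType getType() {
        return StorageDeviceType.OBJECT;
    }
}

The services only ever construct and pass logical URIs; everything after toKey() is the device's private business, which is what would let us swap S3, NFS, etc. underneath without touching the service code.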
For an NFS storage device mounted at nfs://localhost/cloudstack, the StorageDevice would translate the logical URI to nfs://localhost/cloudstack/template/<account_id>/<template_id>/template.properties. In short, I believe that we can devise a simple scheme that allows the StorageDevice to treat the URI path as relative to its root.

To my mind, createVolumeFromTemplate is decomposable into a series of StorageDevice#read and StorageDevice#write operations which would be issued by the VolumeManager service, such as the following:

public void createVolumeFromTemplate(Template aTemplate, StorageDevice aTemplateDevice,
        Volume aVolume, StorageDevice aVolumeDevice) {
    try {
        if (aVolumeDevice.getType() != StorageDeviceType.BLOCK
                && aVolumeDevice.getType() != StorageDeviceType.FILE_SYSTEM) {
            throw new UnsupportedStorageDeviceException(…);
        }

        // Pull the template from the template device into a temporary directory
        final File aTemplateDirectory = new File(<template temp path>);

        // Non-DRY -- likely a candidate for a TemplateService#downloadTemplate method (rough sketch below)
        aTemplateDevice.read(new URI("/templates/<account_id>/<template_id>/template.properties"),
                new FileOutputStream(new File(aTemplateDirectory, "template.properties")));
        aTemplateDevice.read(new URI("/templates/<account_id>/<template_id>/<template_uuid>.vhd"),
                new FileOutputStream(new File(aTemplateDirectory, "<template_uuid>.vhd")));

        // Perform operations with the hypervisor as necessary to register storage, which yields
        // anInputStream (possibly a List<InputStream>)
        aVolumeDevice.write(new URI("/volume/<account_id>/<volume_id>"), anInputStream);
    } catch (IOException e) {
        // Log and handle the error ...
    } finally {
        // Close resources ...
    }
}

Dependent on the capabilities of the hypervisor's Java API, the temporary files may not be required, and an OutputStream could be copied directly to an InputStream.

>
>> or a class that implements a ton of interfaces). In addition to this added
>> complexity, this segmented approach prevents the implementation of
>> common, logical storage features such as ACL enforcement and asset
>
> This is a good question: how to share code across multiple components.
> For example, one storage can be used as both primary storage and backup
> storage. In the current code, a developer needs to implement both
> primarydataStoredriver and backupdatastoredriver; in order to share code
> between these two drivers if needed, I think the developer can write one driver
> which implements both interfaces.

In my opinion, having storage drivers classify their usage limits functionality and composability. Hence, my thought is that the StorageDevice should describe its capabilities -- allowing the various services (e.g. Image, Template, Volume, etc.) to determine whether or not the passed storage device can support the requested operation.

>
>> encryption. With a common representation of a StorageDevice that operates
>> on the standard Java I/O model, we can layer in cross-cutting storage
>> operations in a consistent manner.
>
> I agree it would be nice to have a standard device model, like the POSIX file system
> API in the Unix world. But I haven't figured out how to generalize all the
> operations on the storage, as I mentioned above.
> I can see that createVolumeFromTemplate can be generalized as a link api,
> but how about taking a snapshot? And who will handle the difference
> between delete volume and delete snapshot, if they are using the same delete
> API?
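(An aside before I get to the snapshot question: below is a rough sketch of the hypothetical TemplateService#downloadTemplate helper flagged in the Non-DRY comment above. The method, its parameters, and the file layout are assumptions for illustration, not existing code, and it builds on the StorageDevice sketch earlier in this mail.)

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper -- TemplateService#downloadTemplate does not exist today; sketch only.
class TemplateService {

    // Pulls both template artifacts from the passed device into aWorkDirectory and
    // returns the directory so the caller can hand the files to the hypervisor.
    public File downloadTemplate(final StorageDevice aTemplateDevice, final long anAccountId,
            final long aTemplateId, final String aTemplateUuid, final File aWorkDirectory)
            throws IOException {
        copy(aTemplateDevice, templateUri(anAccountId, aTemplateId, "template.properties"),
                new File(aWorkDirectory, "template.properties"));
        copy(aTemplateDevice, templateUri(anAccountId, aTemplateId, aTemplateUuid + ".vhd"),
                new File(aWorkDirectory, aTemplateUuid + ".vhd"));
        return aWorkDirectory;
    }

    // Builds the logical URI, e.g. /templates/2/200/template.properties
    private URI templateUri(final long anAccountId, final long aTemplateId, final String aFileName) {
        try {
            return new URI("/templates/" + anAccountId + "/" + aTemplateId + "/" + aFileName);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Streams one logical URI from the device into a local file.
    private void copy(final StorageDevice aDevice, final URI aURI, final File aTarget) throws IOException {
        final OutputStream out = new FileOutputStream(aTarget);
        try {
            aDevice.read(aURI, out);
        } finally {
            out.close();
        }
    }
}

With a helper like this, createVolumeFromTemplate collapses to a downloadTemplate call plus the hypervisor registration and the final aVolumeDevice.write.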
Coming back to the snapshot question: the following is a snippet that would be part of the SnapshotService to take a snapshot:

// Ask the hypervisor to take a snapshot, which yields anInputStream (e.g. FileInputStream)
aSnapshotDevice.write(new URI("/snapshots/<account_id>/<snapshot_id>"), anInputStream);

Ultimately, a snapshot can be exported to a single file or stream which can be written back out to a StorageDevice. For deleting a snapshot, the following snippet would perform the deletion in the SnapshotService:

// Ask the hypervisor to delete the snapshot ...
aSnapshotDevice.delete(new URI("/snapshots/<account_id>/<snapshot_id>"));

Finally, for deleting a volume, the following snippet would delete a volume from the VolumeService:

// Ask the hypervisor to delete the volume
aVolumeDevice.delete(new URI("/volumes/<account_id>/<volume_id>"));

In summary, I believe that the opaque operations specified in the StorageDevice interface can accomplish these goals if the following approaches are employed:

* Logical, reversible URIs are constructed by the storage services. These URIs are translated by the StorageDevice implementation to the semantics of the underlying device.
* The storage service methods break their logic down into a series of operations against one or more StorageDevices. These operations should conform to common Java idioms because StorageDevice is built on the standard Java I/O model (i.e. InputStream, OutputStream, URI).

Thanks,
-John

>
>>
>> Based on this line of thought, I propose the addition of the following notions to
>> the storage framework:
>>
>> * StorageType (Enumeration)
>>   * BLOCK (raw block devices such as iSCSI, NBD, etc)
>>   * FILE_SYSTEM (devices addressable through the filesystem such as local disks, NFS, etc)
>>   * OBJECT (object stores such as S3 and Swift)
>> * StorageDevice (interface)
>>   * open(URI aDestinationURI): OutputStream throws IOException
>>   * write(URI aDestinationURI, OutputStream anOutputStream) throws IOException
>>   * list(URI aDestinationURI) : Set<URI> throws IOException
>>   * delete(URI aDestinationURI) : boolean throws IOException
>>   * getType() : StorageType
>> * UnsupportedStorageDevice (unchecked exception): Thrown when an
>>   unsuitable device type is provided to a storage service.
>>
>> All operations on the higher level storage services (e.g. ImageService) would
>> accept a StorageDevice parameter on their operations. Using the type
>> property, services can determine whether or not the passed device is
>> suitable (e.g. guarding against the use of an object store such as S3 as a VM disk) --
>> throwing an UnsupportedStorageDevice exception when a device is unsuitable
>> for the requested operation. The services would then perform all storage
>> operations through the passed StorageDevice.
>>
>> One potential gap is security. I do not know whether or not authorization
>> decisions are assumed to occur up the stack from the storage engine or as
>> part of it.
>>
>> Thanks,
>> -John
>>
>> P.S. I apologize for taking so long to push my feedback. I am just getting back
>> on station from the birth of our second child.
>
>
> Congratulations! Thanks for your great feedback.
>
>>
>> On Dec 28, 2012, at 8:09 PM, Edison Su <edison...@citrix.com> wrote:
>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Marcus Sorensen [mailto:shadow...@gmail.com]
>>>> Sent: Friday, December 28, 2012 2:56 PM
>>>> To: cloudstack-dev@incubator.apache.org
>>>> Subject: Re: new storage framework update
>>>>
>>>> Thanks. I'm trying to picture how this will change the existing code.
>>>> I think it is something I will need a real example to understand.
>>>> Currently we pass a

>>> Yah, the example code is in these files:
>>> XenNfsConfigurator
>>> DefaultPrimaryDataStoreDriverImpl
>>> DefaultPrimaryDatastoreProviderImpl
>>> VolumeServiceImpl
>>> DefaultPrimaryDataStore
>>> XenServerStorageResource
>>>
>>> You can start from volumeServiceTest -> the createVolumeFromTemplate test case.
>>>
>>>> storageFilerTO and/or volumeTO from the server to the agent, and the agent
>>> These models are not changed; what changed are the commands sent to the resource.
>>> Right now, each storage protocol can send its own command to the resource.
>>> All the storage related commands are put under the
>>> org.apache.cloudstack.storage.command package. Take
>>> CopyTemplateToPrimaryStorageCmd as an example: it has a field called
>>> ImageOnPrimayDataStoreTO, which contains a PrimaryDataStoreTO. PrimaryDataStoreTO
>>> contains the basic information about a primary storage. If one needs to send extra
>>> information to the resource, one can subclass PrimaryDataStoreTO, e.g.
>>> NfsPrimaryDataStoreTO, which contains the nfs server ip and nfs path. In this way,
>>> one can write a CLVMPrimaryDataStoreTO, which contains clvm's own special
>>> information if needed. Having each protocol use its own TO can simplify the code
>>> and make it easier to add new storage.
>>>
>>>> does all of the work. Do we still need things like
>>>> LibvirtStorageAdaptor to do the work on the agent side of actually
>>>> managing the volumes/pools and implementing them, connecting them to
>>>> vms? So in implementing new storage we will need to write both a
>>>> configurator and potentially a storage adaptor?
>>>
>>> Yes, those are the minimal requirements.
>>>
>>>> On Dec 27, 2012 6:41 PM, "Edison Su" <edison...@citrix.com> wrote:
>>>>
>>>>> Hi All,
>>>>> Before heading into the holiday, I'd like to update the current
>>>>> status of the new storage framework since last collab12.
>>>>> 1. Class diagram of primary storage is evolved:
>>>>> https://cwiki.apache.org/confluence/download/attachments/30741569/storage.jpg?version=1&modificationDate=1356640617613
>>>>> Highlights of the current design:
>>>>> a. One storage provider can cover multiple storage
>>>>> protocols for multiple hypervisors. The default storage provider can
>>>>> almost cover all the current primary storage protocols. In most
>>>>> cases, you don't need to write a new storage provider; what you need
>>>>> to do is write a new storage configurator. Writing a new storage
>>>>> provider requires a lot of code, which we should avoid as much as possible.
>>>>> b. A new type hierarchy, primaryDataStoreConfigurator, is added.
>>>>> The configurator is a factory for primaryDataStore, which assembles the
>>>>> StorageProtocolTransformer, PrimaryDataStoreLifeCycle and
>>>>> PrimaryDataStoreDriver for the PrimaryDataStore object, based on the
>>>>> hypervisor type and the storage protocol. For example, for nfs
>>>>> primary storage on xenserver, there is a class called
>>>>> XenNfsConfigurator, which puts DefaultXenPrimaryDataStoreLifeCycle,
>>>>> NfsProtocolTransformer and DefaultPrimaryDataStoreDriverImpl into
>>>>> DefaultPrimaryDataStore. One provider can only have one configurator
>>>>> for a pair of hypervisor type and storage protocol. For example, if
>>>>> you want to add a new nfs protocol configurator for the xenserver
>>>>> hypervisor, you need to write a new storage provider.
>>>>> c. A new interface, StorageProtocolTransformer, is added.
The >>>>> main purpose of this interface is to handle the difference between >>>>> different storage protocols. It has four methods: >>>>> getInputParamNames: return a list of name of parameters >>>>> for a particular protocol. E.g. NFS protocol has ["server", "path"], >>>>> ISCSI has ["iqn", "lun"] etc. UI shouldn't hardcode these parameters >>>>> any >>>> more. >>>>> normalizeUserInput: given a user input from UI/API, need >>>>> to validate the input, and break it apart, then store them into database >>>>> getDataStoreTO/ getVolumeTO: each protocol can have its >>>>> own volumeTO and primaryStorageTO. TO is the object will be passed >>>>> down to resource, if your storage has extra information you want to >>>>> pass to resource, these two methods are the place you can override. >>>>> d. All the time-consuming API calls related to storage is async. >>>>> >>>>> 2. Minimal functionalities are implemented: >>>>> a. Can register a http template, without SSVM >>>>> b. Can register a NFS primary storage for xenserver >>>>> c. Can download a template into primary storage directly >>>>> d. Can create a volume from a template >>>>> >>>>> 3. All about test: >>>>> a. TestNG test framework is used, as it can provide >>>>> parameter for each test case. For integration test, we need to know >>>>> ip address of hypervisor host, the host uuid(if it's xenserver), the >>>>> primary storage url, the template url etc. These configurations are >>>>> better to be parameterized, so for each test run, we don't need to >>>>> modify test case itself, instead, we provide a test configuration >>>>> file for each test run. TestNG framework already has this >>>>> functionality, I just >>>> reuse it. >>>>> b. Every pieces of code can be unit tested, which means: >>>>> b.1 the xcp plugin can be unit tested. I wrote a >>>>> small python code, called mockxcpplugin.py, which can directly call >>>>> xcp >>>> plugin. >>>>> b.2 direct agent hypervisor resource can be tested. >>>>> I wrote a mock agent manger, which can load and initialize >>>>> hypervisor resource, and also can send command to resource. >>>>> b.3 a storage integration test maven project is >>>>> created, which can test the whole storage subsystem, such as create >>>>> volume from template, which including both image and volume >>>> components. >>>>> A new section, called "how to test", is added into >>>>> >>>> >> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+subsys >>>> t >>>>> em+2.0, >>>>> please check it out. >>>>> >>>>> The code is on the javelin branch, the maven projects whose >>>>> name starting from cloud-engine-storage-* are the code related to >>>>> storage subsystem. Most of the primary storage code is in >>>>> cloud-engine-storage-volume project. >>>>> Any feedback/comment is appreciated. >>>>> >