[Sorry for cross-posting, but both users and developers may be interested]
The s3-hdfs branch in git.cloud.com [1] replaces the filesystem backend with an HDFS backend. Since we're still discussing where development happens, I haven't moved it anywhere official. For those looking to integrate other storage backends, it might provide a useful guide.

The steps to get this working are the same, except that the storage.root in step 4 below is as follows:

storage.root=hdfs://<hdfs namenode>:9000/s3

On the HDFS side, I used Hadoop 0.23.1 (although I think it should work with Hadoop 1.0). I found that I had to set the following configuration in hdfs-site.xml:

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>

[1] http://git.cloud.com/cgit/cloudstack-oss/log/?h=s3-hdfs

On 5/30/12 2:59 PM, "Chiradeep Vittal" <chiradeep.vit...@citrix.com> wrote:

>The S3 API formerly in the "Cloud Bridge" source has been moved to
>CloudStack. Several bugs have been fixed and more features are now
>supported. The complete list of supported APIs can be found here:
>S3 API Page:
>http://wiki.cloudstack.org/display/RelOps/S3+API+in+CloudStack
>
>There has not been any formal QA applied to this feature, but I hope we
>can move it to a more polished state. For that, I hope the community can
>contribute some testing and bug-fixing resources. Instructions on using
>this are at the end of this post.
>
>The code has been tested using boto (https://github.com/boto/boto), but I
>suspect that other clients (AWS SDK, s3cmd, etc.) are also popular. The
>error responses for S3 are not well documented, so some of these clients
>might barf unexpectedly.
>
>*****Usage**********
>1. Get CloudStack running on the latest 3.0.x series.
>
>2. Enable the S3 API by setting the flag enable.s3.api to 'true' in the
>configuration table. You can do this through the UI or directly in MySQL:
>update configuration set value='true' where name='enable.s3.api';
>
>3. Choose a local filesystem path where the objects will be stored. You
>can mount an NFS store or use the local filesystem. E.g.:
>mkdir -p /mnt/s3
>Ensure that the 'cloud' user can write to this directory.
>
>4. Edit the file $TOMCAT_HOME/conf/cloud-bridge.properties:
>
>host=http://localhost:8080/awsapi
>storage.multipartDir=__multipart__uploads__
>bucket.dns=false
>storage.root=<mount point or filesystem path>
>serviceEndpoint=localhost:8080
>
>5. Restart CloudStack.
>
>6. Obtain API and secret keys for a user (available in the Admin UI under
>Accounts -> Users).
>CloudStack API key = this is the same as the AWS access key id
>CloudStack Secret key = this is the same as the AWS secret access key
>
>Generate a private key and a self-signed X.509 certificate. Substitute
>your own desired storage location for /path/to/... below.
>$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
>  -keyout /path/to/private_key.pem -out /path/to/cert.pem
>
>Register the mapping from the X.509 certificate to your account's API
>keys with CloudStack:
>$ cloudstack-aws-api-register --apikey=<User's CloudStack API key> \
>  --secretkey=<User's CloudStack Secret key> --cert=</path/to/cert.pem> \
>  --url=http://<cloud-stack-server>:8080/awsapi/rest/AmazonS3
>
>The cloudstack-aws-api-register command is available in /usr/bin on the
>machine where CloudStack is installed.
>
>7. Configure the boto S3Connection object as follows:
>
>calling_format = OrdinaryCallingFormat()
>connection = S3Connection(aws_access_key_id=<your api key>,
>                          aws_secret_access_key=<your secret key>,
>                          is_secure=False,
>                          host='<cloudstack-server>',
>                          port=8080,
>                          calling_format=calling_format,
>                          path="/awsapi/rest/AmazonS3")
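For anyone who wants a quick end-to-end check of step 7, something along
these lines should work with boto 2.x. This is an untested sketch: the
server hostname and the API/secret keys are placeholders you need to fill
in, and the bucket and object names are arbitrary example values.

from boto.s3.connection import S3Connection, OrdinaryCallingFormat

# Connect to the CloudStack S3 endpoint (keys and host are placeholders).
connection = S3Connection(aws_access_key_id='<your api key>',
                          aws_secret_access_key='<your secret key>',
                          is_secure=False,
                          host='<cloudstack-server>',
                          port=8080,
                          calling_format=OrdinaryCallingFormat(),
                          path="/awsapi/rest/AmazonS3")

# Create a bucket and store a small object in it.
bucket = connection.create_bucket('testbucket')
key = bucket.new_key('hello.txt')
key.set_contents_from_string('hello from the CloudStack S3 API')

# Read the object back and list the bucket contents.
print key.get_contents_as_string()
for k in bucket.list():
    print k.name, k.size

# Clean up.
bucket.delete_key('hello.txt')
connection.delete_bucket('testbucket')

If you are on the s3-hdfs branch with the storage.root shown at the top of
this mail, the uploaded object should end up under the /s3 path in HDFS
(hadoop fs -ls is a quick way to eyeball it); on the stock branch it lands
under the filesystem path you chose in step 3.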