Mohan, I may be misunderstanding your intent, but it sounds like your task is to download a large group of files from S3 for processing, and you are considering Riak CS for that processing work. If so, I am not sure Riak CS is the right fit. Riak itself has a MapReduce system built in, but Riak CS does not expose it. Currently, Riak CS is strictly for storage, not data processing. If processing is what you need, you might be better off looking at tools built for it, like Spark or Hadoop.
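On the checkpointing part of the question: resuming a partial download does not need Riak CS, since S3 honours the HTTP Range header on object GETs. Below is a minimal sketch of the idea, not a definitive implementation. It assumes the partial file on disk serves as the checkpoint, and `fetch` is a hypothetical callable (not part of any library) that yields the object's bytes starting at a given offset, for example by sending the header built by `range_header`. The names `resume_download` and `download_with_retries` are likewise made up for illustration.

```python
import os
from typing import Callable, Dict, Iterable


def range_header(offset: int) -> Dict[str, str]:
    """Build an HTTP Range header asking for bytes from `offset` to the end."""
    return {"Range": f"bytes={offset}-"}


def resume_download(path: str, fetch: Callable[[int], Iterable[bytes]]) -> int:
    """Append the missing bytes to `path` and return the resulting file size.

    The size of the partial file is the checkpoint: whatever is already on
    disk is not requested again. `fetch(offset)` is a hypothetical callable
    that yields chunks of the remote object starting at `offset`.
    """
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    with open(path, "ab") as out:
        for chunk in fetch(offset):
            out.write(chunk)
            out.flush()  # keep the on-disk checkpoint close to what was received
    return os.path.getsize(path)


def download_with_retries(
    path: str, fetch: Callable[[int], Iterable[bytes]], attempts: int = 3
) -> int:
    """Retry after a dropped connection, resuming from the bytes already saved."""
    for _ in range(attempts):
        try:
            return resume_download(path, fetch)
        except ConnectionError:
            continue  # the partial file stays in place; the next attempt resumes
    raise ConnectionError(f"download of {path} failed after {attempts} attempts")
```

A real `fetch` would also need to confirm that the server answered a ranged request with 206 Partial Content before appending, and to handle the case where the file is already complete. Both are left out of this sketch.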
Kelly

On July 11, 2014 at 10:13:35 PM, Mohan Radhakrishnan (radhakrishnan.mo...@gmail.com) wrote:

I thought the general idea floating in my ignorant mind can help. We have other storage systems apart from S3 like FTPS, HTTPS etc. I thought the task of crawling remote storage systems and processing files naturally lent itself to distributed MR jobs and a DFS. That is when I came across Riak CS.

Thanks,
Mohan

On Sat, Jul 12, 2014 at 9:32 AM, Mohan Radhakrishnan <radhakrishnan.mo...@gmail.com> wrote:

Hi,

I came across the storage system discussion thread. We have a requirement to download thousands of files from S3 for processing. Ours is not a cloud storage system but a cloud access system. Are there qualities of Riak CS that can help us? We want to download part of a huge file and checkpoint if the connection breaks. File downloads should be fault-tolerant when nodes go down.

Please bear with me if my question is basic. I don't work with distributed cloud storage systems.

Thanks,
Mohan

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com