This may be totally missing the mark, but I've been reading up on ways to do fast iterative processing in Storm or Spark/Shark, with the ultimate goal of having the results end up in Riak for fast multi-key retrieval.
I want this setup to be as lean as possible, for obvious reasons, so I've started to look more closely at a possible Riak CS / Spark combo.

Apparently (please correct me if I'm wrong) Riak CS sits on top of Riak and is S3-API compliant. The underlying store for the objects is LevelDB, which would have been my choice anyway because of its low in-memory key overhead. Apparently Bitcask is also used, although it's not clear to me what exactly for.

At the same time, Spark (with Shark on top, which is to Spark what Hive is to Hadoop, if that makes things any clearer) can use HDFS or S3 as its so-called 'deep store'.

Putting this together, it seems Riak CS and Spark/Shark could make a nice, tight combo: iterative and ad hoc querying through Shark, plus all the excellent stuff in Riak, connected through the S3 protocol that both of them speak.

Is this correct? Would I lose any of the raw power of Riak by going with Riak CS? Has anyone ever tried this combo?

Thanks,
Geert-Jan
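To make the question concrete, here is a minimal sketch of the wiring I have in mind, not something I have tested. It assumes a Hadoop build that ships the S3A connector (the `fs.s3a.*` properties) and that Riak CS is S3-compatible enough for that connector, which is exactly the open question. The hostname, port, and credentials are placeholders, and the helper names `riak_cs_spark_conf` and `as_submit_args` are made up for illustration.

```python
# Hypothetical sketch: build Spark properties that would point the Hadoop
# S3A connector at a Riak CS endpoint instead of Amazon S3.
# Assumption: Riak CS accepts the requests S3A sends; this is untested.

def riak_cs_spark_conf(endpoint, access_key, secret_key, use_ssl=False):
    """Return Spark properties that route s3a:// paths to an S3-compatible store."""
    hadoop = {
        "fs.s3a.endpoint": endpoint,          # the Riak CS address, not AWS
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
        # Path-style URLs (host/bucket/key) avoid needing per-bucket DNS names.
        "fs.s3a.path.style.access": "true",
        "fs.s3a.connection.ssl.enabled": str(use_ssl).lower(),
    }
    # Spark forwards any "spark.hadoop.*" property to the Hadoop configuration.
    return {"spark.hadoop." + key: value for key, value in hadoop.items()}


def as_submit_args(conf):
    """Flatten the properties into spark-submit style --conf arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Placeholder endpoint and credentials, for illustration only.
    conf = riak_cs_spark_conf("riak-cs.example.internal:8080", "MY_KEY", "MY_SECRET")
    print(" ".join(as_submit_args(conf)))
```

With settings like these, a job would address data as `s3a://some-bucket/path` and the requests would go to Riak CS rather than to Amazon, assuming the compatibility holds.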