On Mon, May 17, 2010 at 3:46 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> Moving to the user@ list.
>
> http://wiki.apache.org/cassandra/HadoopSupport should be useful.

That document doesn't really answer the "is data locality preserved" question for the map phase, but my hunch is "no".

>
> On Mon, May 17, 2010 at 2:41 PM, Yan Virin <jan.vi...@gmail.com> wrote:
>> Hi,
>> Can someone explain how this works? As long as I know, there is no execution
>> engine in Cassandra alone, so I assume that Hadoop gives the MapReduce
>> execution engine which uses Cassandra as the distributed storage? Is data
>> locality preserved? How mature this "couple" is? How is the performance of
>> this compared to the original Hadoop over HDFS?

The built-in execution engine is one thing that excites me about the Riak data store -- the work is done local to where the data lives. That, and you can specify your jobs in JavaScript, which makes it that much easier for web-oriented people :-)

The big drawback for Riak is that building it for FreeBSD is pretty much impossible.