On Tue, May 25, 2010 at 6:35 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:
> What is the use case?

We end up with messed-up data in the database, so from time to time we run a MapReduce job to find irregular data.

> Why are you using Cassandra versus using data stored in HDFS or HBase?

As of now our MapReduce task is only used for "fixing" Cassandra, so the question is moot :)

> Are you using a separate Hadoop cluster to run the MR jobs on, or perhaps are
> you running the Job Tracker and Task Trackers on Cassandra nodes?

Separate.

> Is there anything holding you back from using it (if you would like to use it
> but currently cannot)?

It would be nice if the output of the MapReduce job were a MutationOutputFormat to which we could write inserts/deletes. I recall there is already something on JIRA for this, although I'm not sure whether it was merged.