> > If anyone has "war stories" on the topic of Cassandra & Hadoop (or even just Hadoop in general) let me know.

Don't know if it counts as a war story, but I recently succeeded in implementing something I got advice on in an earlier thread: feeding both a Cassandra table and a Hadoop sequence file into the same map/reduce process, and updating the same Cassandra table with the results. I used the approach I mentioned before, creating an InputFormat that returns splits from both sources and a RecordReader that massages the Cassandra data into the same format as the sequence file data. I'll write something up about it for the wiki when I can find some time.

My chief concern with it, though, is gracefully handling a map/reduce failure. Since Cassandra isn't transactional, a failed job may leave the table partially updated, which is a problem, at least in the domain I'm working in. So now I'm trying to come up with a way to effect Cassandra transactions via column naming conventions or indexes or something like that. I'd be curious to hear if anyone here has implemented a solution for something similar before...

Thanks
Mark
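
A rough sketch of the combined-input idea described above, in case it helps picture it. This is plain Python with in-memory stand-ins, not the Hadoop Java API. Everything in it is hypothetical and for illustration only: the Split class, get_splits, read_split, and the "col=val;col=val" value encoding assumed for the sequence file. A real version would be an InputFormat and a RecordReader.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple


@dataclass
class Split:
    """One unit of input work, tagged with the source it came from (hypothetical)."""
    source: str  # "cassandra" or "seqfile"
    rows: list   # raw rows in that source's native shape


def chunk(rows: list, size: int) -> List[list]:
    """Cut a list of rows into consecutive pieces of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def get_splits(table: Dict[str, Dict[str, str]],
               seq_records: List[Tuple[str, str]],
               rows_per_split: int = 2) -> List[Split]:
    """Plays the InputFormat role: return splits from both sources in one list.

    `table` stands in for a Cassandra table as {row_key: {column: value}};
    `seq_records` stands in for a sequence file as (key, value) pairs.
    """
    cass = [Split("cassandra", c)
            for c in chunk(sorted(table.items()), rows_per_split)]
    seq = [Split("seqfile", c)
           for c in chunk(list(seq_records), rows_per_split)]
    return cass + seq


def read_split(split: Split) -> Iterator[Tuple[str, str]]:
    """Plays the RecordReader role: emit every row in the sequence-file shape.

    Cassandra rows are flattened into the assumed "col=val;col=val" encoding,
    so the mapper sees one record format regardless of the source.
    """
    for key, value in split.rows:
        if split.source == "cassandra":
            value = ";".join(f"{c}={v}" for c, v in sorted(value.items()))
        yield key, value


if __name__ == "__main__":
    table = {"a": {"x": "1", "y": "2"}, "b": {"x": "3"}, "c": {"x": "4"}}
    seq_records = [("a", "x=9"), ("d", "y=5")]
    for split in get_splits(table, seq_records):
        for record in read_split(split):
            print(split.source, record)
```

The mapper never needs to know which source a record came from. If it did, the source tag on the split could be carried along in the key or value.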