Hello,

Anyone interested in doing map/reduce on Cassandra data should take a look at the Cassandra Storage Handler for Hive. Storage handlers give Hive the ability to work with data outside HDFS in a more natural way. Support is now in place for reading from and writing to Standard Column Families (no super column support yet). While this lets users run a SQL-like language against their Cassandra data, it does NOT do things like push a WHERE clause down to Cassandra to get sub-second queries.
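To make the WHERE clause point concrete, here is a rough sketch in plain Python. It is not Hive or Cassandra code: the `users` dict is a hypothetical stand-in for a column family, and both functions are made up for illustration. It only shows why filtering after a full scan costs more than a lookup that was pushed down to the store.

```python
# Hypothetical in-memory stand-in for a column family: row key -> {column: value}.
# Nothing here calls Hive or Cassandra; it only illustrates the access pattern.
users = {
    "row1": {"name": "alice", "state": "NY"},
    "row2": {"name": "bob", "state": "CA"},
    "row3": {"name": "carol", "state": "NY"},
}


def scan_with_filter(column_family, predicate):
    """No pushdown: read every row, then apply the WHERE condition to each."""
    rows_read = 0
    matches = []
    for key, columns in column_family.items():
        rows_read += 1
        if predicate(key, columns):
            matches.append((key, columns))
    return matches, rows_read


def key_lookup(column_family, key):
    """What a pushed-down key predicate would look like: fetch one row only."""
    columns = column_family.get(key)
    matches = [(key, columns)] if columns is not None else []
    return matches, len(matches)


# Same logical query (WHERE key = 'row2'), two very different amounts of work.
scan_matches, scan_rows_read = scan_with_filter(users, lambda k, c: k == "row2")
lookup_matches, lookup_rows_read = key_lookup(users, "row2")
```

Both calls return the same row, but the scan touches every row in the column family while the lookup touches one. With a real data set that is the difference between a map/reduce job and a single read.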
JIRA issue: https://issues.apache.org/jira/browse/HIVE-1434

For those looking to try this out with minimal effort, I have a tar bundle with Cassandra, Hive, and Hadoop here:
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/test_drive_hive_cassandra_integration

::Warning:: The bundle is a pre-release build of Hive with Cassandra support. Treat it as such.

Enjoy,
Edward