Hi everyone,

A couple of months ago I started working on a new Hadoop InputFormat that we needed at my work. It is in a semi-working state now, so I thought I would post a link in case anyone is interested:
https://github.com/wibiclint/cassandra2-hadoop2

When I started working on this, I wanted an InputFormat that used the DataStax Java driver. I saw that the most recent Cassandra release uses the Java driver for its Hadoop integration, which somewhat obviates what I was doing, but this project has a couple of other features that folks might find useful:

* You can combine the results of multiple queries to multiple tables
* You can group rows based on the partition key, plus some optional subset of the clustering columns

Thanks to the folks on the list who helped answer my questions while I was writing this.

Best regards,
Clint