Hi all,

Is it possible to use the Cassandra ColumnFamilyInputFormat in combination with a Hadoop "streaming" job? The Hadoop docs say that you can specify other plugins, e.g.:

  -inputformat JavaClassName

http://hadoop.apache.org/common/docs/r0.15.2/streaming.html#Specifying+Other+Plugins+for+Jobs

However, they then say:

"The class you supply for the input format should return key/value pairs of Text class."

Whereas the Cassandra Wiki says:

"Cassandra rows or row fragments (that is, pairs of key + SortedMap of columns) are input to Map tasks for processing by your job"

http://wiki.apache.org/cassandra/HadoopSupport

So I'm wondering if this would work or if it's just never going to happen. I guess the alternative is to write a Hadoop Java class for the job, but this is what I'm trying to avoid.

Has anyone got any examples of getting M/R working with Cassandra as the input source?

Thanks
Dave