The easy answer is "use something like Pig or Hive that does these joins for you under the hood."
Not actually sure what the hard answer is. :)

On Fri, Jul 15, 2011 at 1:34 AM, Markus Mock <markus.m...@gmail.com> wrote:
> Hello,
> with org.apache.cassandra.hadoop.ConfigHelper.setInputColumnFamily I can set
> up the map phase to read from one column family. Is it possible to have
> multiple mapper classes, each mapping over its own column family, so that
> data from multiple column families can be "joined" in the reduce phase? I
> didn't find any documentation on how to do that.
> One workaround I see is to run several MRs that write the data from the
> different column families into a single helper column family and then do the
> desired computation, but I am trying to avoid that if possible. Any
> suggestions on how to do this without running multiple MRs, and instead read
> from multiple column families in one go?
> Thanks.
> -- Markus

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
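
For illustration only, the reduce-side "join" described in the question can be sketched in plain Python. This is not Cassandra's or Hadoop's API, and it does not show how to configure a job with more than one input column family. It only simulates the idea: each mapper tags its rows with the column family they came from, the shuffle groups by row key, and the reducer merges rows that share a key. The column family names (`users`, `orders`), the sample rows, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for two column families, keyed by row key.
users = {"u1": {"name": "Ann"}, "u2": {"name": "Bob"}}
orders = {"u1": {"total": 30}, "u3": {"total": 12}}


def map_tagged(source, rows):
    """Map step: emit (row_key, (source_tag, columns)) for every row."""
    for key, columns in rows.items():
        yield key, (source, columns)


def shuffle(pairs):
    """Group mapper output by key, standing in for the framework's shuffle."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_join(tagged_values):
    """Reduce step (inner join): merge columns only if both sources had the key."""
    by_source = {source: columns for source, columns in tagged_values}
    if "users" in by_source and "orders" in by_source:
        return {**by_source["users"], **by_source["orders"]}
    return None


def run_join():
    """Run both 'mappers', shuffle their combined output, and reduce each key."""
    pairs = list(map_tagged("users", users)) + list(map_tagged("orders", orders))
    joined = {}
    for key, values in shuffle(pairs).items():
        row = reduce_join(values)
        if row is not None:
            joined[key] = row
    return joined
```

The source tag is what lets a single reducer tell the two inputs apart. Pig and Hive generate an equivalent plan when asked to join two relations, which is why they are the easy answer.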