If you used that snippet of code as is, every connection would go through the 
same seed node. The input code does additional work to determine which nodes 
hold particular key ranges, and then connects to those nodes directly.


For outputting from Hadoop to Cassandra, you may want to consider using a Java 
client like Hector, which will handle the load balancing for you.
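
To illustrate what spreading connections across nodes means, here is a minimal 
sketch in plain Java. It is not Hector's API and not taken from the Cassandra 
code base: the class name RoundRobinHosts and the host names in the usage note 
are hypothetical, and it assumes you already know the addresses of several 
nodes in the cluster. It only rotates through a fixed list and does no failure 
detection or retry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not part of Cassandra or Hector.
final class RoundRobinHosts {
    private final List<String> hosts;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinHosts(List<String> hosts) {
        if (hosts == null || hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        // Copy the list so later changes by the caller do not affect rotation.
        this.hosts = new ArrayList<>(hosts);
    }

    // Returns the next host in rotation; safe to call from multiple threads.
    String next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), hosts.size());
        return hosts.get(index);
    }
}
```

A reducer could build this once from a list such as "node1", "node2", "node3" 
(placeholder names) and open its socket against hosts.next() instead of a 
single fixed seed. A client library like Hector would be expected to do this 
kind of selection for you.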



-----Original Message-----
From: "Sonny Heer" <sonnyh...@gmail.com>
Sent: Monday, April 19, 2010 11:29am
To: cassandra-u...@incubator.apache.org
Subject: Map/Reduce Cassandra Output

Unlike the wordcount example, my input source is a directory, and I
have a split class and record reader defined.

Also unlike wordcount, during reduce I need to insert into
Cassandra.  I notice that for the wordcount input it retrieves a handle on
a Cassandra client like this:

        TSocket socket = new
        TBinaryProtocol binaryProtocol = new TBinaryProtocol(socket, false, false);
        Cassandra.Client client = new Cassandra.Client(binaryProtocol);

Would all Hadoop nodes go to the same seed if I use this code to
insert data, without balancing it?  Has this been done somewhere in
the Cassandra code already?
