Yes: if you used that snippet of code, all connections would go through the same seed. The wordcount input code does additional work to determine which nodes hold particular key ranges, and then connects to those nodes directly.
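If you did want to spread the writes by hand, a minimal sketch might look like the following. It is only an illustration, not what the Cassandra input code or any client library does: the class name `RoundRobinHosts`, the host addresses, and the idea of seeding the rotation with the reduce task's partition number are all assumptions. Each reduce task runs in its own JVM, so the rotation starts from a per-task offset; otherwise every task would pick the first host first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: rotates through a fixed list of Cassandra hosts so that
// successive connections are not all opened against the same node.
final class RoundRobinHosts {
    private final List<String> hosts;
    private final AtomicInteger next;

    // startOffset is assumed to differ per reduce task (e.g. its partition number).
    RoundRobinHosts(List<String> hosts, int startOffset) {
        if (hosts == null || hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<String>(hosts);
        this.next = new AtomicInteger(startOffset);
    }

    // Returns the next host in rotation; floorMod keeps the index valid
    // even if the counter overflows to a negative value.
    String nextHost() {
        int index = Math.floorMod(next.getAndIncrement(), hosts.size());
        return hosts.get(index);
    }

    public static void main(String[] args) {
        // Placeholder addresses; a real job would read these from its configuration.
        List<String> hosts = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        RoundRobinHosts picker = new RoundRobinHosts(hosts, 1);
        for (int i = 0; i < 4; i++) {
            // The chosen host would replace the seed address passed to TSocket.
            System.out.println(picker.nextHost());
        }
    }
}
```

This only balances connections evenly; unlike the input code, it knows nothing about which node owns which key range.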

For outputting from Hadoop to Cassandra, you may want to consider using a Java client like Hector, which will handle the load balancing for you.

http://github.com/rantav/hector

Thanks,
Stu

-----Original Message-----
From: "Sonny Heer" <sonnyh...@gmail.com>
Sent: Monday, April 19, 2010 11:29am
To: cassandra-u...@incubator.apache.org
Subject: Map/Reduce Cassandra Output

Unlike the wordcount example, my input source is a directory, and I have a split class and record reader defined. Also unlike wordcount, I need to insert into Cassandra during reduce.

I notice that for the wordcount input it retrieves a handle on a Cassandra client like this:

    TSocket socket = new TSocket(DatabaseDescriptor.getSeeds().iterator().next().getHostAddress(), DatabaseDescriptor.getThriftPort());
    TBinaryProtocol binaryProtocol = new TBinaryProtocol(socket, false, false);
    Cassandra.Client client = new Cassandra.Client(binaryProtocol);

Would all Hadoop nodes go to the same seed if I use this code to insert data, without balancing it? Has this been done somewhere in the Cassandra code already?