Thanks, Stu. I will take a look at Hector. Do you know where in the input code that additional work (determining which nodes hold which key ranges) is done?
On Mon, Apr 19, 2010 at 11:20 AM, Stu Hood <stu.h...@rackspace.com> wrote:
> If you used that snippet of code, all connections would go through the same
> seed: the input code does additional work to determine which nodes are
> holding particular key ranges, and then connects directly.
>
> ----
>
> For outputting from Hadoop to Cassandra, you may want to consider using a
> Java client like Hector, which will handle the load balancing for you.
>
> http://github.com/rantav/hector
>
> Thanks,
> Stu
>
> -----Original Message-----
> From: "Sonny Heer" <sonnyh...@gmail.com>
> Sent: Monday, April 19, 2010 11:29am
> To: cassandra-u...@incubator.apache.org
> Subject: Map/Reduce Cassandra Output
>
> Different from the wordcount, my input source is a directory, and I
> have a split class and record reader defined.
>
> Different from wordcount, during reduce I need to insert into
> Cassandra. I notice that for the wordcount input it retrieves a handle on
> a Cassandra client like this:
>
>     TSocket socket = new TSocket(
>         DatabaseDescriptor.getSeeds().iterator().next().getHostAddress(),
>         DatabaseDescriptor.getThriftPort());
>     TBinaryProtocol binaryProtocol = new TBinaryProtocol(socket, false, false);
>     Cassandra.Client client = new Cassandra.Client(binaryProtocol);
>
> Would all Hadoop nodes go to the same seed if I use this code to
> insert data, without balancing it? Has this been done somewhere in
> the Cassandra code already?
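
For anyone following along, here is a rough sketch of the difference between always taking the first seed and spreading connections across several nodes. It is not how Hector or the Cassandra input code does it; the class name RoundRobinHosts and the host strings are made up for illustration. The idea is only that each reducer asks for the next host in turn instead of calling getSeeds().iterator().next(), and then opens its TSocket to whatever host comes back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not part of Cassandra or Hector): hands out
 * host addresses in round-robin order so that successive connections
 * do not all land on the same node.
 */
public class RoundRobinHosts {
    private final List<String> hosts;
    private final AtomicInteger counter = new AtomicInteger(0);

    public RoundRobinHosts(List<String> hosts) {
        if (hosts == null || hosts.isEmpty()) {
            throw new IllegalArgumentException("need at least one host");
        }
        // Defensive copy so later changes to the caller's list do not affect us.
        this.hosts = new ArrayList<String>(hosts);
    }

    /** Returns the next host in rotation; safe to call from several threads. */
    public String pick() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), hosts.size());
        return hosts.get(index);
    }
}
```

Note that this only rotates within one JVM. Separate reducer processes would each start at the first host, so a random starting offset per task, or a client that does the balancing itself as Stu suggests, is probably the better choice in practice.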