Re: Map/Reduce Cassandra Output

Sonny Heer Mon, 19 Apr 2010 14:38:42 -0700

Thanks Stu.  I will take a look at Hector.  Do you know where the
input code does the additional work?




On Mon, Apr 19, 2010 at 11:20 AM, Stu Hood <stu.h...@rackspace.com> wrote:
> If you used that snippet of code, all connections would go through the same 
> seed: the input code does additional work to determine which nodes are 
> holding particular key ranges, and then connects directly.
>
> ----
>
> For outputting from Hadoop to Cassandra, you may want to consider using a 
> Java client like Hector, which will handle the load balancing for you.
>
> http://github.com/rantav/hector
>
> Thanks,
> Stu
>
> -----Original Message-----
> From: "Sonny Heer" <sonnyh...@gmail.com>
> Sent: Monday, April 19, 2010 11:29am
> To: cassandra-u...@incubator.apache.org
> Subject: Map/Reduce Cassandra Output
>
> Unlike the wordcount example, my input source is a directory, and I
> have a split class and record reader defined.
>
> Also unlike wordcount, I need to insert into Cassandra during the
> reduce.  I notice that the wordcount input code gets a handle on a
> Cassandra client like this:
>
>        TSocket socket = new TSocket(
>                DatabaseDescriptor.getSeeds().iterator().next().getHostAddress(),
>                DatabaseDescriptor.getThriftPort());
>        TBinaryProtocol binaryProtocol =
>                new TBinaryProtocol(socket, false, false);
>        Cassandra.Client client = new Cassandra.Client(binaryProtocol);
>
> Would all Hadoop nodes go to the same seed if I use this code to
> insert data, without balancing it?  Has this been done somewhere in
> the Cassandra code already?
>
>
>

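To illustrate the point in the thread about every connection landing on
the same seed, here is a minimal sketch of spreading reducer connections
over several nodes. It is not how Hector or the Cassandra input code does
it; the class RoundRobinHosts, its methods, and the host addresses are
hypothetical names made up for this example. Because each Hadoop task
typically runs in its own JVM, the sketch starts at a random offset so
that separate tasks do not all begin with the first host.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: hands out Cassandra host addresses in round-robin order. */
final class RoundRobinHosts {
    private final List<String> hosts;
    private final AtomicInteger next;

    /** startIndex is the position of the first host returned by nextHost(). */
    RoundRobinHosts(List<String> hosts, int startIndex) {
        if (hosts == null || hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<>(hosts); // defensive copy
        this.next = new AtomicInteger(startIndex);
    }

    /** Starts at a random offset so independent task JVMs begin on different hosts. */
    static RoundRobinHosts withRandomStart(List<String> hosts) {
        if (hosts == null || hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        return new RoundRobinHosts(hosts, ThreadLocalRandom.current().nextInt(hosts.size()));
    }

    /** Returns the next host; thread-safe, wraps around at the end of the list. */
    String nextHost() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), hosts.size());
        return hosts.get(i);
    }
}
```

A reducer could then pass the result of nextHost() to the TSocket
constructor in place of the first seed's address. This only spreads
connections evenly; it does not route a write to the node that owns the
key, which is the extra work the input code is described as doing.
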