Re: Timeout Errors while running Hadoop over Cassandra

2011-01-19 Thread Jairam Chandar
I was able to workaround this problem by modifying the ColumnFamilyRecordReader class from the org.apache.cassandra.hadoop package. Since the errors where TimeoutException, I added sleep and retry logic around rows = client.get_range_slices(keyspace, new ColumnParent(cfName), predicate,

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-14 Thread Jairam Chandar
The cassandra logs strangely show no errors at the time of failure. Changing the RPCTimeoutInMillis seemed to help. Though it slowed down the job considerably, it seems to be finishing by changing the timeout value to 1 min. Unfortunately, I cannot be sure if it will continue to work if the data in

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-13 Thread Jeremy Hanna
On Jan 12, 2011, at 12:40 PM, Jairam Chandar wrote: > Hi folks, > > We have a Cassandra 0.6.6 cluster running in production. We want to run > Hadoop (version 0.20.2) jobs over this cluster in order to generate reports. > I modified the word_count example in the contrib folder of the cassandra

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread mck
On Wed, 2011-01-12 at 23:04 +0100, mck wrote: > > Caused by: TimedOutException() > > What is the exception in the cassandra logs? Or tried increasing rpc_timeout_in_ms? ~mck -- "When there is no enemy within, the enemies outside can't hurt you." African proverb | www.semb.wever.org | www.sesa

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread mck
On Wed, 2011-01-12 at 18:40 +, Jairam Chandar wrote: > Caused by: TimedOutException() What is the exception in the cassandra logs? ~mck -- "Don't use Outlook. Outlook is really just a security hole with a small e-mail client attached to it." Brian Trosko | www.semb.wever.org | www.sesat.no

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread Aaron Morton
Whats happening in the cassandra server logs when you get these errors? Reading through the hadoop 0.6.6 code it looks like it creates a thrift client with an infinite timeout. So it may be an internode timeout, which is set in storage-conf.xml.AaronOn 13 Jan, 2011,at 07:40 AM, Jairam Chandar wrot

Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread Jairam Chandar
Hi folks, We have a Cassandra 0.6.6 cluster running in production. We want to run Hadoop (version 0.20.2) jobs over this cluster in order to generate reports. I modified the word_count example in the contrib folder of the cassandra distribution. While the program is running fine for small datasets