There is some information on the wiki (http://wiki.apache.org/cassandra/HadoopSupport) about a resource leak in versions before 0.6.2 that can result in a TimeoutException, but you're on 0.6.5 so you should be OK.
I had a quick look at the Hadoop code and could not see where to change the timeout (that would be the obvious thing to try). If you look in ConfigHelper.java, though, it says:

    /**
     * The number of rows to request with each get range slices request.
     * Too big and you can get timeouts when it takes Cassandra too
     * long to fetch all the data. Too small and the performance
     * will be eaten up by the overhead of each request.
     *
     * @param conf Job configuration you are about to run
     * @param batchsize Number of rows to request each time
     */
    public static void setRangeBatchSize(Configuration conf, int batchsize)
    {
        conf.setInt(RANGE_BATCH_SIZE_CONFIG, batchsize);
    }

The config item name is "cassandra.range.batch.size". Try reducing the batch size first and see if the timeouts go away, though it does not sound like you have a lot of data.

A 0.7 beta2 may be out this week, but it's still beta.

Hope that helps.

Aaron

On 25 Sep 2010, at 07:17, Saket Joshi wrote:

> Hi Experts,
>
> I need help with an exception integrating Cassandra and Hadoop. I am getting the
> following exception when running a Hadoop MapReduce job:
> http://pastebin.com/RktaqDnj
> I am using Cassandra 0.6.5 on a 3-node cluster. I don't get any exception when
> the data I am processing is very small (< 5 rows and 100 columns), but I get
> the error with modest data (> 5 rows, 500 columns). I went through some of the
> forums where people have experienced the same issue:
> http://www.listware.net/201005/cassandra-user/21897-timeout-while-running-simple-hadoop-job.html
> Is this a bug in the Cassandra-Hadoop classes, and is it fixed in 0.7 for
> sure? How stable is 0.7 beta? In the system.log I see a lot of "index has
> reached its threshold; switching in a fresh Memtable" messages.
>
> Has anyone faced a similar issue and solved it? Is migrating to 0.7 the only
> solution?
>
> Thanks,
> Saket
>
> Stack trace of the exception:
>
> java.lang.RuntimeException: TimedOutException()
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:186)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:236)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:104)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:98)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: TimedOutException()
>     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11094)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:628)
>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:602)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:164)
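P.S. In case it's useful, here's a minimal sketch of what setting that key amounts to. In a real job you would call ConfigHelper.setRangeBatchSize(conf, n) on your org.apache.hadoop.conf.Configuration before submitting; the snippet below uses java.util.Properties as a stand-in for Configuration so it compiles without the Hadoop jars, and the value 256 is an arbitrary starting point to tune from, not a recommendation.

```java
import java.util.Properties;

public class RangeBatchSizeSketch {
    // ConfigHelper.setRangeBatchSize(conf, n) ultimately just sets this
    // Hadoop configuration key:
    static final String RANGE_BATCH_SIZE_CONFIG = "cassandra.range.batch.size";

    // Stand-in for ConfigHelper.setRangeBatchSize, using Properties instead
    // of org.apache.hadoop.conf.Configuration so this runs without Hadoop.
    static void setRangeBatchSize(Properties conf, int batchSize) {
        conf.setProperty(RANGE_BATCH_SIZE_CONFIG, Integer.toString(batchSize));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Try a smaller batch than the default; tune up or down from here.
        setRangeBatchSize(conf, 256);
        System.out.println(conf.getProperty(RANGE_BATCH_SIZE_CONFIG)); // prints 256
    }
}
```

If the smaller batch makes the timeouts go away, you can raise it again gradually to win back throughput.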