I changed rpc_timeout_in_ms to 30000 and then to 40000, and changed cassandra.range.batch.size from 4096 to 1024, but about 40% of the tasks still fail with timeout exceptions. I am not sure whether this is caused by Cassandra's performance (an 8 GB heap for roughly 100 GB of data) or by the way the Cassandra-Hadoop integration is implemented. I rarely see any timeout exceptions when I read the data back with Hector.
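In case it is useful, this is roughly how I am applying the two knobs. It is only a sketch (the class and job names are placeholders), and rpc_timeout_in_ms is of course a server-side setting in cassandra.yaml rather than something set in the job:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ScanJobConf {
    public static Job buildJob() throws Exception {
        Configuration conf = new Configuration();

        // Ask the record reader for fewer rows per get_range_slices call, so each
        // Thrift request has a better chance of finishing within rpc_timeout_in_ms
        // (4096 was my previous value).
        conf.setInt("cassandra.range.batch.size", 1024);

        // rpc_timeout_in_ms itself lives in cassandra.yaml on each Cassandra node,
        // e.g. rpc_timeout_in_ms: 30000, and takes effect after a node restart.

        return new Job(conf, "cassandra-column-scan");
    }
}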
Thanks,

John

On Thu, Jul 28, 2011 at 12:45 PM, Jian Fang <jian.fang.subscr...@gmail.com> wrote:
>
> My current setting is 10000. I will try 30000.
>
> Thanks,
>
> John
>
> On Thu, Jul 28, 2011 at 12:39 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:
>
>> See http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting - I would
>> probably start with setting your rpc_timeout_in_ms to something like 30000.
>>
>> On Jul 28, 2011, at 11:09 AM, Jian Fang wrote:
>>
>> > Hi,
>> >
>> > I run Cassandra 0.8.2 and hadoop 0.20.2 on three nodes, each node includes
>> > a Cassandra instance and a hadoop data node.
>> > I created a simple hadoop job to scan a Cassandra column value in a column
>> > family and write it to a file system if it meets some conditions.
>> > I keep getting the following timeout exceptions. Is this caused by my
>> > settings in Cassandra? Or how could I change the timeout value on the
>> > Cassandra Hadoop API to get around this problem?
>> >
>> > 11/07/28 12:02:47 INFO mapred.JobClient: Task Id : attempt_201107281151_0001_m_000052_0, Status : FAILED
>> > java.lang.RuntimeException: TimedOutException()
>> >     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:265)
>> >     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:279)
>> >     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:177)
>> >     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>> >     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>> >     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:136)
>> >     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>> >     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>> >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>> >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>> >     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> > Caused by: TimedOutException()
>> >     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12590)
>> >     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:762)
>> >     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:734)
>> >     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:243)
>> >     ... 11 more
>> >
>> > Thanks in advance,
>> >
>> > John
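P.S. For completeness, the job is wired to the Cassandra input format roughly like the sketch below. It is modeled on the word_count example shipped with 0.8; the keyspace, column family, column name, output path handling, and filter condition are all placeholders, and the ConfigHelper method names should be double-checked against your Cassandra version.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.SortedMap;

import org.apache.cassandra.db.IColumn;
import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ColumnScan {
    // Placeholder mapper: emit the row when the column value matches some condition.
    public static class ScanMapper
            extends Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>, Text, Text> {
        @Override
        protected void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns, Context context)
                throws IOException, InterruptedException {
            IColumn column = columns.get(ByteBufferUtil.bytes("my_column"));
            if (column == null)
                return;
            String value = ByteBufferUtil.string(column.value());
            if (value.startsWith("x"))  // placeholder condition
                context.write(new Text(ByteBufferUtil.string(key)), new Text(value));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "cassandra-column-scan");
        job.setJarByClass(ColumnScan.class);
        job.setInputFormatClass(ColumnFamilyInputFormat.class);
        job.setMapperClass(ScanMapper.class);
        job.setNumReduceTasks(0);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[0]));

        Configuration conf = job.getConfiguration();
        ConfigHelper.setRpcPort(conf, "9160");
        ConfigHelper.setInitialAddress(conf, "localhost");
        ConfigHelper.setPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner");
        ConfigHelper.setInputColumnFamily(conf, "MyKeyspace", "MyColumnFamily");

        // Fetch only the one column the mapper looks at.
        SlicePredicate predicate = new SlicePredicate()
                .setColumn_names(Arrays.asList(ByteBufferUtil.bytes("my_column")));
        ConfigHelper.setInputSlicePredicate(conf, predicate);

        // Smaller range batches, as above.
        conf.setInt("cassandra.range.batch.size", 1024);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Restricting the SlicePredicate to the single column the mapper needs keeps each get_range_slices response small, which together with the smaller batch size should help the Thrift calls stay under the timeout.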