Just wondering: what consistency level are you using for the Hadoop reads? Also, do you have task trackers running on the Cassandra nodes so that reads will be local?
On Jul 28, 2011, at 2:46 PM, Jian Fang wrote:

> I changed rpc_timeout_in_ms to 30000 and then to 40000, and reduced
> cassandra.range.batch.size from 4096 to 1024, but 40% of the tasks still
> got timeout exceptions.
>
> Not sure whether this is caused by Cassandra read performance (an 8G heap
> for about 100G of data) or by the way the Cassandra-Hadoop integration is
> implemented. I rarely saw timeout exceptions when I used Hector to read
> the data.
>
> Thanks,
>
> John
>
> On Thu, Jul 28, 2011 at 12:45 PM, Jian Fang <jian.fang.subscr...@gmail.com> wrote:
>
> My current setting is 10000. I will try 30000.
>
> Thanks,
>
> John
>
> On Thu, Jul 28, 2011 at 12:39 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:
>
> See http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting - I would
> probably start with setting your rpc_timeout_in_ms to something like 30000.
>
> On Jul 28, 2011, at 11:09 AM, Jian Fang wrote:
>
> > Hi,
> >
> > I run Cassandra 0.8.2 and Hadoop 0.20.2 on three nodes; each node runs
> > a Cassandra instance and a Hadoop data node.
> > I created a simple Hadoop job that scans a Cassandra column value in a
> > column family and writes it to the file system if it meets some
> > conditions. I keep getting the following timeout exceptions. Is this
> > caused by my Cassandra settings? Or how could I change the timeout
> > value in the Cassandra Hadoop API to get around this problem?
> >
> > 11/07/28 12:02:47 INFO mapred.JobClient: Task Id :
> > attempt_201107281151_0001_m_000052_0, Status : FAILED
> > java.lang.RuntimeException: TimedOutException()
> >     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:265)
> >     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:279)
> >     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:177)
> >     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
> >     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
> >     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:136)
> >     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
> >     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
> >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
> >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > Caused by: TimedOutException()
> >     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12590)
> >     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:762)
> >     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:734)
> >     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:243)
> >     ... 11 more
> >
> > Thanks in advance,
> >
> > John
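For what it's worth, here is a minimal sketch of where the client-side knob from this thread lives. It only shows `cassandra.range.batch.size` (the property name used above); the server-side `rpc_timeout_in_ms` is set in cassandra.yaml on each node, not in the job. The `TimeoutTuning` class and the job setup around it are hypothetical scaffolding, not the exact code from this thread:

```java
import org.apache.hadoop.conf.Configuration;

public class TimeoutTuning {
    // Hypothetical helper: apply the Hadoop-side setting discussed in
    // the thread to a job's Configuration before submitting it.
    public static void applyCassandraInputSettings(Configuration conf) {
        // Rows fetched per get_range_slices call by ColumnFamilyRecordReader.
        // A smaller batch makes each Thrift call cheaper, so a single call is
        // less likely to exceed the server's rpc_timeout_in_ms. The thread
        // reduced this from the default 4096 to 1024.
        conf.set("cassandra.range.batch.size", "1024");
    }
}
```

Note that shrinking the batch size trades per-call latency for more round trips; if 40% of tasks still time out after this, the bottleneck is more likely server-side read performance (GC pauses on the 8G heap, non-local reads), which is why the colocation and consistency-level questions above matter.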