10s is just not a long timeout for Hadoop map tasks; you should increase it. You may also want to experiment with running fewer simultaneous map tasks per tracker.
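To make the second suggestion concrete, here is a rough back-of-the-envelope sketch of how the number of map slots per tracker translates into concurrent range scans on each Cassandra node. It is an estimate under stated assumptions, not a measurement: it assumes the 32 map tasks in the report run simultaneously (8 slots on each of the 4 trackers), that tasks spread evenly over the 4 Cassandra nodes, and that each task holds one blocking get_range_slices call at a time. The class and method names are made up for illustration. In Hadoop of this era the slot count is normally controlled by mapred.tasktracker.map.tasks.maximum, and the Cassandra 0.6 timeout by RpcTimeoutInMillis in storage-conf.xml; check both against your installed versions.

```java
// Rough estimate only. Assumptions (not from the original report):
// - all map slots are busy at the same time,
// - tasks are spread evenly across the Cassandra nodes,
// - each map task has one blocking get_range_slices call in flight.
class MapLoadEstimate {

    // Average number of concurrent range scans each Cassandra node serves.
    static double requestsPerNode(int taskTrackers, int mapSlotsPerTracker, int cassandraNodes) {
        return (double) (taskTrackers * mapSlotsPerTracker) / cassandraNodes;
    }

    public static void main(String[] args) {
        int taskTrackers = 4;   // 4-node Hadoop cluster, from the report
        int cassandraNodes = 4; // 4-node Cassandra cluster, from the report

        // Halve the slots per tracker each step: 8, 4, 2, 1.
        for (int slots = 8; slots >= 1; slots /= 2) {
            System.out.printf(
                "%d map slots/tracker -> %.1f concurrent range scans per Cassandra node%n",
                slots, requestsPerNode(taskTrackers, slots, cassandraNodes));
        }
    }
}
```

Under these assumptions, halving the slots per tracker halves the number of range scans competing on each node, which is why it is worth trying alongside a longer timeout.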
On Tue, Jun 15, 2010 at 11:00 AM, Drew Dahlke <drew.dah...@bronto.com> wrote:
> Hi, I'm running cassandra .6.2 on a dedicated 4 node cluster and I
> also have a dedicated 4 node hadoop cluster. I'm trying to run a
> simple map reduce job against a single column family and it only takes
> 32 map tasks before I get floods of thrift timeouts. That would make
> sense to me if the cassandra was stressing the hardware or the
> network, but it's not. Each box has 8 cores/16G ram. During the job
> CPU averages 150-250% (1/5 utilization on 8 cores), network IO hovers
> around 15% throughput, iostat < 15%.
>
> The hadoop machines are taking even less of a beating. The simpler I
> make the job, the faster it hits cassandra, the faster it throws
> timeouts & vice versa. I'm guessing there's a software/config related
> bottleneck I'm hitting well before tapping out the hardware. Any idea
> what that might be?
>
> java.lang.RuntimeException: TimedOutException()
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:174)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:224)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:101)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:95)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: TimedOutException()
>     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11015)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:623)
>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:597)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:151)
>     ... 11 more
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com