Hi all,

I am running Cassandra 0.6.8 across 5 machines with ~30 GB of data on each machine. I am trying to run a map-reduce query (both with my own Java code and with Pig), and the job fails after about 30 minutes (see the stack trace and details below). I followed this wiki page<http://wiki.apache.org/cassandra/HadoopSupport> and increased ulimit to 32K, which seemed to make things work better, but the job still fails.
Can anyone suggest anything that can help resolve this?

Thanks,
-Or

The Pig query is pretty simple:

grunt> rows = LOAD 'cassandra://Indexing/EdgeCache' USING CassandraStorage();
grunt> b = GROUP rows ALL;
grunt> c = FOREACH b GENERATE COUNT(rows.$0);
grunt> dump c;

The Java code is doing something very similar.

java.lang.RuntimeException: TimedOutException()
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:186)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:236)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:104)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:98)
    at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(Unknown Source)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:142)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11094)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:628)
    at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:602)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:164)
    ... 13 more

The Pig log:

Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias c

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias c
    at org.apache.pig.PigServer.openIterator(PigServer.java:521)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:544)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
    at org.apache.pig.Main.main(Main.java:357)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
    at org.apache.pig.PigServer.openIterator(PigServer.java:515)
    ... 6 more