err, not count() in your case, but same symptom: cassandra can't return the answer to your query within the configured RPCTimeout
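
if you can't make each call cheaper, the blunt fix is to give the server more time. untested sketch, assuming a 0.6-style storage-conf.xml; 30000 is just an example value, and it needs to go on every node:

    <!-- how long the coordinator waits for replicas to answer a request
         before failing it with a TimedOutException -->
    <RpcTimeoutInMillis>30000</RpcTimeoutInMillis>

the other lever is making the record reader ask for fewer rows per get_range_slices call; depending on your 0.6.x build that batch size may be configurable (check ConfigHelper in the hadoop package), and smaller batches are far more likely to finish inside the timeout.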
cheers, jesse

--
jesse mcconnell
jesse.mcconn...@gmail.com

On Mon, Apr 19, 2010 at 19:40, Jesse McConnell <jesse.mcconn...@gmail.com> wrote:
> most likely means that the count() operation is taking too long for
> the configured RPCTimeout
>
> counts get unreliable after a certain number of columns under a key in
> my experience
>
> jesse
>
> --
> jesse mcconnell
> jesse.mcconn...@gmail.com
>
> On Mon, Apr 19, 2010 at 19:12, Joost Ouwerkerk <jo...@openplaces.org> wrote:
>> I'm slowly getting somewhere with Cassandra... I have successfully imported
>> 1.5 million rows using MapReduce. This took about 8 minutes on an 8-node
>> cluster, which is comparable to the time it takes with HBase.
>>
>> Now I'm having trouble scanning this data. I've created a simple MapReduce
>> job that counts rows in my ColumnFamily. The job fails with most tasks
>> throwing the following exception. Anyone have any ideas what's going wrong?
>>
>> java.lang.RuntimeException: TimedOutException()
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:165)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:215)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:97)
>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:91)
>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> Caused by: TimedOutException()
>>     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11015)
>>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:623)
>>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:597)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:142)
>>     ... 11 more
>>
>> On Sun, Apr 18, 2010 at 6:01 PM, Stu Hood <stu.h...@rackspace.com> wrote:
>>>
>>> In 0.6.0 and trunk, it is located at
>>> src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
>>>
>>> You might be using a pre-release version of 0.6 if you are seeing a fat
>>> client based InputFormat.
>>>
>>> -----Original Message-----
>>> From: "Joost Ouwerkerk" <jo...@openplaces.org>
>>> Sent: Sunday, April 18, 2010 4:53pm
>>> To: user@cassandra.apache.org
>>> Subject: Re: Help with MapReduce
>>>
>>> Where is the ColumnFamilyInputFormat that uses Thrift? I don't actually
>>> have a preference about client, I just want to be consistent with
>>> ColumnFamilyInputFormat.
>>>
>>> On Sun, Apr 18, 2010 at 5:37 PM, Stu Hood <stu.h...@rackspace.com> wrote:
>>>
>>> > ColumnFamilyInputFormat no longer uses the fat client API, and instead
>>> > uses Thrift. There are still some significant problems with the fat
>>> > client, so it shouldn't be used without a good understanding of those
>>> > problems.
>>> >
>>> > If you still want to use it, check out contrib/bmt_example, but I'd
>>> > recommend that you use Thrift for now.
>>> >
>>> > -----Original Message-----
>>> > From: "Joost Ouwerkerk" <jo...@openplaces.org>
>>> > Sent: Sunday, April 18, 2010 2:59pm
>>> > To: user@cassandra.apache.org
>>> > Subject: Help with MapReduce
>>> >
>>> > I'm a Cassandra noob trying to validate Cassandra as a viable
>>> > alternative to HBase (which we've been using for over a year) for our
>>> > application. So far, I've had no success getting Cassandra working
>>> > with MapReduce.
>>> >
>>> > My first step is inserting data into Cassandra. I've created a
>>> > MapReduce job based on the fat client API. I'm using the fat client
>>> > (StorageProxy) because that's what ColumnFamilyInputFormat uses and I
>>> > want to use the same API for both read and write jobs.
>>> >
>>> > When I call StorageProxy.mutate(), nothing happens. The job completes
>>> > as if it had done something, but in fact nothing has changed in the
>>> > cluster. When I call StorageProxy.mutateBlocking(), I get an
>>> > IOException complaining that there is no connection to the cluster.
>>> > I've concluded with the debugger that StorageService is not connecting
>>> > to the cluster, even though I've specified the correct seed and
>>> > ListenAddress (I'm using the exact same storage-conf.xml as the nodes
>>> > in the cluster).
>>> >
>>> > I'm sure I'm missing something obvious in the configuration or my
>>> > setup, but since I'm new to Cassandra, I can't see what it is.
>>> >
>>> > Any help appreciated,
>>> > Joost
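
ps - for the scan side of what Joost describes, the job setup looks roughly like the below in 0.6, modeled on the contrib/word_count example that ships with the source. untested sketch: the keyspace ("Keyspace1"), column family ("Standard1"), and column name ("name") are placeholders for your own schema, and iirc the job also needs to find storage-conf.xml on its classpath so the input format can locate the cluster and compute splits:

    import java.util.Arrays;
    import java.util.SortedMap;

    import org.apache.cassandra.db.IColumn;
    import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
    import org.apache.cassandra.hadoop.ConfigHelper;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class RowCount {
        // in 0.6 the record reader hands the mapper the row key as a String
        // and the requested columns as a SortedMap<byte[], IColumn>
        public static class RowCountMapper
                extends Mapper<String, SortedMap<byte[], IColumn>, Text, LongWritable> {
            @Override
            public void map(String key, SortedMap<byte[], IColumn> columns, Context context) {
                // a counter is enough for a row count; no reduce phase needed
                context.getCounter("rowcount", "rows").increment(1L);
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "rowcount");
            job.setJarByClass(RowCount.class);
            job.setMapperClass(RowCountMapper.class);
            job.setNumReduceTasks(0);
            job.setOutputFormatClass(NullOutputFormat.class);

            job.setInputFormatClass(ColumnFamilyInputFormat.class);
            ConfigHelper.setColumnFamily(job.getConfiguration(), "Keyspace1", "Standard1");
            // ask for a single small column per row: the narrower the slice,
            // the cheaper each get_range_slices call, the fewer timeouts
            SlicePredicate predicate = new SlicePredicate()
                    .setColumn_names(Arrays.asList("name".getBytes()));
            ConfigHelper.setSlicePredicate(job.getConfiguration(), predicate);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }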
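
pps - on the write side, per stu's advice: go through the plain Thrift client rather than StorageProxy. a minimal single-insert sketch against the 0.6 Thrift interface; host, keyspace, column family, key, and value are placeholders, and if your cluster has ThriftFramedTransport enabled you'd wrap the socket in a TFramedTransport first:

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnPath;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class ThriftInsert {
        public static void main(String[] args) throws Exception {
            // plain socket to one node's thrift port (9160 by default)
            TTransport transport = new TSocket("cassandra-host", 9160);
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
            transport.open();

            // the column family is required; the column name is set separately
            ColumnPath path = new ColumnPath("Standard1");
            path.setColumn("name".getBytes());

            // 0.6 signature: keyspace, row key, column path, value, timestamp, consistency
            // (timestamps are conventionally microseconds since the epoch)
            client.insert("Keyspace1", "row-key", path, "value".getBytes(),
                          System.currentTimeMillis() * 1000, ConsistencyLevel.QUORUM);

            transport.close();
        }
    }

in a real map task you'd open the connection once in setup() and reuse it across calls instead of reconnecting per row.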