I have absolutely no idea what is causing the rejections; they appear to be totally random, on all 3 hosts of my cluster. I cleared all iptables states, and since the hosts all sit on the same switch, I don't think it has to do with the underlying network. Is there a connection limit on Cassandra nodes?

--
Christian Decker
Software Architect
http://blog.snyke.net
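P.S. A minimal sketch of how one could probe raw Thrift reachability from a task node, to rule the Hadoop/Pig layer in or out (the hostname argument and the default Thrift port 9160 are assumptions about the setup; adjust to taste). If this also fails intermittently, the problem sits below the job code:

    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransportException;

    // Minimal connectivity probe against a Cassandra node's Thrift port.
    // A TTransportException here reproduces the mappers' "Connection refused"
    // without any Hadoop/Pig involvement.
    public class ThriftProbe {
        public static void main(String[] args) {
            String host = args.length > 0 ? args[0] : "localhost"; // node under test
            int port = 9160; // default Thrift rpc port; change if reconfigured
            TSocket socket = new TSocket(host, port);
            try {
                socket.open(); // throws TTransportException on refusal/timeout
                System.out.println("connected to " + host + ":" + port);
            } catch (TTransportException e) {
                System.out.println("failed: " + e.getMessage());
            } finally {
                socket.close();
            }
        }
    }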
On Wed, Aug 18, 2010 at 3:28 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> why are you getting connection refused? do you have a firewall problem?
>
> On Wed, Aug 18, 2010 at 7:17 AM, Christian Decker
> <decker.christ...@gmail.com> wrote:
> > Hi all,
> > I'm trying to get Pig scripts to work on data in Cassandra, and right now I
> > want to simply run the example-script.pig on a different Keyspace/CF
> > containing ~6'000'000 entries. I got it running, but the job aborts
> > after quite some time, and when I look at the logs I see hundreds of these:
> >>
> >> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
> >>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:133)
> >>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:224)
> >>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:101)
> >>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
> >>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
> >>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:95)
> >>     at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(Unknown Source)
> >>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:142)
> >>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
> >>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
> >>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
> >>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> >>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> >>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >> Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
> >>     at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
> >>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:129)
> >>     ... 13 more
> >> Caused by: java.net.ConnectException: Connection refused
> >>     at java.net.PlainSocketImpl.socketConnect(Native Method)
> >>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:310)
> >>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:176)
> >>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:163)
> >>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:381)
> >>     at java.net.Socket.connect(Socket.java:537)
> >>     at java.net.Socket.connect(Socket.java:487)
> >>     at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
> >>     ... 14 more
> >
> > and
> >>
> >> java.lang.RuntimeException: TimedOutException()
> >>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:174)
> >>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:224)
> >>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:101)
> >>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
> >>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
> >>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:95)
> >>     at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(Unknown Source)
> >>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:142)
> >>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
> >>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
> >>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
> >>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> >>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> >>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >> Caused by: TimedOutException()
> >>     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11030)
> >>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:623)
> >>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:597)
> >>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:151)
> >>     ... 13 more
> >
> > I checked that the Cassandra cluster is running and all 3 of my nodes are up
> > and working. As far as I can see, the JobTracker retries when it gets those
> > errors but aborts once a large portion of tasks have failed. Any idea why the
> > cluster keeps dropping connections or timing out?
> > Regards,
> > Chris
> > --
> > Christian Decker
> > Software Architect
> > http://blog.snyke.net
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
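P.S. For the TimedOutException half of this: the record reader pulls rows with get_range_slices in fixed-size batches, so a batch that takes longer than the server's RpcTimeoutInMillis times out even when the node is healthy. A sketch of what I plan to try, assuming ColumnFamilyRecordReader honors a "cassandra.range.batch.size" Hadoop property with a default of 4096; that property name and default are my reading of the hadoop code in this release and should be verified against your version:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    // Untested sketch: fetch fewer rows per get_range_slices call so each
    // Thrift request finishes within the server's rpc timeout.
    // "cassandra.range.batch.size" is an assumption from reading the source;
    // confirm it exists in your Cassandra version before relying on it.
    Job job = new Job(new Configuration(), "pig-cassandra-test"); // hypothetical job setup
    job.getConfiguration().setInt("cassandra.range.batch.size", 256); // down from 4096

When launching through Pig rather than a hand-rolled job, the same property could presumably be set in the Hadoop configuration files instead.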