The cluster looks unbalanced (assuming the Random Partitioner). Did you manually assign tokens to the nodes? The section on Token Selection here has some tips: http://wiki.apache.org/cassandra/Operations
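For reference, the usual rule of thumb for the Random Partitioner is to space the initial tokens evenly over the 0 .. 2^127 token range, i.e. token(i) = i * 2^127 / N for node i of N. Here is a minimal sketch of that arithmetic. The class and method names are made up for illustration, and it only computes the numbers; it does not assign or move anything.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra: computes evenly spaced tokens.
public class BalancedTokens {
    // Assumption: RandomPartitioner tokens fall in the range 0 .. 2^127.
    public static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(127);

    /** Returns nodeCount tokens where token(i) = i * 2^127 / nodeCount. */
    public static List<BigInteger> balancedTokens(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        List<BigInteger> tokens = new ArrayList<>();
        BigInteger n = BigInteger.valueOf(nodeCount);
        for (int i = 0; i < nodeCount; i++) {
            // Multiply before dividing so the integer division loses as little as possible.
            tokens.add(RING_SIZE.multiply(BigInteger.valueOf(i)).divide(n));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print one candidate token per line for a five node ring.
        for (BigInteger token : balancedTokens(5)) {
            System.out.println(token);
        }
    }
}
```

Comparing these values with the tokens nodetool reports gives a rough idea of how far each node is from an even split.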
One of the nodes in the cluster is down. Is there anything in the log to explain why? You may have some other errors.
You may also want to check:
- your client has a list of all of the nodes, so it can move to another one if it was connected to the down node.
- what's the RF, and what consistency level are you writing at?
- how long is the hang?
- what's happening on the server while the client is hanging? e.g. is it idle, or is the CPU going crazy, is it swapping, what does iostat show?
- what timeout are you using with thrift?
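On the timeout question: the stack below is sitting in SocketInputStream.socketRead0, which is where a blocking read waits when no read timeout is set and the other side never answers. The sketch below shows the effect of a read timeout with plain java.net sockets rather than Thrift, so it runs without any dependencies. The class name and the silent local listener are stand-ins I made up. As far as I know, Thrift's Java TSocket accepts a timeout in milliseconds that ends up on the underlying socket; how Pelops exposes that is not something I have checked.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class: shows a read giving up instead of hanging.
public class ReadTimeoutDemo {

    /** Returns true if the read gave up with a SocketTimeoutException. */
    public static boolean readTimesOut(int timeoutMillis) {
        // Stand-in for an unresponsive node: the listener's backlog completes
        // the TCP connection, but nothing ever accepts it or writes a reply.
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            // A timeout of 0 (the default) means "block forever" on read.
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read();      // would hang here without the timeout
                return false;   // data or end of stream arrived, no timeout
            } catch (SocketTimeoutException e) {
                return true;    // the read gave up after timeoutMillis
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

With a timeout in place the client gets an exception it can react to, for example by retrying against another node from its list, instead of waiting indefinitely.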
Aaron
On 06 Oct, 2010, at 07:28 AM, Jason Horman <jhor...@gmail.com> wrote:
We are experiencing some random hangs while importing data into
Cassandra 0.6.5. The client stack dump is below. We are using Java
Pelops with Thrift r917130. The hang seems random, sometimes millions
of records in, sometimes just a few thousand. It sort of smells like
the JIRA
https://issues.apache.org/jira/browse/CASSANDRA-1175
Has anyone else experienced this? Any advice?
Here is a dump from nodetool
Address         Status  Load      Range                                     Ring
10.192.230.224  Down    43.41 GB  25274261893111669883290654807978388961    |<--|
10.248.135.223  Up      29.38 GB  34662916595519283353151730886201323030    |   ^
10.209.125.235  Up      19.83 GB  45387569059876439228162547977665761954    v   |
10.206.209.112  Up      23.59 GB  105389616365686887162471812716889564402   |   ^
10.209.22.3     Up      33.16 GB  148562884084359545011181864444489491335   |-->|
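To put a number on how uneven this ring is, each node's share of the token space can be estimated from the tokens above. This is only a sketch under a few assumptions: the Random Partitioner with a 0 .. 2^127 token range, each node owning the range from the previous token (exclusive) up to its own token, and replication ignored. The class name is hypothetical, and the shares will not map one to one onto the Load column because Load also includes replicas.

```java
import java.math.BigInteger;

// Hypothetical helper: estimates each node's share of the token ring.
public class RingOwnership {
    // Assumption: RandomPartitioner tokens fall in the range 0 .. 2^127.
    public static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(127);

    // Tokens copied from the nodetool output above, in ring order.
    public static final BigInteger[] TOKENS = {
        new BigInteger("25274261893111669883290654807978388961"),
        new BigInteger("34662916595519283353151730886201323030"),
        new BigInteger("45387569059876439228162547977665761954"),
        new BigInteger("105389616365686887162471812716889564402"),
        new BigInteger("148562884084359545011181864444489491335"),
    };

    /** Fraction of the ring each node owns; tokens must be sorted ascending. */
    public static double[] ownership(BigInteger[] sortedTokens) {
        int n = sortedTokens.length;
        if (n == 1) {
            return new double[] {1.0}; // a single node owns the whole ring
        }
        double[] shares = new double[n];
        for (int i = 0; i < n; i++) {
            BigInteger previous = sortedTokens[(i + n - 1) % n];
            // mod() wraps the first node's range around the end of the ring.
            BigInteger width = sortedTokens[i].subtract(previous).mod(RING_SIZE);
            shares[i] = width.doubleValue() / RING_SIZE.doubleValue();
        }
        return shares;
    }

    public static void main(String[] args) {
        double[] shares = ownership(TOKENS);
        for (int i = 0; i < shares.length; i++) {
            System.out.printf("token %s owns %.1f%% of the ring%n", TOKENS[i], shares[i] * 100);
        }
    }
}
```

An evenly balanced five node ring would give every node about 20%, so shares far from that are a hint that the tokens were not spaced evenly.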
Here is the stack
"RMI TCP Connection(4)-10.246.55223" daemon prio=10
tid=0x00002aaac0194000 nid=0x53b3 runnable [0x000000004b7dc000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- locked <0x000000074d23e978> (a java.io.BufferedInputStream)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:126)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:92)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:85)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:314)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:262)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:192)
at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:794)
at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:777)
at org.wyki.cassandra.pelops.Mutator$1.execute(Mutator.java:40)