To note: I still see the problem in a custom pre-1.1-beta build that appears to include the fix. I am going to upgrade to the 1.1 beta and check whether the problem goes away; I will file a bug if it still exists. BTW: It would be great for Cassandra to exit on any fatal error, such as assertion failures or OOMs.

14.03.12 09:55, aaron morton wrote:
Fixed in 1.0.3
https://issues.apache.org/jira/browse/CASSANDRA-3482

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/03/2012, at 3:45 PM, David Hawthorne wrote:

5 node cluster running 1.0.2, doing about 1300 reads and 1300 writes/sec into 3 column families in the same keyspace. 2 client machines, doing about the same amount of reads/writes, but one has an average response time in the 4-40ms range and the other in the 200-800ms range. Both run identical homebrew software with the hector-1.0-3 client.

Traffic was peaking out at 6k reads and 6k writes/sec, according to reporting from our software, and now it's topping out at 1300/sec each. The cpus on the cassy boxes are bored. None of the threads within cassandra are chewing more than 3% cpu. Disk is only 10% full on the most loaded box.

MemtablePostFlusher               1       102             36

Not all servers have the same number of pending tasks. They have 0, 1, 17, 37, and 105.
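
For anyone comparing these numbers across nodes, here is a minimal Python sketch of flagging backed-up pools from lines like the one above. It assumes the three numbers are the Active, Pending, and Completed columns in that order (the message does not label them). The helper names `parse_pool_line` and `backed_up_pools` and the threshold of 10 are made up for illustration.

```python
def parse_pool_line(line):
    """Parse one status line into (name, active, pending, completed).

    Assumes the column order Active, Pending, Completed.
    Returns None for headers or anything that does not fit that shape.
    """
    parts = line.split()
    if len(parts) < 4 or not all(p.isdigit() for p in parts[1:4]):
        return None
    active, pending, completed = (int(p) for p in parts[1:4])
    return parts[0], active, pending, completed


def backed_up_pools(lines, threshold=10):
    """Return (pool name, pending count) for pools above the threshold."""
    flagged = []
    for line in lines:
        parsed = parse_pool_line(line)
        if parsed is not None and parsed[2] > threshold:
            flagged.append((parsed[0], parsed[2]))
    return flagged


sample = [
    "MemtablePostFlusher               1       102             36",
]
print(backed_up_pools(sample))
```

Running this over the output collected from each node would make it easier to see which servers have a growing backlog and which are idle.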

It looks like it's stuck and not recovering, cuz it's been like this for an hour. I've attached the end of the cassandra.log from the server with the most pending tasks. There are some interesting exceptions in there.

As always, all help is appreciated!  :p


<cassandra.log>

