Note: I still see the problem in a custom build from before the 1.1 beta
that appears to include the fix. I am going to upgrade to the 1.1 beta and
check whether the problem goes away; I will file a bug if it still exists.
BTW: It would be great for Cassandra to exit on any fatal error, such as
an assertion failure or an OOM.
14.03.12 09:55, aaron morton wrote:
Fixed in 1.0.3
https://issues.apache.org/jira/browse/CASSANDRA-3482
Cheers
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 14/03/2012, at 3:45 PM, David Hawthorne wrote:
5-node cluster running 1.0.2, doing about 1300 reads and 1300
writes/sec into 3 column families in the same keyspace. 2 client
machines, doing about the same amount of reads/writes, but one has an
average response time in the 4-40ms range and the other in the
200-800ms range. Both run identical software, homebrew with the
hector-1.0-3 client.
Traffic was peaking at 6k reads and 6k writes/sec, according to
reporting from our software, and now it's topping out at 1300/sec
each. The CPUs on the cassy boxes are bored. None of the threads
within Cassandra are chewing more than 3% CPU. Disk is only 10% full
on the most loaded box.
MemtablePostFlusher 1 102 36
Not all servers have the same number of pending tasks. They have 0,
1, 17, 37, and 105.
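
For anyone comparing these counts across nodes, here is a minimal sketch of
how a line like the one above could be checked automatically. It assumes
each line has the form "<pool name> <active> <pending> <completed>", which is
my reading of the figures quoted above; the function names and the threshold
of 10 are hypothetical, not part of any Cassandra tool.

```python
# Sketch: pick out thread pools with a backlog from thread-pool stat lines.
# Assumption: each line is "<pool name> <active> <pending> <completed>".

def parse_pool_line(line):
    """Return (name, active, pending, completed), or None if the line doesn't match."""
    parts = line.split()
    if len(parts) < 4:
        return None
    name = parts[0]
    try:
        active, pending, completed = (int(p) for p in parts[1:4])
    except ValueError:
        return None  # header row or other non-numeric line
    return name, active, pending, completed


def backlogged_pools(lines, threshold=10):
    """Return {pool name: pending} for pools with more than `threshold` pending tasks."""
    result = {}
    for line in lines:
        parsed = parse_pool_line(line)
        if parsed is not None and parsed[2] > threshold:
            result[parsed[0]] = parsed[2]
    return result


if __name__ == "__main__":
    sample = ["MemtablePostFlusher 1 102 36"]
    print(backlogged_pools(sample))
```

Running this against the output from each node in turn would show which
servers have a growing backlog and which are idle.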
It looks like it's stuck and not recovering, because it's been like this
for an hour. I've attached the end of the cassandra.log from the
server with the most pending tasks. There are some interesting
exceptions in there.
As always, all help is appreciated! :p
<cassandra.log>