Hi all! We've had a problem with cassandra recently. We had 2 one-minute periods when we got a lot of timeouts on the client side (the only timeouts during 9 days we are using cassandra in production). In the logs we've found corresponding messages saying something about MUTATION messages dropped.
Now, the official faq [1] says that this is an indicator that the load is too high. We've checked our monitoring and found out that 1-minute average cpu load had a local peak at the time of the problem, but it was like 0.8 against 0.2 usual which I guess is nothing for a 2 core virtual machine. We've also checked java threads - there was no peak there and their count was reasonable ~240-250. Can anyone give us a hint - what should we monitor to see this "high load" and what should we tune to make it acceptable? Thanks in advance, Alexander [1] http://wiki.apache.org/cassandra/FAQ#dropped_messages