The node is overloaded with hints. I'll just grab the comments from code...

// avoid OOMing due to excess hints. we need to do this check even for "live" nodes, since we can
// still generate hints for those if it's overloaded or simply dead but not yet known-to-be-dead.
// The idea is that if we have over maxHintsInProgress hints in flight, this is probably due to
// a small number of nodes causing problems, so we should avoid shutting down writes completely to
// healthy nodes. Any node with no hintsInProgress is considered healthy.

Are the nodes going up and down a lot? Are they under GC pressure? The other possibility is that you have overloaded the cluster.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/03/2012, at 3:20 AM, Thomas van Neerijnen wrote:

> Hi all
>
> I'm running into a weird error on Cassandra 1.0.7.
> As my cluster's load gets heavier many of the nodes seem to hit the same error
> around the same time, resulting in MutationStage backing up and never
> clearing down. The only way to recover the cluster is to kill all the nodes
> and start them up again. The error is as below and is repeated continuously
> until I kill the Cassandra process.
>
> ERROR [ReplicateOnWriteStage:57] 2012-03-21 14:02:05,099 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[ReplicateOnWriteStage:57,5,main]
> java.lang.RuntimeException: java.util.concurrent.TimeoutException
>         at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1227)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.TimeoutException
>         at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:301)
>         at org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:544)
>         at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1223)
>         ... 3 more
>
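
For reference, the check described in the code comment at the top of this reply can be sketched roughly as below. This is a simplified illustration and not the actual StorageProxy source: the class HintThrottle and its methods shouldReject, hintStarted and hintFinished are made-up names, and endpoints are plain strings here. The assumption is that a write is refused only when the total number of in-flight hints exceeds maxHintsInProgress and the target node already has hints in flight, which appears to be what surfaces as the TimeoutException in the quoted trace.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the throttling idea from the comment; not Cassandra code.
final class HintThrottle {
    private final int maxHintsInProgress;
    // Hints in flight across all endpoints.
    private final AtomicInteger totalHintsInProgress = new AtomicInteger();
    // Hints in flight per endpoint (endpoint modelled as a String for simplicity).
    private final ConcurrentMap<String, AtomicInteger> hintsInProgress = new ConcurrentHashMap<>();

    HintThrottle(int maxHintsInProgress) {
        this.maxHintsInProgress = maxHintsInProgress;
    }

    private AtomicInteger counterFor(String endpoint) {
        return hintsInProgress.computeIfAbsent(endpoint, k -> new AtomicInteger());
    }

    // Reject only when the global limit is exceeded AND this endpoint already
    // has hints in flight. An endpoint with no hints in flight counts as healthy,
    // so writes to it are never refused.
    boolean shouldReject(String endpoint) {
        return totalHintsInProgress.get() > maxHintsInProgress
                && counterFor(endpoint).get() > 0;
    }

    // Call when a hint for this endpoint starts being written.
    void hintStarted(String endpoint) {
        totalHintsInProgress.incrementAndGet();
        counterFor(endpoint).incrementAndGet();
    }

    // Call when that hint has been written or dropped.
    void hintFinished(String endpoint) {
        totalHintsInProgress.decrementAndGet();
        counterFor(endpoint).decrementAndGet();
    }
}
```

The point of the per-endpoint counter is that a few struggling nodes can push the total over the limit without blocking writes to everything else, which is why the questions above focus on nodes flapping or sitting in GC.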