Hello, I have a three-node cluster. I noticed that one node keeps going down. Restarting Cassandra fixes it, but the node goes down again after a couple of days. I'm pretty new to Cassandra, so I'm wondering how I should troubleshoot it. The logs are below.

INFO [StorageServiceShutdownHook] 2014-04-28 13:21:05,091 ThriftServer.java (line 141) Stop listening to thrift clients
ERROR [GossipStage:4] 2014-04-28 13:11:59,877 CassandraDaemon.java (line 196) Exception in thread Thread[GossipStage:4,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR [ACCEPT-/10.20.132.44] 2014-04-28 13:06:10,261 CassandraDaemon.java (line 196) Exception in thread Thread[ACCEPT-/10.20.132.44,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR [GossipStage:3] 2014-04-28 12:54:02,116 CassandraDaemon.java (line 196) Exception in thread Thread[GossipStage:3,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR [GossipStage:2] 2014-04-28 12:52:27,644 CassandraDaemon.java (line 196) Exception in thread Thread[GossipStage:2,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR [Thread-222] 2014-04-28 12:50:18,689 CassandraDaemon.java (line 196) Exception in thread Thread[Thread-222,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR [GossipTasks:1] 2014-04-28 12:47:12,879 CassandraDaemon.java (line 196) Exception in thread Thread[GossipTasks:1,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR [GossipTasks:1] 2014-04-28 13:24:59,113 CassandraDaemon.java (line 196) Exception in thread Thread[GossipTasks:1,5,main]
java.lang.IllegalThreadStateException
	at java.lang.Thread.start(Thread.java:704)
	at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:202)
	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.handleOrLog(DebuggableThreadPoolExecutor.java:220)
	at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:79)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
INFO [StorageServiceShutdownHook] 2014-04-28 13:25:35,105 Server.java (line 181) Stop listening for CQL clients
INFO [StorageServiceShutdownHook] 2014-04-28 13:25:35,105 Gossiper.java (line 1251) Announcing shutdown
INFO [StorageServiceShutdownHook] 2014-04-28 13:26:12,524 MessagingService.java (line 667) Waiting for messaging service to quiesce

Thanks
Gary