So we created a script that checks whether Cassandra is alive, and we run it every two minutes. Here are some results for today:
Tue Oct 11 18:28:09 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 19:00:10 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 19:30:10 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 20:02:10 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 21:34:10 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 22:06:10 UTC 2011 - F this Cassandra bullshit... it died again

And here are some of the log tails:

INFO [CompactionExecutor:1] 2011-10-11 18:58:14,909 CompactionManager.java (line 395) Compacting []
INFO [FlushWriter:10] 2011-10-11 18:58:14,951 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/HintsColumnFamily-f-568-Data.db (60 bytes)
INFO [FlushWriter:10] 2011-10-11 18:58:14,951 Memtable.java (line 157) Writing Memtable-HintsColumnFamily@1493400027(0 bytes, 1 operations)
INFO [FlushWriter:10] 2011-10-11 18:58:14,991 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/HintsColumnFamily-f-569-Data.db (61 bytes)
INFO [FlushWriter:10] 2011-10-11 18:58:14,991 Memtable.java (line 157) Writing Memtable-HintsColumnFamily@1932871300(0 bytes, 1 operations)
INFO [FlushWriter:10] 2011-10-11 18:58:15,031 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/HintsColumnFamily-f-570-Data.db (61 bytes)

INFO [NonPeriodicTasks:1] 2011-10-11 19:29:20,906 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-1066
INFO [NonPeriodicTasks:1] 2011-10-11 19:29:20,906 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-1098
INFO [NonPeriodicTasks:1] 2011-10-11 19:29:20,906 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-1040
INFO [NonPeriodicTasks:1] 2011-10-11 19:29:20,906 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-1071
INFO [NonPeriodicTasks:1] 2011-10-11 19:29:20,907 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-1093

INFO [FlushWriter:8] 2011-10-11 20:00:10,701 Memtable.java (line 157) Writing Memtable-HintsColumnFamily@1488536311(0 bytes, 1 operations)
INFO [CompactionExecutor:1] 2011-10-11 20:00:10,701 CompactionManager.java (line 395) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily-f-1687-Data.db'),SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily-f-1688-Data.db'),SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily-f-1689-Data.db'),SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily-f-1690-Data.db')]
INFO [FlushWriter:8] 2011-10-11 20:00:10,741 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/HintsColumnFamily-f-1691-Data.db (61 bytes)

INFO [NonPeriodicTasks:1] 2011-10-11 21:33:26,980 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-3349
ERROR [Thread-18] 2011-10-11 21:33:31,452 AbstractCassandraDaemon.java (line 132) Fatal exception in thread Thread[Thread-18,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:76)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:385)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:114)

ERROR [Thread-19] 2011-10-11 22:04:39,195 AbstractCassandraDaemon.java (line 132) Fatal exception in thread Thread[Thread-19,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:76)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:385)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:114)

I'm going to increase the logging level to DEBUG. Other than that I've got to say that Cassandra 0.7.9 is F'ed in some way or another.
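
For anyone who wants to run a similar check, a liveness script of this kind could look roughly like the sketch below. It is not the actual script we use. It assumes that "alive" just means the Thrift RPC port (9160 by default) accepts a TCP connection, and the names is_alive and status_line, the host, and the timeout are made up for illustration.

```python
import socket
from datetime import datetime, timezone


def is_alive(host="127.0.0.1", port=9160, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds.

    Assumption: an open Thrift RPC port is treated as "Cassandra is alive".
    A hung node that still accepts connections would pass this check.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def status_line(alive, now=None):
    """Format one timestamped result line in UTC."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%a %b %d %H:%M:%S UTC %Y")
    state = "alive" if alive else "died again"
    return f"{stamp} - Cassandra {state}"


if __name__ == "__main__":
    print(status_line(is_alive()))
```

Scheduling it from cron with a `*/2 * * * *` entry and appending the output to a file gives one line every two minutes. A stricter check would issue a real Thrift request instead of only opening the socket.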