I'm running a 0.6.0 cluster with four nodes and one of them just crashed. The logs all seem normal and I haven't seen anything special in the jmx counters before the crash.
I have one client writing and reading using 10 threads and using 3 different column families: KvAds, KvImpressions and KvUsers the client had got a few UnavailableException, TimedOutException and TTransportException but was able to complete the read/write operation by failing over to another available host. I can't tell if the exceptions were from the crashed host or from other hosts in the ring. Any hints how to investigate this are greatly appreciated. So far I'm lost... Here's a snippet from the log just before it went down. It doesn't seem to have anything special in it, everything is INFO level. The only thing that seems a bit strange is that last message: Compacting []. This message usually comes with things inside the [], such as Compacting [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/system/LocationInfo-1-Data.db'),...] but this time it was just empty. However, this is not the only place in the log were I see an empty Compacting []. There are other places and they didn't end up in a crash, so I don't know if it's related. here's the log: INFO [ROW-MUTATION-STAGE:6] 2010-04-14 05:55:07,014 ColumnFamilyStore.java (line 357) KvImpressions has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271238432773.log', position=68606651) INFO [ROW-MUTATION-STAGE:6] 2010-04-14 05:55:07,015 ColumnFamilyStore.java (line 609) Enqueuing flush of Memtable(KvImpressions)@258729366 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:55:07,015 Memtable.java (line 148) Writing Memtable(KvImpressions)@258729366 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:55:10,130 Memtable.java (line 162) Completed flushing /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-24-Data.db INFO [COMMIT-LOG-WRITER] 2010-04-14 05:55:10,154 CommitLog.java (line 407) Discarding obsolete commit log:CommitLogSegment(/outbrain/cassdata/commitlog/CommitLog-1271238049425.log) INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,415 SSTableDeletingReference.java (line 104) Deleted /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-16-Data.db INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,440 SSTableDeletingReference.java (line 104) Deleted /outbrain/cassdata/data/outbrain_kvdb/KvAds-8-Data.db INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,454 SSTableDeletingReference.java (line 104) Deleted /outbrain/cassdata/data/outbrain_kvdb/KvAds-10-Data.db INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,526 SSTableDeletingReference.java (line 104) Deleted /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-5-Data.db INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,585 SSTableDeletingReference.java (line 104) Deleted /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-11-Data.db INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,602 SSTableDeletingReference.java (line 104) Deleted /outbrain/cassdata/data/outbrain_kvdb/KvAds-11-Data.db INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,614 SSTableDeletingReference.java (line 104) Deleted /outbrain/cassdata/data/outbrain_kvdb/KvAds-9-Data.db INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,682 SSTableDeletingReference.java (line 104) Deleted /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-21-Data.db INFO [COMMIT-LOG-WRITER] 2010-04-14 05:55:52,254 CommitLogSegment.java (line 50) Creating new commitlog segment /outbrain/cassdata/commitlog/CommitLog-1271238952254.log INFO [ROW-MUTATION-STAGE:16] 2010-04-14 05:56:25,347 ColumnFamilyStore.java (line 357) KvImpressions has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271238952254.log', position=47568158) INFO [ROW-MUTATION-STAGE:16] 2010-04-14 05:56:25,348 ColumnFamilyStore.java (line 609) Enqueuing flush of Memtable(KvImpressions)@1955587316 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:56:25,348 Memtable.java (line 148) Writing Memtable(KvImpressions)@1955587316 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:56:30,572 Memtable.java (line 162) Completed flushing /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-25-Data.db INFO [COMMIT-LOG-WRITER] 2010-04-14 05:57:26,790 CommitLogSegment.java (line 50) Creating new commitlog segment /outbrain/cassdata/commitlog/CommitLog-1271239046790.log INFO [ROW-MUTATION-STAGE:7] 2010-04-14 05:57:59,513 ColumnFamilyStore.java (line 357) KvImpressions has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271239046790.log', position=24265615) INFO [ROW-MUTATION-STAGE:7] 2010-04-14 05:57:59,513 ColumnFamilyStore.java (line 609) Enqueuing flush of Memtable(KvImpressions)@1617250066 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:57:59,513 Memtable.java (line 148) Writing Memtable(KvImpressions)@1617250066 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:58:04,604 Memtable.java (line 162) Completed flushing /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-26-Data.db INFO [COMPACTION-POOL:1] 2010-04-14 05:58:04,605 CompactionManager.java (line 246) Compacting [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-23-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-24-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-25-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-26-Data.db')] INFO [COMMIT-LOG-WRITER] 2010-04-14 05:58:04,652 CommitLog.java (line 407) Discarding obsolete commit log:CommitLogSegment(/outbrain/cassdata/commitlog/CommitLog-1271238952254.log) INFO [COMPACTION-POOL:1] 2010-04-14 05:58:25,975 CompactionManager.java (line 326) Compacted to /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-27-Data.db. 458669533/442361674 bytes for 848169 keys. Time: 21369ms. INFO [COMPACTION-POOL:1] 2010-04-14 05:59:14,913 CompactionManager.java (line 246) Compacting [] Thanks