Hi guys,

Cassandra: 1.1.1
Size: 6, even token distribution, random partitioner
JVM: 1.6.0_31
Kernel: 2.6.32-71.el6.x86_64
24 cores, 96 GB RAM each

We're seeing something pretty distressing in our cluster. When a node is brought down using "nodetool drain" and then brought back up, some of our counters show a sudden jump in increments, as if a large number of commit log entries were being replayed that shouldn't be. The effect is unpredictable.

First: is this the correct way to shut down a node? I'm finding it difficult to get a straight answer on this from Google. I've had this happen even when using kill, however.

The problem afflicts multiple column families that all have very similar definitions. Here is one in particular to fill the info void:

  column_type = 'Super'
  and comparator = 'CompositeType(org.apache.cassandra.db.marshal.AsciiType,org.apache.cassandra.db.marshal.AsciiType,org.apache.cassandra.db.marshal.AsciiType,org.apache.cassandra.db.marshal.AsciiType)'
  and subcomparator = 'AsciiType'
  and default_validation_class = 'CounterColumnType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.75
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and compression_options = {'chunk_length_kb' : '64', 'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};

Please let me know if more info is needed to figure this out, or where I should be looking.

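One way to put numbers on the jump is to snapshot the affected counter values before the drain and again after the restart, then diff the two. The following is only a minimal sketch under stated assumptions: the snapshots are assumed to already be available as plain dicts mapping a counter identifier to its value, the function name and the threshold parameter are hypothetical, and actually reading the values out of Cassandra is left out entirely.

```python
def find_counter_jumps(before, after, max_expected_delta):
    """Return {counter_id: delta} for counters that grew by more than
    max_expected_delta between the two snapshots.

    before, after: dicts mapping a counter identifier to an integer value
    (hypothetical snapshot format, not something Cassandra produces).
    max_expected_delta: largest increase considered normal for the
    time between the snapshots (an assumption the operator must choose).
    """
    jumps = {}
    for counter_id, old_value in before.items():
        if counter_id not in after:
            # Counter missing from the second snapshot; nothing to compare.
            continue
        delta = after[counter_id] - old_value
        if delta > max_expected_delta:
            jumps[counter_id] = delta
    return jumps
```

Running this against snapshots taken around several restarts would at least show whether the same rows are affected each time and how large the overcount is.
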
Nothing is standing out in the logs on restart, but right before the process dies from the drain command I see this:

ERROR [CompactionExecutor:113] 2012-06-22 15:22:05,659 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[CompactionExecutor:113,1,RMI Runtime]
java.util.concurrent.RejectedExecutionException
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
        at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:215)
        at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:397)
        at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:470)
        at org.apache.cassandra.io.sstable.SSTableDeletingTask.schedule(SSTableDeletingTask.java:67)
        at org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:787)
        at org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:358)
        at org.apache.cassandra.db.DataTracker.postReplace(DataTracker.java:330)
        at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:324)
        at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:253)
        at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:968)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:200)
        at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

Thanks
Charles