Sudden massive counter increments on node restart

Charles Brophy Fri, 22 Jun 2012 15:40:56 -0700

Hi guys,

Cassandra: 1.1.1
Cluster size: 6 nodes, even token distribution, random partitioner
JVM 1.6.0_31
kernel: 2.6.32-71.el6.x86_64
24 cores, 96 GB RAM each


We're seeing something pretty distressing in our cluster. When a node is
brought down using "nodetool drain" and then brought back up, some of our
counters suddenly jump, as if a large number of commit log entries are being
replayed that shouldn't be. This effect is unpredictable.

First: is this the correct way to shut down a node? I'm finding it
difficult to get a straight answer on this from Google. I've had this
happen even when using kill, however.
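
To illustrate the suspected failure mode, here is a toy model (not Cassandra's actual commit log or counter code). Counter increments are deltas rather than absolute values, so applying the same log entries a second time double-counts them. The names apply_log, increments, flushed and replayed are hypothetical.

```python
# Toy model only: this is NOT how Cassandra implements counters or replay.
def apply_log(counter, log):
    """Apply every increment (delta) in the log and return the new value."""
    for delta in log:
        counter += delta
    return counter


# Hypothetical increments written before the node was shut down.
increments = [1, 1, 5, 2]

# Value persisted when the node flushed its data on shutdown.
flushed = apply_log(0, increments)

# If the same entries were replayed on restart, every delta counts twice.
replayed = apply_log(flushed, increments)

print(flushed, replayed)
```

If something like this is happening, the size of the jump would roughly match the sum of the increments that were replayed.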

The problem afflicts multiple column families that all have very similar
definitions. Here is one in particular to fill the info void:

  column_type = 'Super'
  and comparator = 'CompositeType(org.apache.cassandra.db.marshal.AsciiType,org.apache.cassandra.db.marshal.AsciiType,org.apache.cassandra.db.marshal.AsciiType,org.apache.cassandra.db.marshal.AsciiType)'
  and subcomparator = 'AsciiType'
  and default_validation_class = 'CounterColumnType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.75
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and compression_options = {'chunk_length_kb' : '64', 'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};

Please let me know if more info is needed to figure this out, or where I
should be looking. Nothing stands out in the logs on restart, but right
before the process dies from the drain command I see this:

ERROR [CompactionExecutor:113] 2012-06-22 15:22:05,659 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[CompactionExecutor:113,1,RMI Runtime]
java.util.concurrent.RejectedExecutionException
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
        at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:215)
        at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:397)
        at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:470)
        at org.apache.cassandra.io.sstable.SSTableDeletingTask.schedule(SSTableDeletingTask.java:67)
        at org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:787)
        at org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:358)
        at org.apache.cassandra.db.DataTracker.postReplace(DataTracker.java:330)
        at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:324)
        at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:253)
        at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:968)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:200)
        at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

Thanks
Charles
