On a Cassandra 2.2.11 cluster, I noticed estimated compactions accumulating on 
one node. nodetool compactionstats showed the following:
             compaction type   keyspace         table   completed       total    unit   progress
                  Compaction        ks1    some_table   204.68 MB   204.98 MB   bytes     99.86%
Index summary redistribution       null          null   457.72 KB      950 MB   bytes      0.05%
                  Compaction        ks1    some_table   461.61 MB   461.95 MB   bytes     99.93%
        Tombstone Compaction        ks1    some_table   618.34 MB   618.47 MB   bytes     99.98%
                  Compaction        ks1    some_table   378.37 MB      380 MB   bytes     99.57%
        Tombstone Compaction        ks1    some_table   326.51 MB   327.63 MB   bytes     99.66%
        Tombstone Compaction        ks2   other_table    29.38 MB    29.38 MB   bytes    100.00%
        Tombstone Compaction        ks1    some_table    503.4 MB   507.28 MB   bytes     99.24%
                  Compaction        ks1    some_table   353.44 MB   353.47 MB   bytes     99.99%

They had been like this for a while (all different tables). A thread dump 
showed all 8 CompactionExecutor threads looking like this:
"CompactionExecutor:6" #84 daemon prio=1 os_prio=4 tid=0x00007f5771172000 nid=0x7646 waiting on condition [0x00007f578847b000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000005fe5656e8> (a com.google.common.util.concurrent.AbstractFuture$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:285)
        at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
        at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:390)
        at org.apache.cassandra.db.SystemKeyspace.forceBlockingFlush(SystemKeyspace.java:593)
        at org.apache.cassandra.db.SystemKeyspace.finishCompaction(SystemKeyspace.java:368)
        at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:205)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:74)
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:80)
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:257)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

A MemtablePostFlush thread was waiting on a flush CountDownLatch:
"MemtablePostFlush:1" #30 daemon prio=5 os_prio=0 tid=0x00007f57705dac00 nid=0x75bf waiting on condition [0x00007f578a8fb000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000573da6c90> (a java.util.concurrent.CountDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
        at org.apache.cassandra.db.ColumnFamilyStore$PostFlush.call(ColumnFamilyStore.java:1073)
        at org.apache.cassandra.db.ColumnFamilyStore$PostFlush.call(ColumnFamilyStore.java:1026)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

The 4 MemtableFlushWriter threads were all RUNNABLE, sorting something in 
IntervalTree. Finally, the IndexSummaryManager thread was also RUNNABLE:
"IndexSummaryManager:1" #1463 daemon prio=1 os_prio=4 tid=0x00007f577139b000 nid=0x8100 runnable [0x00007f5726f6c000]
   java.lang.Thread.State: RUNNABLE
        at com.google.common.collect.ImmutableSet.construct(ImmutableSet.java:206)
        at com.google.common.collect.ImmutableSet.copyOf(ImmutableSet.java:375)
        at org.apache.cassandra.db.lifecycle.Helpers.replace(Helpers.java:43)
        at org.apache.cassandra.db.lifecycle.View$2.apply(View.java:166)
        at org.apache.cassandra.db.lifecycle.View$2.apply(View.java:161)
        at org.apache.cassandra.db.lifecycle.Tracker.apply(Tracker.java:138)
        at org.apache.cassandra.db.lifecycle.Tracker.apply(Tracker.java:111)
        at org.apache.cassandra.db.lifecycle.Tracker.apply(Tracker.java:118)
        at org.apache.cassandra.db.lifecycle.LifecycleTransaction.unmarkCompacting(LifecycleTransaction.java:445)
        at org.apache.cassandra.db.lifecycle.LifecycleTransaction.cancel(LifecycleTransaction.java:400)
        at org.apache.cassandra.io.sstable.IndexSummaryRedistribution.adjustSamplingLevels(IndexSummaryRedistribution.java:230)
        at org.apache.cassandra.io.sstable.IndexSummaryRedistribution.redistributeSummaries(IndexSummaryRedistribution.java:126)
        at org.apache.cassandra.db.compaction.CompactionManager.runIndexSummaryRedistribution(CompactionManager.java:1400)
        at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:250)
        at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:228)
        at org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow(IndexSummaryManager.java:125)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

How should I interpret these? What flushing behavior is blocking the 
compactions? 
