CompactionMetrics is a combination of the compaction executor (sstable compactions, secondary index build, view building, relocate, garbagecollect, cleanup, scrub etc.) and the validation executor (repairs). Keep in mind that not all jobs execute 1 task per operation: things that use parallelAllSSTableOperation, like cleanup, will create 1 task per sstable.
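To make the counting difference concrete, here is a rough toy model in Java. It is not Cassandra code. The class name, the `run` helper and the choice of which tasks bump the operation-level counter are all assumptions made for illustration. It only shows that summing `ThreadPoolExecutor.getCompletedTaskCount()` over two pools counts every submitted task, while a counter marked once per compaction counts far fewer.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class CompletedTasksSketch {

    // Hypothetical stand-in for an executor; not Cassandra's actual executor class.
    static ThreadPoolExecutor newExecutor(int threads) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.SECONDS,
                                      new LinkedBlockingQueue<>());
    }

    /** Returns {completedTasks, totalCompactionsCompleted} for the toy workload. */
    static long[] run(int compactions, int cleanupSSTables, int validations)
            throws InterruptedException {
        ThreadPoolExecutor compactionExecutor = newExecutor(2);
        ThreadPoolExecutor validationExecutor = newExecutor(1);
        // Assumed operation-level counter: marked only by "normal" compactions here.
        AtomicLong totalCompactionsCompleted = new AtomicLong();

        // Each normal compaction is one task and marks the counter once.
        for (int i = 0; i < compactions; i++)
            compactionExecutor.execute(totalCompactionsCompleted::incrementAndGet);

        // One cleanup operation fans out into one task per sstable (assumed not
        // to mark the counter in this toy model).
        for (int i = 0; i < cleanupSSTables; i++)
            compactionExecutor.execute(() -> { });

        // Validation tasks run on the second executor and never mark the counter.
        for (int i = 0; i < validations; i++)
            validationExecutor.execute(() -> { });

        compactionExecutor.shutdown();
        validationExecutor.shutdown();
        compactionExecutor.awaitTermination(10, TimeUnit.SECONDS);
        validationExecutor.awaitTermination(10, TimeUnit.SECONDS);

        // Executor-level count: every task that ran on either pool.
        long completedTasks = compactionExecutor.getCompletedTaskCount()
                            + validationExecutor.getCompletedTaskCount();
        return new long[] { completedTasks, totalCompactionsCompleted.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] r = run(3, 50, 10);
        System.out.println("CompletedTasks=" + r[0]
                + " TotalCompactionsCompleted=" + r[1]);
    }
}
```

With 3 compactions, one cleanup over 50 sstables and 10 validation tasks, the executor-level count is 63 while the operation-level counter is 3, so the two numbers can drift apart quickly even in this simplified setup.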
The "CompletedTasks" metric is a measure of how many tasks ran on these two executors combined. The "TotalCompactionsCompleted" metric is a measure of how many compactions issued from the compaction manager ran (normal compactions, cache writes, scrub, 2i and MVs). So while they may be close, depending on what's happening on the system, there's no assurance that they will be within any bounds of each other.

I would suspect that validation compactions from repairs are one major difference. If you run other operational tasks there will likely be more.

On Mon, Oct 30, 2017 at 12:22 PM, Lucas Benevides <lu...@maurobenevides.com.br> wrote:

> Kurt,
>
> I appreciate your answer but I don't believe CompletedTasks counts the
> "validation compactions". These are compactions that occur from repair
> operations. I am running tests on 10 cluster nodes in the same physical
> rack, with Cassandra Stress Tool and I didn't make any Repair commands. The
> tables only last for seven hours, so it is not reasonable that tens of
> thousands of these validation compactions occur per node.
>
> I tried to see the code and the CompletedTasks counter seems to be
> populated by a method from the class
> java.util.concurrent.ThreadPoolExecutor.
> So I really don't know what it is but surely it is not the amount of
> Compaction Completed Tasks.
>
> Thank you
> Lucas Benevides
>
> 2017-10-30 8:05 GMT-02:00 kurt greaves <k...@instaclustr.com>:
>
>> I believe (may be wrong) that CompletedTasks counts Validation
>> compactions while TotalCompactionsCompleted does not. Considering a lot of
>> validation compactions can be created every repair it might explain the
>> difference. I'm not sure why they are named that way or work the way they
>> do. There appears to be no documentation around this in the code (what a
>> surprise) and looks like it was last touched in CASSANDRA-4009
>> <https://issues.apache.org/jira/browse/CASSANDRA-4009>, which also has
>> no useful info.
>>
>> On 27 October 2017 at 13:48, Lucas Benevides <lu...@maurobenevides.com.br> wrote:
>>
>>> Dear community,
>>>
>>> I am studying the behaviour of the Cassandra
>>> TimeWindowCompactionStrategy. To do so I am watching some metrics. Two of
>>> these metrics are important: Compaction.CompletedTasks, a gauge, and the
>>> TotalCompactionsCompleted, a Meter.
>>>
>>> According to the documentation
>>> (http://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics):
>>> Completed Tasks = Number of completed compactions since server [re]start.
>>> TotalCompactionsCompleted = Throughput of completed compactions since
>>> server [re]start.
>>>
>>> As I realized, the TotalCompactionsCompleted, in the Meter object, has a
>>> counter, which I supposed would be numerically close to the CompletedTasks
>>> gauge. But they are very different, with the Completed Tasks being much
>>> higher than the TotalCompactionsCompleted.
>>>
>>> According to the code, in github (class metrics.CompactionMetrics.java):
>>> Completed Tasks - Number of completed compactions since server [re]start
>>> TotalCompactionsCompleted - Total number of compactions since server
>>> [re]start
>>>
>>> Can you help me and explain the difference between these two metrics, as
>>> they seem to have very distinct values, with the Completed Tasks being
>>> around 1000 times the value of the counter in TotalCompactionsCompleted.
>>>
>>> Thanks in Advance,
>>> Lucas Benevides