Tested on versions 2.0.1 and 2.0.2.

With the node completely idle (nothing being stored or queried), I see that
some column family (which one varies with the tests I run) gets compacted over
and over again; this has been going on for 48 hours already. Total data size is
only 3.5 GB. The column family was created with SSTableSize : 10 MB.

Using remote debugging I see (this is my guess) that the loop is caused by some
extra code in LeveledManifest::getCompactionCandidates that attempts to use
STCS when compaction falls behind (see the variable 'score' in the code).

In my case I get the following variables during execution (version 2.0.2) of 
LeveledManifest::getCompactionCandidates :
level : 1
sstablesInLevel : 42
remaining : 42
total bytes for remaining : 448 MB
max size for level : 100 MB
score : 4.27
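
As a sanity check on these numbers, the score looks like the ratio of the bytes
remaining in the level to the level's maximum size. The sketch below is not
Cassandra code; the class and method names are hypothetical, and the byte units
are assumptions chosen because they reproduce the observed value (448 MB read
as decimal megabytes, 100 MB read as 100 * 2^20 bytes).

```java
// Sketch only: class and method names are hypothetical, not Cassandra's API.
final class LevelScoreSketch {
    /** Ratio of the bytes currently in a level to that level's size limit. */
    static double score(long bytesInLevel, long maxBytesForLevel) {
        return (double) bytesInLevel / (double) maxBytesForLevel;
    }

    public static void main(String[] args) {
        // Assumption: "448 MB" from the debugger read as decimal megabytes.
        long bytesInLevel = 448_000_000L;
        // Assumption: "100 MB" read as 100 * 2^20 bytes.
        long maxBytesForLevel = 100L * 1024 * 1024;
        System.out.println("score = " + score(bytesInLevel, maxBytesForLevel));
    }
}
```

Under those assumptions the ratio comes out at roughly 4.27, which matches the
value seen in the debugger and is well above 1, so level 1 counts as over its
limit.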
Because the score is 4.27, the code takes the special branch, where I see the
following values (again version 2.0.2, in
LeveledManifest::getCompactionCandidates):
Generations[0].size() : 77
Candidates : 77
Pairs : 77
Buckets : one list entry of 77 files
mostInteresting : 32 files
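
The values above (one bucket of 77 files, of which 32 are picked) are what a
size-tiered bucketing step would produce when all files have about the same
size. The following is a simplified sketch of that idea, not the actual
SizeTieredCompactionStrategy code: the class, the method names, the 0.5x to
1.5x similarity window and the "largest bucket wins" rule are all assumptions
made for illustration; only the cap of 32 files comes from the values observed
above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: simplified size-tiered bucketing, not Cassandra's actual code.
final class BucketSketch {
    // Assumption: a file joins a bucket when its size lies within
    // 0.5x..1.5x of the bucket's running average size.
    static final double BUCKET_LOW = 0.5;
    static final double BUCKET_HIGH = 1.5;
    // At most 32 files are picked, as seen in the debugger.
    static final int MAX_THRESHOLD = 32;

    /** Group file sizes into buckets of similarly sized files. */
    static List<List<Long>> buckets(List<Long> sizes) {
        List<List<Long>> buckets = new ArrayList<>();
        List<Double> averages = new ArrayList<>();
        for (long size : sizes) {
            boolean placed = false;
            for (int i = 0; i < buckets.size() && !placed; i++) {
                double avg = averages.get(i);
                if (size > avg * BUCKET_LOW && size < avg * BUCKET_HIGH) {
                    List<Long> bucket = buckets.get(i);
                    // Update the running average before adding the file.
                    averages.set(i, (avg * bucket.size() + size) / (bucket.size() + 1));
                    bucket.add(size);
                    placed = true;
                }
            }
            if (!placed) {
                // No similar bucket found: start a new one with this file.
                List<Long> bucket = new ArrayList<>();
                bucket.add(size);
                buckets.add(bucket);
                averages.add((double) size);
            }
        }
        return buckets;
    }

    /** Assumption: pick the bucket with the most files, trimmed to MAX_THRESHOLD. */
    static List<Long> mostInteresting(List<List<Long>> buckets) {
        List<Long> best = new ArrayList<>();
        for (List<Long> bucket : buckets) {
            if (bucket.size() > best.size()) {
                best = bucket;
            }
        }
        return new ArrayList<>(best.subList(0, Math.min(MAX_THRESHOLD, best.size())));
    }

    public static void main(String[] args) {
        List<Long> sizes = new ArrayList<>();
        for (int i = 0; i < 77; i++) {
            sizes.add(10L * 1024 * 1024); // 77 level 0 files of 10 MB each
        }
        List<List<Long>> result = buckets(sizes);
        System.out.println("buckets: " + result.size());
        System.out.println("picked: " + mostInteresting(result).size());
    }
}
```

With 77 files of identical size, this sketch yields a single bucket of 77 files
and picks 32 of them.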

These 32 mostInteresting files are returned to the function
LeveledCompactionStrategy::getMaximalTask, marked for compaction, and a new
LeveledCompactionTask is created. (Isn't the goal here to create an STCS
task?)

The task then does its job and writes a new set of level 0 files of 10 MB
each. So 32 files are created again from the 32 files we started with, and
when the compaction loop restarts it does exactly the same thing, again and
again.
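
The lack of progress can be shown with a toy model. This is only a sketch with
hypothetical names, and it assumes that every picked file is a full 10 MB, that
compaction merges no data away, and that the output lands back in level 0 as
described above.

```java
// Sketch only: a toy model of the loop described above, not Cassandra code.
final class CompactionLoopSketch {
    /** Files produced when totalBytes is re-split into files of at most sstableSize bytes. */
    static long outputFiles(long totalBytes, long sstableSize) {
        return (totalBytes + sstableSize - 1) / sstableSize; // ceiling division
    }

    /** Level 0 file count after one round: picked files leave, re-split output comes back. */
    static long l0AfterRound(long l0Files, long picked, long sstableSize) {
        long take = Math.min(picked, l0Files);
        return l0Files - take + outputFiles(take * sstableSize, sstableSize);
    }

    public static void main(String[] args) {
        long sstableSize = 10L * 1024 * 1024; // 10 MB, as set on the column family
        long l0Files = 77;                    // Generations[0].size() from the debugger
        for (int round = 1; round <= 5; round++) {
            l0Files = l0AfterRound(l0Files, 32, sstableSize);
            System.out.println("round " + round + ": level 0 files = " + l0Files);
        }
    }
}
```

In this model the level 0 count stays at 77 after every round, because 32 files
of 10 MB are rewritten as 32 files of 10 MB. A size-tiered task would instead
write one large file per round, which is what would let level 0 shrink.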

Should an STCS task be created from within the LCS strategy? Or should the
optimization simply be removed?

Ignace Desimpel

