Hi, I have a number of questions after looking into LCS (leveled compaction) in Cassandra. Could somebody help enlighten me?
1. Will LCS always strive to clean up L0 sstables? That is, whenever a new L0 sstable shows up, will it trigger an LCS compaction to promote it to a higher level? If so, which sstables are involved in this compaction: just the new L0 sstable, or others as well?

2. If L[N] hasn’t yet reached its threshold (10^N sstables), then no sstable will be placed in L[N+1], correct?

3. How do the partitions in an sstable move up the levels? Assume I’ve got a 500MB L0 sstable where half of the partitions have new partition keys, 25% match existing partition keys in L1, and the remaining 25% match existing partition keys in L2. Also assume L0 has only this one sstable, L1 already has 10 sstables, and L2 already has 100 sstables. When LCS compacts this 500MB L0 sstable, how does it decide which higher levels the content moves to? What about the case where this 500MB L0 sstable has all new partition keys (never inserted into this CQL table before) and L1 and L2 are already at their thresholds of 10 and 100 respectively?

I’ve read this blog post, http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra, but it seems to specify only this rule: “Within each level, sstables are guaranteed to be non-overlapping. Each level is ten times as large as the previous”. The answers to the questions above appear to depend on implementation details.

Lastly, I noticed the following statement at the end of that blog post: “Leveled compaction ignores the concurrent_compactors setting. Concurrent compaction is designed to avoid tiered compaction’s problem of a backlog of small compaction sets becoming blocked temporarily while the compaction system is busy with a large set. Leveled compaction does not have this problem, since all compaction sets are roughly the same size. Leveled compaction does honor the multithreaded_compaction setting, which allows using one thread per sstable to speed up compaction.” Is it still accurate?
It appears that multithreaded_compaction has been removed from recent Cassandra versions.
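
To make the assumption behind question 2 concrete, here is a minimal Python sketch of how I understand the per-level thresholds (10^N sstables for L[N]). The function names `max_sstables` and `level_is_full` are hypothetical and only illustrate my reading of the blog post; this is not Cassandra's actual implementation.

```python
def max_sstables(level: int, fanout: int = 10) -> int:
    """Assumed sstable-count threshold for L[level]: fanout ** level.

    Only defined for level >= 1 here; L0 is treated as having no
    10^N threshold in this sketch.
    """
    if level < 1:
        raise ValueError("this sketch only models L1 and above")
    return fanout ** level


def level_is_full(level: int, sstable_count: int) -> bool:
    """True if L[level] has reached its assumed threshold."""
    return sstable_count >= max_sstables(level)


# Thresholds for the levels mentioned in question 3.
thresholds = {n: max_sstables(n) for n in range(1, 4)}
print(thresholds)  # {1: 10, 2: 100, 3: 1000}
```

Under this reading, the L1 with 10 sstables and the L2 with 100 sstables in question 3 are both exactly at their thresholds, which is why I am unsure where new data can go.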