I am trying to understand how sorted runs are picked for compaction with STCS and the docs here don't have enough detail: https://cassandra.apache.org/doc/4.0/cassandra/operating/compaction/stcs.html
These blog posts have more detail, but I still have a question:
* http://distributeddatastore.blogspot.com/2021/06/cassandra-compaction.html
* https://shrikantbang.wordpress.com/2014/04/22/size-tiered-compaction-strategy-in-apache-cassandra

And the blog posts match what I see in the STCS source, especially getBuckets:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java#L251

My question is whether non-adjacent sorted runs can be merged. To be precise, assume there are 3 sorted runs, each described by 5 attributes (min-key, max-key, min-ts, max-ts, size), where min-ts and max-ts are the min and max commit timestamps for mutations in that sorted run, min-key and max-key are the min and max keys for mutations in that sorted run, and size is the size of the sorted run. The sorted runs are:
* s1 - min-key=A, max-key=D, min-ts=1, max-ts=99, size=100
* s2 - min-key=A, max-key=D, min-ts=100, max-ts=199, size=10
* s3 - min-key=A, max-key=D, min-ts=200, max-ts=299, size=100

By "adjacent" I mean adjacent when sorted runs are ordered by commit timestamp. In the example above, s1 & s2 are adjacent and s2 & s3 are adjacent, but s1 & s3 are not.

From what I saw in the STCS source code, s1 & s3 can end up in the same bucket without s2 because s2 is much smaller. If s1 & s3 were merged then the result might be:
* s1+s3 - min-key=A, max-key=D, min-ts=1, max-ts=299, size likely >= 100
* s2 - min-key=A, max-key=D, min-ts=100, max-ts=199, size=10

A side effect of merging non-adjacent sorted runs can be that on a point query, something must be fetched from all sorted runs to determine the value to return. This differs from what happens with RocksDB, where the heap code can read from the iterators in (commit timestamp) order and stop once it gets a key (with a value or a tombstone).

--
Mark Callaghan
mdcal...@gmail.com
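
P.S. To make the bucketing concern concrete, here is a minimal sketch of how I read the size-based grouping in getBuckets. It is a simplification and not the actual Cassandra code: the function name `get_buckets` and the parameter `min_size` are hypothetical, the 0.5 and 1.5 defaults are assumed to mirror the bucket_low and bucket_high options, `min_size` is set to 0 so that the unitless sizes in my example are never treated as "small", and the min_threshold and max_threshold checks are ignored.

```python
def get_buckets(runs, bucket_low=0.5, bucket_high=1.5, min_size=0):
    """Group (name, size) pairs into buckets of similar size.

    Simplified sketch: a run joins the first bucket whose average size
    is close to its own size, otherwise it starts a new bucket.
    """
    buckets = []  # each entry is [average_size, [member names]]
    # Runs are visited in order of size; commit timestamps play no role here.
    for name, size in sorted(runs, key=lambda r: r[1]):
        for bucket in buckets:
            avg, members = bucket
            similar = avg * bucket_low < size < avg * bucket_high
            both_small = size < min_size and avg < min_size
            if similar or both_small:
                members.append(name)
                # Update the running average of the bucket.
                bucket[0] = (avg * (len(members) - 1) + size) / len(members)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size, [name]])
    return [members for _, members in buckets]


# The three sorted runs from the example, as (name, size).
runs = [("s1", 100), ("s2", 10), ("s3", 100)]
buckets = get_buckets(runs)
print(buckets)
```

Under these assumptions s1 and s3 share a bucket while s2 sits alone, even though s2 lies between them in commit-timestamp order. Whether such a bucket is ever selected for compaction also depends on the thresholds this sketch leaves out.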