Hi, I want to discuss what versions we should backport IntervalTree improvements to specifically 19596 which I think is the lower risk option because it builds the same trees as before. I think we should at least backport to 5.0.
IntervalTree performance has shown up as a problematic bottleneck in a couple of scenarios. If a node ends up with lots of small sstables it will fall over trying to build IntervalTrees because they are very slow. With 100k sstables which isn't unusual in the normal case for leveled compaction it takes 200+ milliseconds to build one on my laptop. When this happens compactions and flushing can't complete and the mutation stage backs up along with usual problems associated with a large compaction backlog. Early open which is enabled by default makes this much worse because it causes compaction to rebuild IntervalTree many times more often. The problem is the tree is immutable and currently the entire tree is rebuilt every time it is updated. The proper fix is to create a persistent tree, but no one has had the time at the moment and it's not a great candidate for back porting to existing releases. CASSANDRA-19596 builds identical trees but aims to make the building process significantly more efficient by removing a lot of different points of overhead. The original tree does a lot of redundant sorting and reallocation inside array lists. 19596 properly sizes the arrays when allocating and reuses the existing sorted order to do a linear merge of additions and removals. This is 2x faster to build a tree from scratch and 4x faster to update. CASSANDRA-20158 and CASSANDRA-20164 are several orders of magnitudes faster for some common operations (replace interval, add interval) and work by treating the tree as a persistent tree and only rebuilding the changed nodes. The only problem with this approach is that the tree is not rebalanced, but in practice they still get rebuilt pretty often and it would be trivial to force a full rebuild periodically or based on some balance signal. Ariel