I’ve encountered a handful of spinning platters, but not a lot.

I think we should generally optimize for the common case, not the
exception.


On Thu, Feb 27, 2025 at 9:51 AM Josh McKenzie <jmcken...@apache.org> wrote:

> This is a significant enough performance problem *in normal operations*
> I'd consider it a bug and thus eligible for back-porting. A couple other
> thoughts:
>
> CASSANDRA-20158 and CASSANDRA-20164 are several orders of magnitudes
> faster ... The only problem with this approach is that the tree is not
> rebalanced... it would be trivial to force a full rebuild periodically or
> based on some balance signal.
>
> I'd also be strongly in support of these modifications landing in all
> currently supported branches if someone has the time and energy to step up
> and do the work given the combined current test coverage and robust
> property-based additions to the testing in 19596.
>
> Early open which is enabled by default
>
> Is it fair to assume most current C* nodes will no longer be on spinning
> disk and thus we should change this default to disabled?
>
> On Thu, Feb 27, 2025, at 12:35 PM, Ariel Weisberg wrote:
>
> Hi,
>
> I want to discuss what versions we should backport IntervalTree
> improvements to specifically 19596 which I think is the lower risk option
> because it builds the same trees as before. I think we should at least
> backport to 5.0.
>
> IntervalTree performance has shown up as a problematic bottleneck in a
> couple of scenarios. If a node ends up with lots of small sstables it will
> fall over trying to build IntervalTrees because they are very slow. With
> 100k sstables which isn't unusual in the normal case for leveled compaction
> it takes 200+ milliseconds to build one on my laptop.
>
> When this happens compactions and flushing can't complete and the mutation
> stage backs up along with usual problems associated with a large compaction
> backlog. Early open which is enabled by default makes this much worse
> because it causes compaction to rebuild IntervalTree many times more often.
>
> The problem is the tree is immutable and currently the entire tree is
> rebuilt every time it is updated. The proper fix is to create a persistent
> tree, but no one has had the time at the moment and it's not a great
> candidate for back porting to existing releases.
>
> CASSANDRA-19596 builds identical trees but aims to make the building
> process significantly more efficient by removing a lot of different points
> of overhead. The original tree does a lot of redundant sorting and
> reallocation inside array lists. 19596 properly sizes the arrays when
> allocating and reuses the existing sorted order to do a linear merge of
> additions and removals. This is 2x faster to build a tree from scratch and
> 4x faster to update.
>
> CASSANDRA-20158 and CASSANDRA-20164 are several orders of magnitudes
> faster for some common operations (replace interval, add interval) and work
> by treating the tree as a persistent tree and only rebuilding the changed
> nodes. The only problem with this approach is that the tree is not
> rebalanced, but in practice they still get rebuilt pretty often and it
> would be trivial to force a full rebuild periodically or based on some
> balance signal.
>
> Ariel
>
>
>

Reply via email to