Re: Re-evaluate compaction defaults in 5.1/trunk

Tolbert, Andy Fri, 06 Dec 2024 20:56:14 -0800

It's also quite easy for STCS to make clusters inoperable, and that state
can be quite difficult to dig yourself out of.  It's not hard to find
yourself with old 100GB+ SSTables full of expired data that sit around for
months without ever being compacted.
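
To make that concrete, here is a rough sketch of STCS-style size bucketing.
It is a simplification and not Cassandra's actual code: it ignores
min_sstable_size and tombstone compactions, and the SSTable sizes are made
up.  The bucket_low, bucket_high and min_threshold values are assumed to
match the documented STCS defaults.

```python
# Simplified sketch of STCS size bucketing (NOT Cassandra's implementation).
# Assumption: bucket_low=0.5, bucket_high=1.5, min_threshold=4 (STCS defaults).

def stcs_buckets(sizes_gb, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes into buckets of similarly sized SSTables."""
    buckets = []  # each bucket is a list of SSTable sizes in GB
    for size in sorted(sizes_gb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if avg * bucket_low <= size <= avg * bucket_high:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_gb, min_threshold=4):
    """Only buckets with at least min_threshold SSTables get compacted."""
    return [b for b in stcs_buckets(sizes_gb) if len(b) >= min_threshold]


# Made-up sizes: four small flushes, a few mid-size SSTables, one 120GB one.
sizes = [0.1, 0.1, 0.1, 0.1, 1, 1, 4, 120]
candidates = compaction_candidates(sizes)
```

In this sketch only the four small SSTables form a candidate bucket.  The
120GB SSTable sits alone until three more of a similar size show up, which
on most clusters can take a very long time.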

Write amplification is a thing, but in the age of fast storage I'd consider
it less of an issue.  The cost in compaction throughput will be felt rather
quickly, and it's something you can reason about and account for early on.
Whereas with STCS, the impact may not be felt until much later.  There are
also sharp edges around large SSTables with repairs and streaming that are
mostly avoided by using LCS.

There have also been some really great improvements that make LCS work
well generally, such as CASSANDRA-17931
<https://issues.apache.org/jira/browse/CASSANDRA-17931>, which, while not
LCS-specific, can enable compactions to continue being productive even at
incredibly low disk space.  Single SSTable upleveling (CASSANDRA-12526
<https://issues.apache.org/jira/browse/CASSANDRA-12526>) also helps reduce
some of that amplification: when LCS detects that an SSTable doesn't
overlap with the SSTables in the level it's being promoted to, it can
promote it without rewriting data.
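
The upleveling condition can be sketched roughly like this.  It is an
illustration of the overlap test only, not the CASSANDRA-12526 code; the
SSTable tuple, the integer tokens and the function names are all
hypothetical.

```python
from typing import List, NamedTuple


class SSTable(NamedTuple):
    # Hypothetical model: an SSTable covers an inclusive token range.
    name: str
    first_token: int
    last_token: int


def overlaps(a: SSTable, b: SSTable) -> bool:
    """True if the two inclusive token ranges share at least one token."""
    return a.first_token <= b.last_token and b.first_token <= a.last_token


def can_uplevel_without_rewrite(candidate: SSTable,
                                target_level: List[SSTable]) -> bool:
    """The candidate can be promoted as-is only if it overlaps nothing
    already in the target level."""
    return not any(overlaps(candidate, other) for other in target_level)


level2 = [SSTable("a", 0, 99), SSTable("b", 200, 299)]
fits_in_gap = can_uplevel_without_rewrite(SSTable("x", 100, 199), level2)
needs_rewrite = not can_uplevel_without_rewrite(SSTable("y", 90, 150), level2)
```

Here "x" fits in the gap between "a" and "b" and could move up untouched,
while "y" overlaps "a" and would still need a regular compaction.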

If you are running Cassandra as a managed service where customers aren't as
experienced or don't understand the tradeoffs of the compaction strategies,
I think LCS puts the operators of the Cassandra cluster in a better place.
Read performance is more predictable with LCS, thanks to uniform SSTable
sizes (outside of L0) and the levels acting as a way to reduce the number
of SSTables touched per read.  Operationally it is just more predictable.
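
As a toy illustration of the read-path point, the sketch below counts how
many SSTables have a token range covering a given token.  The layouts are
invented, and it ignores bloom filters and everything else a real read
does, so treat it as intuition only.

```python
def sstables_touched(token, levels):
    """Count SSTables whose inclusive (first, last) token range covers token.

    levels is a list of levels; each level is a list of (first, last) ranges.
    """
    return sum(
        1
        for level in levels
        for first, last in level
        if first <= token <= last
    )


# Invented LCS-like layout: L0 may overlap, L1+ are non-overlapping.
lcs_levels = [
    [(0, 999), (0, 999)],                             # L0
    [(0, 499), (500, 999)],                           # L1
    [(0, 249), (250, 499), (500, 749), (750, 999)],   # L2
]

# Invented STCS-like layout: eight SSTables that all span the full range.
stcs_like = [[(0, 999)] * 8]

lcs_count = sstables_touched(300, lcs_levels)   # 2 in L0 + 1 in L1 + 1 in L2
stcs_count = sstables_touched(300, stcs_like)   # all 8 overlap the token
```

Outside of L0, each level contributes at most one SSTable per token, which
is where the predictability comes from.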

On the topic of changing the default, I think that with the introduction of
UnifiedCompactionStrategy it's maybe an awkward time to change this.  I
would love to see UCS become proven and production ready to the point where
we feel it could become an accepted default.  Until that becomes the case
(or not), I don't think changing the default is right.  I also generally
think changing defaults between releases should be discouraged unless it's
a no-brainer, and this doesn't feel like one to me.

Maybe a possible approach could be to have a mechanism such that an
operator can declare the default table options for new tables?  This would
allow operators to control what they think the default should be.  This
could extend beyond compaction, as I know there are some subjective
opinions about what could have better defaults (compression chunk length, a
lower gc_grace_seconds if only_purge_repaired_tombstones is enabled,
speculative_retry, etc.).
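
To sketch what I mean (purely hypothetical, no such mechanism exists today
as far as this thread goes): operator-declared defaults would be overlaid
by whatever a CREATE TABLE statement sets explicitly.  The names
OPERATOR_DEFAULTS, effective_table_options and render_with_clause are made
up, and the values are placeholders rather than recommendations.

```python
# Hypothetical operator-declared defaults for new tables.
# The option values are placeholders, not recommendations.
OPERATOR_DEFAULTS = {
    "compaction": {"class": "LeveledCompactionStrategy"},
    "gc_grace_seconds": 86400,
    "speculative_retry": "99p",
}


def effective_table_options(explicit, defaults=OPERATOR_DEFAULTS):
    """Overlay explicitly set options on the operator defaults.

    The merge is shallow: an explicit 'compaction' map replaces the default
    map entirely instead of being merged key by key.
    """
    merged = dict(defaults)
    merged.update(explicit)
    return merged


def render_with_clause(options):
    """Render options as a CQL-style WITH clause (illustrative only)."""
    parts = []
    for key in sorted(options):
        value = options[key]
        if isinstance(value, dict):
            inner = ", ".join(f"'{k}': '{v}'" for k, v in sorted(value.items()))
            rendered = "{" + inner + "}"
        elif isinstance(value, str):
            rendered = f"'{value}'"
        else:
            rendered = str(value)
        parts.append(f"{key} = {rendered}")
    return "WITH " + " AND ".join(parts)


# A table that only overrides gc_grace_seconds picks up the other defaults.
options = effective_table_options({"gc_grace_seconds": 3600})
clause = render_with_clause(options)
```

Anything the table author sets explicitly wins, and the operator's choices
fill in the rest, which keeps the shipped defaults untouched.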

Thanks,
Andy

On Fri, Dec 6, 2024 at 10:01 PM Brad <[email protected]> wrote:

> > Could you elaborate what you mean by 'disk storage management'?
>
> I often see clusters use LCS as an easy fix to avoid the 50% disk free
> recommendation of STCS without considering the write magnification
> implications.
>
> On Fri, Dec 6, 2024 at 10:46 PM Dinesh Joshi <[email protected]> wrote:
>
>> Could you elaborate what you mean by 'disk storage management'?
>>
>> On Fri, Dec 6, 2024 at 7:30 PM Brad <[email protected]> wrote:
>>
>>> I'm -1 on LCS being the default; I've seen far too many people use it
>>> for disk storage management
>>>
>>> On Fri, Dec 6, 2024 at 10:08 PM Jon Haddad <[email protected]>
>>> wrote:
>>>
>>>> I'm -1 on LCS being the default, since using it in the wrong situations
>>>> renders clusters inoperable.
>>>>
>>>>
>>>> On Fri, Dec 6, 2024 at 7:03 PM Paulo Motta <[email protected]> wrote:
>>>>
>>>>> > I'd prefer to see the default go from STCS to UCS
>>>>>
>>>>> I’m proposing this for latest unstable (cassandra_latest.yaml) since
>>>>> it’s a more recent strategy still being adopted. For latest stable
>>>>> (cassandra.yaml) I’d prefer LCS since it does not need tuning to support
>>>>> mutable workloads (UPDATE/DELETE) and is battle-tested.
>>>>>
>>>>> On Fri, 6 Dec 2024 at 21:37 Jon Haddad <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I'd prefer to see the default go from STCS to UCS, probably with
>>>>>> scaling_parameters T4.  That's essentially the same as STCS but without
>>>>>> the ridiculous SSTable growth, allowing us to leverage the fast
>>>>>> streaming path more often.  I don't think there are any valid use cases
>>>>>> for STCS anymore now that we have UCS.
>>>>>>
>>>>>> That said, many have taken issue with the state of UCS docs, myself
>>>>>> included, so that would need to be addressed with any default change.
>>>>>>
>>>>>> I don't think we should mark TWCS as experimental.  Maybe we prevent
>>>>>> repairs to tables using TWCS, or do a better job of encouraging folks to
>>>>>> use incremental repair at higher frequencies.  It's definitely not
>>>>>> experimental though.
>>>>>>
>>>>>> Side note: I think experimental has been over-used and has lost all
>>>>>> meaning.  How is Java 17 experimental?  Very confusing for the community.
>>>>>>
>>>>>> I think TWCS should use UCS under the hood, which would address
>>>>>> streaming performance (and thus node density), or UCS could be updated
>>>>>> to allow for time window options.  Either would solve issue #3 in your
>>>>>> list.
>>>>>>
>>>>>> Jon
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Dec 6, 2024 at 5:36 PM Paulo Motta <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> It’s 2024 and users are still facing issues due to misconfigured
>>>>>>> compaction when using default configuration.
>>>>>>>
>>>>>>> I would like to start a conversation around improving compaction
>>>>>>> defaults in 5.1/trunk, so users trying out CQL transactions don’t need
>>>>>>> to worry about tuning compaction.
>>>>>>>
>>>>>>> A few suggestions:
>>>>>>>
>>>>>>> 1) Make LeveledCompactionStrategy default on cassandra.yaml, UCS
>>>>>>> default on cassandra_latest.yaml ?
>>>>>>>
>>>>>>> 2) Does TWCS work out of the box with repairs and hints? My
>>>>>>> understanding is that due to CASSANDRA-10496 this causes droppable
>>>>>>> tombstone issues when in combination with repair and hints (see more on
>>>>>>> this thread [1]). We should either fix this or mark TWCS experimental.
>>>>>>>
>>>>>>> 3) When STCS is used with deletions/TTL, tombstones accumulate in
>>>>>>> higher level SSTables when unchecked_tombstone_compaction is disabled
>>>>>>> (see CASSANDRA-6563). I propose adding a new setting “auto”, enabled
>>>>>>> by default, that will have this set to true when STCS/TWCS is used.
>>>>>>>
>>>>>>> I believe addressing these points will improve user experience with
>>>>>>> Cassandra.
>>>>>>>
>>>>>>> I apologize in advance if these topics were discussed in recent
>>>>>>> threads. I would be happy to get pointers to related discussions on
>>>>>>> this topic.
>>>>>>>
>>>>>>> I will be happy to create a JIRA if there’s agreement on addressing
>>>>>>> these items.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Paulo
>>>>>>>
>>>>>>> [1] -
>>>>>>>
>>>>>>> https://user.cassandra.apache.narkive.com/VQOacfnT/twcs-repair-create-new-buckets-with-old-data
>>>>>>> at
>>>>>>>
>>>>>>
