For a very common example, a lot of clusters are now using the k8ssandra
operator in AWS, which needs EBS.  It's incredibly easy to fall behind on
compaction there.  It's why I'm so interested in seeing CASSANDRA-15452 get
merged.  I've dealt with quite a few of these clusters; in fact, I worked on
one just this week.  They're now happily running UCS on 5.0.

Like it or not, LCS is a poor fit for a non-trivial number of teams.  Not
saying STCS doesn't have some poor use cases, but read amplification from
reading lots of SSTables is generally better for the end user than being
thousands of compactions behind.  I'm trying to do the least amount of harm
to the fewest number of teams.

@Andy - you can now set the default compaction strategy in cassandra.yaml:

# default_compaction:
#   class_name: SizeTieredCompactionStrategy
#   parameters:
#     min_threshold: 4
#     max_threshold: 32
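
As a rough sketch of what pointing that same block at UCS could look like
(this assumes default_compaction accepts UnifiedCompactionStrategy and a
scaling_parameters option; T4 is the value suggested further down the thread,
not a tested recommendation):

```yaml
# Hypothetical sketch, not a tested recommendation.
# Assumes default_compaction accepts UnifiedCompactionStrategy and that
# scaling_parameters is passed through as a UCS option.
default_compaction:
  class_name: UnifiedCompactionStrategy
  parameters:
    scaling_parameters: T4
```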

Jon


On Fri, Dec 6, 2024 at 8:58 PM Dinesh Joshi <djo...@apache.org> wrote:

> I’m genuinely curious to understand how defaulting to LCS is going to
> cause a nightmare. I am not sure what the concern is here.
>
> On Fri, Dec 6, 2024 at 8:53 PM Jon Haddad <j...@rustyrazorblade.com> wrote:
>
>> You're ignoring the other side here.  For the folks who *can't* use LCS,
>> defaulting to it is a nightmare.
>>
>> Sorry, but you can't screw over 20% of the community to make life a
>> little better for the 80%.  This is a terrible tradeoff.
>>
>>
>> Jon
>>
>> On Fri, Dec 6, 2024 at 8:36 PM Dinesh Joshi <djo...@apache.org> wrote:
>>
>>> I would argue that the vast majority of real-world workloads are read
>>> heavy. LCS would therefore be a net benefit for the average user.
>>>
>>> To mitigate the write amplification concern I would make this change and
>>> make sure it is well documented for operators so they’re not caught off
>>> guard.
>>>
>>> On Fri, Dec 6, 2024 at 8:06 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>>
>>>> And it works for that most of the time, so what’s the concern? “You
>>>> lose throughput because iops / write amplification go up, so the perf of
>>>> the default install goes down” ? (But the cost per byte goes way down,
>>>> too)?
>>>>
>>>>
>>>>
>>>> On Dec 6, 2024, at 8:01 PM, Brad <bscho...@gmail.com> wrote:
>>>>
>>>> > Could you elaborate what you mean by 'disk storage management'?
>>>>
>>>> I often see clusters use LCS as an easy fix to avoid the 50% disk free
>>>> recommendation of STCS without considering the write
>>>> amplification implications.
>>>>
>>>> On Fri, Dec 6, 2024 at 10:46 PM Dinesh Joshi <djo...@apache.org> wrote:
>>>>
>>>>> Could you elaborate what you mean by 'disk storage management'?
>>>>>
>>>>> On Fri, Dec 6, 2024 at 7:30 PM Brad <bscho...@gmail.com> wrote:
>>>>>
>>>>>> I'm -1 on LCS being the default, seen far too many people use it for
>>>>>> disk storage management
>>>>>>
>>>>>> On Fri, Dec 6, 2024 at 10:08 PM Jon Haddad <j...@rustyrazorblade.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I'm -1 on LCS being the default, since using it in the wrong
>>>>>>> situations renders clusters inoperable.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Dec 6, 2024 at 7:03 PM Paulo Motta <pa...@apache.org> wrote:
>>>>>>>
>>>>>>>> > I'd prefer to see the default go from STCS to UCS
>>>>>>>>
>>>>>>>> I’m proposing this for latest unstable (cassandra_latest.yaml)
>>>>>>>> since it’s a more recent strategy still being adopted. For latest
>>>>>>>> stable (cassandra.yaml) I’d prefer LCS since it does not need tuning
>>>>>>>> to support mutable workloads (UPDATE/DELETE) and is battle-tested.
>>>>>>>>
>>>>>>>> On Fri, 6 Dec 2024 at 21:37 Jon Haddad <j...@rustyrazorblade.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I'd prefer to see the default go from STCS to UCS, probably with
>>>>>>>>> scaling_parameters T4.  That's essentially the same as STCS but
>>>>>>>>> without the ridiculous SSTable growth, allowing us to leverage the
>>>>>>>>> fast streaming path more often.  I don't think there are any valid
>>>>>>>>> use cases for STCS anymore now that we have UCS.
>>>>>>>>>
>>>>>>>>> That said, many have taken issue with the state of UCS docs,
>>>>>>>>> myself included, so that would need to be addressed with any
>>>>>>>>> default change.
>>>>>>>>>
>>>>>>>>> I don't think we should mark TWCS as experimental.  Maybe we
>>>>>>>>> prevent repairs to tables using TWCS, or do a better job of
>>>>>>>>> encouraging folks to use incremental repair at higher frequencies.
>>>>>>>>> It's definitely not experimental though.
>>>>>>>>>
>>>>>>>>> Side note: I think experimental has been over-used and has lost
>>>>>>>>> all meaning.  How is Java 17 experimental?  Very confusing for the
>>>>>>>>> community.
>>>>>>>>>
>>>>>>>>> I think TWCS should use UCS under the hood, which would address
>>>>>>>>> streaming performance (and thus node density), or UCS could be
>>>>>>>>> updated to allow for time window options.  Either would solve
>>>>>>>>> issue #3 in your list.
>>>>>>>>>
>>>>>>>>> Jon
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Dec 6, 2024 at 5:36 PM Paulo Motta <pa...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> It’s 2024 and users are still facing issues due to misconfigured
>>>>>>>>>> compaction when using default configuration.
>>>>>>>>>>
>>>>>>>>>> I would like to start a conversation around improving compaction
>>>>>>>>>> defaults in 5.1/trunk, so users trying out CQL transactions don’t
>>>>>>>>>> need to worry about tuning compaction.
>>>>>>>>>>
>>>>>>>>>> A few suggestions:
>>>>>>>>>>
>>>>>>>>>> 1) Make LeveledCompactionStrategy default on cassandra.yaml, UCS
>>>>>>>>>> default on cassandra_latest.yaml ?
>>>>>>>>>>
>>>>>>>>>> 2) Does TWCS work out of the box with repairs and hints? My
>>>>>>>>>> understanding is that, due to CASSANDRA-10496, it causes droppable
>>>>>>>>>> tombstone issues when combined with repair and hints (see more in
>>>>>>>>>> this thread [1]). We should either fix this or mark TWCS
>>>>>>>>>> experimental.
>>>>>>>>>>
>>>>>>>>>> 3) When STCS is used with deletions/TTL, tombstones accumulate in
>>>>>>>>>> higher-level SSTables when unchecked_tombstone_compaction is
>>>>>>>>>> disabled (see CASSANDRA-6563). I propose adding a new setting,
>>>>>>>>>> “auto”, enabled by default, that sets this to true when STCS/TWCS
>>>>>>>>>> is used.
>>>>>>>>>>
>>>>>>>>>> I believe addressing these points will improve user experience
>>>>>>>>>> with Cassandra.
>>>>>>>>>>
>>>>>>>>>> I apologize in advance if these topics were discussed in recent
>>>>>>>>>> threads. I would be happy to get pointers to related discussions on
>>>>>>>>>> this topic.
>>>>>>>>>>
>>>>>>>>>> I will be happy to create a JIRA if there’s agreement on addressing
>>>>>>>>>> these items.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Paulo
>>>>>>>>>>
>>>>>>>>>> [1] -
>>>>>>>>>>
>>>>>>>>>> https://user.cassandra.apache.narkive.com/VQOacfnT/twcs-repair-create-new-buckets-with-old-data
>>>>>>>>>>
>>>>>>>>>
>>>>
