Same. I can’t think of a scenario beyond writes simply outpacing compaction throughput. What’s the 20%?
Chris
On Dec 6, 2024, at 10:58 PM, Dinesh Joshi <djo...@apache.org> wrote:
I’m genuinely curious to understand how defaulting to LCS is going to cause a nightmare. I am not sure what the concern is here.
> You're ignoring the other side here. For the folks who *can't* use LCS, defaulting to it is a nightmare.
> Sorry, but you can't screw over 20% of the community to make life a little better for the 80%. This is a terrible tradeoff.
I would argue that the vast majority of real-world workloads are read-heavy. LCS would therefore be a net benefit for the average user.
To mitigate the write amplification concern, I would make this change and make sure it is well documented for operators so they’re not caught off guard.
> And it works for that most of the time, so what’s the concern? “You lose throughput because IOPS / write amplification go up, so the perf of the default install goes down”? (But the cost per byte goes way down, too)?
> Could you elaborate what you mean by 'disk storage management'?
I often see clusters use LCS as an easy fix to avoid the 50% disk free recommendation of STCS without considering the write amplification implications.
Could you elaborate what you mean by 'disk storage management'?
I'm -1 on LCS being the default, seen far too many people use it for disk storage management
I'm -1 on LCS being the default, since using it in the wrong situations renders clusters inoperable.
> I'd prefer to see the default go from STCS to UCS
I’m proposing this for latest unstable (cassandra_latest.yaml) since it’s a more recent strategy that is still being adopted. For latest stable (cassandra.yaml) I’d prefer LCS, since it does not need tuning to support mutable workloads (UPDATE/DELETE) and is battle-tested.
I'd prefer to see the default go from STCS to UCS, probably with scaling_parameters T4. That's essentially the same as STCS but without the ridiculous SSTable growth, allowing us to leverage the fast streaming path more often. I don't think there are any valid use cases for STCS anymore now that we have UCS.
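For anyone wanting to try the UCS/T4 setting on a single table before any default changes, here is a minimal sketch that only builds the CQL statement. The helper name `ucs_alter_statement` and the keyspace/table names are made up for illustration; `UnifiedCompactionStrategy` and its `scaling_parameters` option are the names used in this thread.

```python
def ucs_alter_statement(keyspace: str, table: str, scaling_parameters: str = "T4") -> str:
    """Build an ALTER TABLE statement that switches a table to UCS.

    Hypothetical helper: it only renders a string, it does not talk to a cluster.
    """
    options = {
        "class": "UnifiedCompactionStrategy",
        "scaling_parameters": scaling_parameters,
    }
    # Render the compaction options as a CQL map literal.
    rendered = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{rendered}}};"


# Placeholder keyspace and table names.
print(ucs_alter_statement("my_ks", "my_table"))
```

The resulting statement can be pasted into cqlsh; whether T4 is the right value for a given workload is exactly what is being debated here.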
That said, many have taken issue with the state of UCS docs, myself included, so that would need to be addressed with any default change.
I don't think we should mark TWCS as experimental. Maybe we prevent repairs to tables using TWCS, or do a better job of encouraging folks to use incremental repair at higher frequencies. It's definitely not experimental though.
Side note: I think experimental has been over-used and has lost all meaning. How is Java 17 experimental? Very confusing for the community.
I think TWCS should use UCS under the hood, which would address streaming performance (and thus node density), or UCS could be updated to allow for time-window options. Either would solve issue #3 in your list.
Hi,
It’s 2024 and users are still facing issues due to misconfigured compaction when using default configuration.
I would like to start a conversation around improving compaction defaults in 5.1/trunk, so users trying out CQL transactions don’t need to worry about tuning compaction.
A few suggestions:
1) Make LeveledCompactionStrategy the default in cassandra.yaml and UCS the default in cassandra_latest.yaml?
2) Does TWCS work out of the box with repairs and hints? My understanding is that, due to CASSANDRA-10496, TWCS runs into droppable tombstone issues when combined with repair and hints (see more on this thread [1]). We should either fix this or mark TWCS experimental.
3) When STCS is used with deletions/TTL, tombstones accumulate in higher-level SSTables when unchecked_tombstone_compaction is disabled (see CASSANDRA-6563). I propose adding a new value “auto”, enabled by default, that sets this option to true when STCS/TWCS is used.
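To make suggestion 3 concrete, here is a rough sketch of how an “auto” value could resolve. This is not existing Cassandra code: the function name is hypothetical, and the behaviour for strategies other than STCS/TWCS (falling back to false, i.e. the option stays disabled) is an assumption, since the proposal only covers STCS/TWCS.

```python
# Strategies for which the proposed "auto" value would enable the option.
AUTO_ENABLED_STRATEGIES = {
    "SizeTieredCompactionStrategy",
    "TimeWindowCompactionStrategy",
}


def resolve_unchecked_tombstone_compaction(strategy_class: str, setting: str = "auto") -> bool:
    """Return the effective value of unchecked_tombstone_compaction.

    Hypothetical resolution logic for the proposed "auto" default.
    """
    value = setting.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    if value == "auto":
        # Accept both short and fully qualified class names.
        simple_name = strategy_class.rsplit(".", 1)[-1]
        # Assumption: any strategy outside STCS/TWCS keeps the option disabled.
        return simple_name in AUTO_ENABLED_STRATEGIES
    raise ValueError(f"unsupported value: {setting!r}")
```

An explicit true or false still wins, so operators who have already tuned the option would see no change in behaviour.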
I believe addressing these points will improve user experience with Cassandra.
I apologize in advance if these topics were discussed in recent threads. I would be happy to get pointers of related discussions on this topic.
I will be happy to create a JIRA ticket if there’s agreement on addressing these items.
Thanks,
Paulo