> It's also been questioned about why we don't just enable settings we recommend. These are settings we recommend for new clusters. *Our existing cassandra.yaml needs to be tailored for existing clusters being upgraded, where we are very conservative about changing defaults.*
I think this unnecessarily penalizes new users, who get subpar defaults, and existing users who want the optimized/recommended defaults and have to maintain additional logic to apply them. This change offers an opportunity to revisit that.

Is the reason we don't update the default cassandra.yaml with the new recommended configuration simply to protect existing clusters from accidentally overwriting their cassandra.yaml with a new version during a major upgrade? If so, perhaps we could add a new explicit flag “enable_major_upgrade: false” to cassandra.yaml that fails startup if an upgrade is detected, forcing operators to review the configuration before a major upgrade (see the sketch at the end of this message).

Related to Jeff’s question, I think we need a way to consolidate the “latest recommended settings” into the “old compatible defaults” when cutting a new major version; otherwise the files will diverge perpetually. I think cassandra_latest.yaml offers a way to “buffer” proposed default configuration changes, which are then consolidated into cassandra.yaml in the subsequent major release, eventually converging the configurations and reducing the maintenance burden.

On Thu, 15 Feb 2024 at 04:24 Mick Semb Wever <m...@apache.org> wrote:

>> Mick and Ekaterina (and everyone really) - any thoughts on what test
>> coverage, if any, we should commit to for this new configuration?
>> Acknowledging that we already have *a lot* of CI that we run.
>
> Branimir in this patch has already done some basic cleanup of test
> variations, so this is not a duplication of the pipeline. It's a
> significant improvement.
>
> I'm ok with cassandra_latest being committed and added to the pipeline,
> *if* the authors genuinely believe there's significant time and effort
> saved in doing so.
>
> How many broken tests are we talking about ?
> Are they consistently broken or flaky ?
> Are they ticketed up and 5.0-rc blockers ?
>
> Having to deal with flakies and broken tests is an unfortunate reality
> to having a pipeline of 170k tests.
>
> Despite real frustrations I don't believe the broken windows analogy is
> appropriate here – it's more of a leave the campground cleaner… That
> being said, knowingly introducing a few broken tests is not that either,
> but still having to deal with a handful of consistently breaking tests
> for a short period of time is not the same cognitive burden as flakies.
> There are currently other broken tests in 5.0: VectorUpdateDeleteTest,
> upgrade_through_versions_test; are these compounding to the frustrations ?
>
> It's also been questioned about why we don't just enable settings we
> recommend. These are settings we recommend for new clusters. Our existing
> cassandra.yaml needs to be tailored for existing clusters being upgraded,
> where we are very conservative about changing defaults.
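
P.S. A minimal sketch of what the proposed flag might look like in cassandra.yaml (this option does not exist today; the name and semantics are purely illustrative):

    # Illustrative only: refuse to start if the node detects it was upgraded
    # across a major version, so operators must review the configuration first.
    enable_major_upgrade: false

An operator would set this to true (or remove it) only after reviewing the new defaults for the release they are upgrading to.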