> Python Upgrade DTests today require 192x large (7 cpu, 14GB ram) servers
> We have far fewer (and more effective?) JVM Upgrade DTests.
> There we only need 8x medium (3 cpu, 5GB ram) servers

Does anyone have a strong understanding of the coverage and value offered by
the python upgrade dtests vs. the in-jvm dtests? I don't, but I intuitively
have a hard time believing the difference in value matches the difference in
hardware requirements there.

> Lots and lots of words about releases from mick (<3)

Those of you who know me know my "spidey-senses" get triggered by enough
complexity, regardless of how well justified it is. I feel like our release
process has passed that threshold for me. I've been talking a lot with Mick
about this topic for a couple of weeks, and I'm curious whether the community
sees a major flaw in a proposal like the following:

• We formally support 3 releases at a time
• We only release MAJOR versions (i.e. semver major). No more "5.0, 5.1,
5.2"; it would now be "5.0, 6.0, 7.0"
• We test and support online upgrades between supported releases
• Any removal or API breakage follows a "deprecate-then-remove" cycle:
deprecate in one release, remove no earlier than the next
• We cut a release every 12 months

*Implications for operators:*
• Upgrade paths for online upgrades are simple and clear: T-2, i.e. online
upgrades from up to two majors back (a small sketch of the arithmetic is at
the end of this mail)
• The "forced" upgrade cadence to stay on supported versions is 3 years
• If you adopt v1.0, it will be supported until v4.0 comes out 36 months
later
• This gives users the flexibility to prioritize functionality vs. stability
and to balance release validation costs
• Deprecation cycles are clear, as are compatibility paths
• Release timelines and feature availability are predictable and clear

*Implications for developers on the project:*
• Support requirements for online upgrades are clear
• The opportunity cost of a feature slipping past a release date is bounded
(worst case == an 11.99-month delay before availability in a GA supported
release)
• The path to keeping the code base maintainable is clear
(deprecate-then-remove)
• CI requirements are constrained and predictable

Moving to "online upgrades supported for everything" is something I support
in principle, but I would advocate we consider it after getting a handle on
our release process.

So - what do we lose if we consider the above approach?
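To make the support-window and T-2 arithmetic above concrete, here is a
minimal sketch. To be clear, this is illustrative only - the class, the
method names, and the example version numbers are made up for this mail,
not anything in (or proposed for) the codebase:

    // Sketch of the proposed policy: one MAJOR release every 12 months,
    // the three newest majors supported, and online upgrades tested
    // between any pair of supported majors (the "T-2" rule).
    public final class ReleasePolicySketch {
        static final int CADENCE_MONTHS = 12;   // one major per year
        static final int SUPPORTED_MAJORS = 3;  // formally supported at a time

        // A major is supported while it is one of the three newest.
        static boolean isSupported(int major, int newestMajor) {
            return major <= newestMajor && newestMajor - major < SUPPORTED_MAJORS;
        }

        // An online upgrade path is tested iff both ends are currently
        // supported (which, with three supported majors, is exactly T-2).
        static boolean onlineUpgradeTested(int from, int to, int newestMajor) {
            return from < to
                && isSupported(from, newestMajor)
                && isSupported(to, newestMajor);
        }

        public static void main(String[] args) {
            int newest = 7; // the supported set is then {5.0, 6.0, 7.0}
            System.out.println("5.0 supported: " + isSupported(5, newest));  // true
            System.out.println("4.0 supported: " + isSupported(4, newest));  // false
            System.out.println("5.0 -> 7.0 tested: "
                    + onlineUpgradeTested(5, 7, newest));                    // true
            // A major stays supported until the third newer major ships:
            // SUPPORTED_MAJORS * CADENCE_MONTHS = 36 months.
            System.out.println("Support window: "
                    + (SUPPORTED_MAJORS * CADENCE_MONTHS) + " months");
        }
    }

Run as-is, it confirms the numbers in the bullets: a 36-month support window,
and any supported release can upgrade online directly to the newest one.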
On Tue, Jan 28, 2025, at 8:23 AM, Mick Semb Wever wrote:
> Jordan, replies inline.
>
>> To take a snippet from your email, "A little empathy for our users goes a
>> long way." While I agree clarity is important, forcing our users to
>> upgrade multiple times is not in their best interest.
>
> Yes – by now saying we aim for online compatibility across all versions,
> we would be moving in that direction. But how feasible that turns out to
> be depends on our future actions and new versions.
>
> The separation between "the code maintains compatibility across all
> versions" and "we only actively test these upgrade paths, so that's our
> limited recommendation" is what lets us reduce the "forcing our users to
> upgrade multiple times". That's the "other paths may work, but you're on
> your own – do your homework" aspect. This is a position that allows us to
> progress into something better.
>
> For now, using the current status quo of major/minor usage as the
> implemented example: this would move us to no longer needing major
> versions (we would just test all upgrade paths between all currently
> maintained versions, CI resources permitting).
> The community can change over time as well, so it's worth thinking about
> an approach that adjusts to changing resources. (This includes the effort
> required to document past, present, and future versions, especially as
> changes are made.)
>
> I emphasise: first, I think we need to focus on maintaining compatibility
> in the code (and on how and when we are willing, or need, to break it).
>
>> At the same time, don't fewer testing resources primarily translate to
>> longer test runs?
>
> Too much load also saturates the testing cluster, to the point where tests
> become flaky and fail. ci-cassandra.a.o is already better at exposing
> flaky tests than other systems. This is a practicality, and it's
> constantly being improved, but only under volunteer time. Donating test
> hardware is the simpler ask.
>
>> Upgrade tests don't need to be run on every commit. When I worked on Riak
>> we had very comprehensive upgrade testing (pretty much the full matrix of
>> versions), and we had a schedule on which we ran these tests ahead of
>> release.
>
> We are already struggling to stay on top of failures and flakies with
> ~per-commit builds and butler.c.a.o.
> I'm not against the idea of scheduled test runs, but it needs more input
> and effort from people in that space to make it happen.
>
> I am not fond of the idea of "tests ahead of release" – release managers
> already do enough and are a scarce resource. Asking them to also be the
> build butler, chasing down bugs and the people to fix them, is not
> appropriate IMO. I also think it's unwise without a guarantee that the
> contributor/committer who introduced the bug is available at release time.
> Having just one post-commit pipeline has nice benefits in simplicity; as
> long as it's feasible, slow is ok (as you say above).
>
>> Could you share some more details on the resource issues and their
>> impacts?
>
> Python Upgrade DTests and JVM Upgrade DTests.
>
> Python Upgrade DTests today require 192x large (7 cpu, 14GB ram) servers,
> each running for up to one hour.
> Currently we have too many upgrade paths (4.0, 4.1, 5.0, to trunk), and we
> are seeing builds abort because of timeouts (>1hr). Collected timing
> numbers suggest we should either double this number to 384 or simply
> remove some of the upgrade paths we test.
>
> https://github.com/apache/cassandra/blob/trunk/.jenkins/Jenkinsfile#L185-L188
> https://github.com/apache/cassandra/blob/trunk/.jenkins/Jenkinsfile#L37
>
> We have far fewer (and more effective?) JVM Upgrade DTests.
> There we only need 8x medium (3 cpu, 5GB ram) servers.
> https://github.com/apache/cassandra/blob/trunk/.jenkins/Jenkinsfile#L177
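A footnote on the path-count arithmetic in Mick's mail above: if we tested
all upgrade paths between all currently maintained versions, as Mick floats,
the count grows quadratically with the number of versions. A toy
illustration (the version list is just the one Mick names; nothing here is
project code):

    // With n maintained versions, a full upgrade matrix has n*(n-1)/2
    // older->newer paths, so trimming tested paths shrinks CI cost quickly.
    public final class UpgradePathCount {
        public static void main(String[] args) {
            String[] maintained = {"4.0", "4.1", "5.0", "trunk"};
            int n = maintained.length;
            System.out.println(n + " versions => " + (n * (n - 1) / 2)
                    + " upgrade paths"); // 4 versions => 6 paths
            // Dropping one maintained version halves the count here (6 -> 3),
            // while adding one takes it to 10.
        }
    }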