*** Proposal ***

Aligned to the agreed-upon annual cadence of supported releases, let's use semantic versioning for better ecosystem operability, and to promote API awareness and compatibility support, from documentation to tests.
*** Background ***

The recent [1] dev ML thread 'Releases after 4.0' landed on an annual release cadence and on promoting an always shippable trunk (repeated in the roadmap thread [2]). A digression in that thread concerned the use of semantic versioning, and the possible role of properly using major and minor versions within the annual release cycle. This proposal is an attempt to take those points of view and build them on everything else we have agreed upon so far.

*** Ecosystem Operability ***

The Cassandra codebase has an ecosystem around it, from downstream projects, to vendors providing support for versions, to managed DBaaS. We can help them out with semver, and by providing unreleased minor versions through the year. Unreleased means we don't do a formal Apache release approval, we just bump the version in `build.xml`.

Downstream projects face overhead when either trying to keep up with trunk through each annual development cycle, or trying to rebase against a whole year's worth of development once each year. Unreleased versions will provide safe points for the ecosystem to plug into and keep up with. Vendors are also free to support and provide hot-fixes and backports on these unreleased versions, outside of the community's efforts or concerns. And of course semver provides a lot of value to downstream codebases.

*** API and Compatibility Awareness ***

The idea here is to provide awareness of, and improved documentation for, our APIs, their audience, and what compatibility is required of them. Personally, I still struggle to get my head around all the ways Cassandra can break its APIs, and what to think about and test when coding. This is important for ensuring availability during upgrades (mixed-version clusters), and again important if we want to introduce data-safe downgrades. This stuff doesn't get (battle-)tested enough.
The native protocol bump to v6 was an example of the need to be better at documenting and testing what's involved (across the ecosystem). The consequences of breaking compatibility range from documentation and tests, to mixed-version clusters, to upgrade and rollback operations.

Semantic versioning is a way of foreseeing and preparing for such changes. In practice this can be done by a) using different fixVersions in Jira tickets, and b) lazily incrementing the major version in trunk when the first breaking change lands in the development cycle. For example, we enter the next development cycle with Jira fixVersions of "4.X" and "5.X", and an initial trunk version of "4.1". Then when a committer merges the first "5.X" ticket they bump trunk's version up to "5.0". This approach incentivises patches to be aware of, and to better document, the breakage, and comes with the added benefit for the ecosystem of identifying where in the development cycle compatibility first broke.

Some examples of compatibility areas are CQL, Native Protocol, gossip, JMX, Metrics, Virtual Tables, SSTable, CDC, Commitlog, FQL, and Auditing. Many of these don't have enough documentation on how they are versioned and what compatibility they must maintain. As we add pluggability (i.e. SPIs), both the need to document this and the need to be closer to the ecosystem increase.
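To make the lazy-increment rule above concrete, here is a minimal sketch in Python. It is only an illustration of the rule as described, not existing project tooling: the names `Version`, `on_merge`, `periodic_bump` and `replay` are hypothetical, and it assumes fixVersion labels always look like "4.X" or "5.X".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """A major.minor trunk version (patch level omitted for brevity)."""
    major: int
    minor: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}"


def on_merge(trunk: Version, fix_version: str) -> Version:
    """Trunk version after merging a ticket labelled e.g. "4.X" or "5.X"."""
    # Assumption: the label's first dot-separated field is the major version.
    ticket_major = int(fix_version.split(".")[0])
    if ticket_major > trunk.major:
        # First breaking change of the cycle: bump major, reset minor.
        return Version(ticket_major, 0)
    # Compatible change, or the major was already bumped: nothing to do.
    return trunk


def periodic_bump(trunk: Version) -> Version:
    """Unreleased minor version: only the minor number moves."""
    return Version(trunk.major, trunk.minor + 1)


def replay(start: Version, events: list) -> list:
    """Apply events in order and return the trunk version after each one.

    Each event is either the string "bump" (a periodic minor increment)
    or a fixVersion label of a merged ticket.
    """
    history = []
    trunk = start
    for event in events:
        if event == "bump":
            trunk = periodic_bump(trunk)
        else:
            trunk = on_merge(trunk, event)
        history.append(str(trunk))
    return history
```

For instance, `replay(Version(4, 1), ["4.X", "bump", "bump", "5.X", "5.X", "bump"])` walks trunk through 4.1, 4.2, 4.3, 5.0, 5.0 and 5.1. Only the first "5.X" merge moves the major version, which is what lets the ecosystem see where compatibility first broke.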
*** Example for 2021-2022 ***

Illustrating this in action, with a cadence of a minor version every quarter:
 - today, we branch `cassandra-4.0` and increment trunk to 4.1,
 - commits roll into trunk, no "5.X" tickets have landed yet,
 - in July we increment the version to 4.2, no release is made or announced,
 - commits continue to roll into trunk, still no "5.X" tickets have landed yet,
 - in October we increment the version to 4.3,
 - commits continue to roll into trunk, a "5.X" patch lands, trunk is incremented to 5.0,
 - in January 2022 we increment the version to 5.1, no release is made or announced,
 - commits continue to roll into trunk,
 - in April 2022 we formally release 5.1 and branch `cassandra-5.1`.

The cadence of those minor versions could be anything: quarterly, monthly, or on-demand. This practice will force us to organise and automate dealing with version changes, creating our release branches, and organising our test upgrade version paths. I'm currently gathering that process in CASSANDRA-16642.

Jeremiah originally (and in more depth) illustrated this here: https://lists.apache.org/thread.html/r9b53342e6992cf98e8b95e763f63d19c798765be3bd86436f07afa8c%40%3Cdev.cassandra.apache.org%3E

*** Concerns ***

Addressing the questions and concerns that were previously raised.

We have a problematic history with release versioning. This proposal is not tick-tock. It is about known best practices around semver version numbers. It does not add the overhead of additional releases or release branches to the community.

Long development cycles with only a (major) release every year will be an opposing force to our efforts to maintain an always shippable trunk. Semver, closer and more frequent feedback from the ecosystem, and better API awareness all help us maintain an always shippable trunk. This was touched on by Benedict's "quarterly 'develop' releases" and by Benjamin's "bleeding edge snapshots where we do not guarantee stability".
Individual features, new and old, can still be marked with their own maturity-state flag, e.g. experimental, unstable, stable, deprecated. This is all aside from semver, though it is part of, and feeds into, the API awareness. Deprecating and removing individual features should be easier too, as their lifecycle avoids being tied to the annual releases.

"Our major/minor history has been a meaningless distinction". This proposal is an attempt to fix that. With better API awareness, and a way to give the ecosystem what it needs sooner, I believe it will help us better limit what we put into our patch releases.

Could we cut releases off such quarterly minor versions but not maintain them? This was the general proposition in the previous thread, and while it is possible, and would leave such unsupported releases in an easy-to-download location with the ASF, it is left out for simplicity's sake. All downstreams can use the minor versions easily enough with or without a formal ASF-compliant release. But it is something we can add in the future if it is called for and we have the bandwidth. It could also be possible to better stage our development builds (using nexus, artifactory, etc).

*** Summary ***

I'm creating the cassandra-4.0 branch and will bump trunk to version "4.1" for now, until the discussion lands…

I'm sure there are other concerns and suggestions. I can also write this up as a CEP if that's called for.

[1] https://lists.apache.org/thread.html/re15543b55e5d01245ad75f7ec35af97e9895d37c01562eab31963dd4%40%3Cdev.cassandra.apache.org%3E
[2] https://lists.apache.org/thread.html/r611316edc1c6b8d331994b4625c1a4d52ae5d5aee0bf4a158b2618ba%40%3Cdev.cassandra.apache.org%3E