Re: [DISCUSS] Improving the operational safety and simplicity of in-place major version upgrades

Josh McKenzie Wed, 11 Jun 2025 06:54:45 -0700

+1 that this is something that would be very valuable to our users.

As to where it lives, my gut reaction is to have the coordination, stop/start, 
etc live inside the sidecar and lean on the DB for the durable distributed lock 
to make the process atomic. Ultimately I'd like to see us move to the sidecar 
as control plane (see CEP-1 
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652224>), 
so having this functionality inside the sidecar would fit nicely with some of 
the other things we discussed, i.e.:
 1. Stop nodes safely
 2. Start nodes safely
 3. Rolling restart a cluster safely w/respect to racks and availability
 4. Plan and execute an upgrade

> c) Performing capability limitation during an upgrade
+1 to this as well; we never did get to consensus re: the capabilities 
framework, did we? I think that kind of fizzled / got derailed on whether we 
should use TCM as a durable delivery mechanism for capabilities or not 
(ponymail link 
<https://lists.apache.org/thread/9oztm7o1czky3tywp3mxskb529nrn2wv>).
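
To make the sidecar idea a bit more concrete, here is a minimal Python sketch of a rack-aware rolling restart that runs under a single cluster-wide lock. Everything in it is an assumption for illustration: `InMemoryLock` is only a stand-in for the durable distributed lock the DB would provide, and `rolling_restart`, `restart_node`, and the `owner` name are hypothetical, not existing sidecar or Cassandra APIs.

```python
from collections import defaultdict
from contextlib import contextmanager


class InMemoryLock:
    """Stand-in for a durable distributed lock (hypothetical; the real one would live in the DB)."""

    def __init__(self):
        self.holder = None

    @contextmanager
    def hold(self, owner):
        if self.holder is not None:
            raise RuntimeError(f"lock already held by {self.holder}")
        self.holder = owner
        try:
            yield
        finally:
            # Always release, even if the restart aborts part-way through.
            self.holder = None


def rolling_restart(nodes, restart_node, lock, owner="sidecar-1"):
    """Restart nodes one at a time, finishing each rack before starting the next.

    nodes: list of (node_id, rack) tuples.
    restart_node: callable that stops and starts one node and returns True on success.
    """
    by_rack = defaultdict(list)
    for node_id, rack in nodes:
        by_rack[rack].append(node_id)

    restarted = []
    with lock.hold(owner):  # only one coordinator may drive the restart
        for rack in sorted(by_rack):
            for node_id in by_rack[rack]:
                if not restart_node(node_id):
                    # Abort on the first failure so no further nodes are taken down.
                    raise RuntimeError(f"restart failed on {node_id}; aborting")
                restarted.append(node_id)
    return restarted


nodes = [("n1", "rack-a"), ("n2", "rack-b"), ("n3", "rack-a")]
order = rolling_restart(nodes, lambda node_id: True, InMemoryLock())
# order visits n1 and n3 (rack-a) before n2 (rack-b)
```

The sketch only captures ordering and mutual exclusion; a real implementation would also need health checks between restarts and a lock that survives coordinator failure.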


On Wed, Jun 11, 2025, at 2:34 AM, Paulo Motta wrote:
> Hi,
> 
> One of the most important operational features of Cassandra is how easy it is 
> (or should be) to do an in-place upgrade. The in-place upgrade procedure 
> essentially consists of rolling-restarting the cluster while updating the jar 
> to the new version, while following additional upgrade instructions from 
> NEWS.txt. In practice, as new features are added and existing features are 
> extended, the upgrade procedure gets more complex, placing more burden on 
> operators to ensure a smooth upgrade process.
> 
> For example, updating the storage_compatibility_mode from 4.0 to 5.0 requires 
> 3 cluster-wide restarts[1]. Another example is that upgrading to Cassandra 
> 6.0 prohibits operations like schema changes, node replacement, bootstrap, 
> decommission, move, assassinate before all the nodes are migrated to CMS[2]. 
> I don't want to focus on these particular examples; they just illustrate 
> that a lot of manual steps and caution are required to perform in-place 
> upgrades safely and smoothly.
> 
> In order to improve this, I would like to propose extending Cassandra to 
> allow an operator to register an upgrade intent with the goals of:
> a) Tracking the upgrade progress in a system table
> b) Verifying the correctness and improving the safety of the upgrade process
> c) Performing capability limitation during an upgrade
> d) Performing pre- and post-upgrade actions automatically, when registered in the 
> upgrade plan by the operator
> 
> While there is upgrade awareness in the server, it is mostly reactive and 
> scattered across different modules (as far as I last saw). A potential side 
> goal of this effort is to centralize upgrade handling code from different 
> features in the same module, allowing different features to specify upgrade 
> pre/post actions/conditions more uniformly. This would allow developers, for 
> example, to specify upgrade constraints via testable code instead of notes 
> in NEWS.txt that one hopes a careful operator will read.
> 
> The upgrade plan would be registered in a system table and tracked by an 
> upgrade manager module that would prevent certain operations (i.e. range 
> movements/schema changes) when an upgrade plan is active or emit 
> errors/warnings when anomalies are encountered. A few safety/usability 
> improvements can be enabled when the upgrade plan is registered in the 
> server, among others:
> a) A node could fail startup if it tries to start in a version different from 
> the one specified in the currently active upgrade plan.
> b) If a latency degradation or other SLO degradation is detected while an 
> upgrade plan is active, then warnings could be emitted allowing operators to 
> more easily detect upgrade issues. 
> c) When the upgrade is determined to be completed successfully, nodes can 
> coordinate running upgrade-sstables or other post-operations according to a 
> policy specified in the upgrade plan (e.g. by rack/DC).
> 
> To give an example of what the API would look like, a user wishing to 
> upgrade a cluster from version 4.1 to 5.0 would register the upgrade intent 
> via an API, e.g.:
> 
>   nodetool upgradeplan create --target 5.0.4 --disable-schema-changes \
>     --post-action upgrade-sstables \
>     --post-action upgrade-storage-compatibility-mode
> 
> It would not be possible to create another upgrade plan while one is 
> already in progress.
> 
> The ultimate goal is that the upgrade process to any version will be as 
> simple as registering an upgrade plan and performing a rolling restart of 
> the cluster on the desired target version. Any additional actions would be 
> autonomously coordinated by the servers based on the upgrade progress and 
> according to the preferences specified in the upgrade plan.
> 
> A related, and probably broader, topic is upgrading features. A couple of 
> examples that come to mind are upgrading Paxos[2] or migrating to incremental 
> repair[3]. Like version upgrades, these feature upgrades require a series of 
> steps to be executed in a specific order and sometimes global coordination. 
> While this suggestion focuses on version upgrades, it can potentially be 
> extended to track feature upgrades.
> 
> I would appreciate your feedback on this draft suggestion to check if it 
> makes sense before elaborating it into a more detailed proposal, as well as 
> pointers to other efforts or past proposals that might be related to this.
> 
> Thanks,
> Paulo
> 
> [1] - https://github.com/apache/cassandra/blob/trunk/NEWS.txt#L15-L21
> [2] - https://github.com/apache/cassandra/blob/trunk/NEWS.txt#L142C1-L148C19
> [3] - https://lists.apache.org/thread/06bl99mt502k7lowd5ont9jtnf5p0t05
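
As a rough illustration of the upgrade-plan behaviour proposed in the quoted message, the following Python sketch models three of the rules it describes: only one plan may be active at a time, selected operations are rejected while a plan is active, and a node fails startup if its version differs from the plan's target. All names here (`UpgradePlan`, `UpgradeManager`, `create_plan`, `check_operation`, `node_started`) are hypothetical and exist only for this example; the proposal does not define an implementation, and a real one would persist the plan in a system table rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class UpgradePlan:
    target_version: str
    blocked_operations: frozenset = frozenset()
    post_actions: tuple = ()
    upgraded_nodes: set = field(default_factory=set)


class UpgradeManager:
    """In-memory model of the proposed upgrade manager (hypothetical)."""

    def __init__(self, cluster_nodes):
        self.cluster_nodes = set(cluster_nodes)
        self.active_plan = None

    def create_plan(self, target_version, blocked_operations=(), post_actions=()):
        # Only one upgrade plan may be in progress at a time.
        if self.active_plan is not None:
            raise RuntimeError("an upgrade plan is already in progress")
        self.active_plan = UpgradePlan(
            target_version, frozenset(blocked_operations), tuple(post_actions)
        )
        return self.active_plan

    def check_operation(self, operation):
        # Reject operations the active plan has disabled (e.g. schema changes).
        plan = self.active_plan
        if plan is not None and operation in plan.blocked_operations:
            raise PermissionError(
                f"{operation} is disabled while upgrading to {plan.target_version}"
            )

    def node_started(self, node_id, version):
        """Record a node start; return the post-actions to run once all nodes are upgraded."""
        plan = self.active_plan
        if plan is None:
            return []
        if version != plan.target_version:
            # Fail startup when the node's version differs from the plan's target.
            raise RuntimeError(
                f"{node_id} started on {version}, expected {plan.target_version}"
            )
        plan.upgraded_nodes.add(node_id)
        if plan.upgraded_nodes >= self.cluster_nodes:
            self.active_plan = None  # upgrade complete; the plan is closed
            return list(plan.post_actions)
        return []


manager = UpgradeManager(["n1", "n2"])
manager.create_plan(
    "5.0.4",
    blocked_operations={"schema-change"},
    post_actions=["upgrade-sstables"],
)
```

Returning the post-actions from `node_started` keeps the sketch small; how nodes would actually coordinate those actions (for example by rack or DC) is left open, as it is in the proposal.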
