It's kind of a shame we don't have rolling restart functionality built into the database or the sidecar. I know we've discussed that in the past.
+1 to Jon's question: clients (e.g. the Java driver) should be able to handle disconnects gracefully and route to other coordinators, leaving a blip in latency as the only application-facing symptom. Are you seeing something more painful than that, or is it more a matter of not having built-in tooling / instrumentation to make it a clean, reproducible operation?

On Tue, Dec 17, 2024, at 2:24 PM, Jon Haddad wrote:
> Just curious, why is a rolling restart difficult? Is it a tooling issue,
> stability, just overall fear of messing with things?
>
> You *should* be able to do a rolling restart without it being an issue. I
> look at this as a fundamental workflow that every C* operator should have
> available, and you should be able to do them without there being any concern.
>
> Jon
>
>
> On 2024/12/17 16:01:06 Paul Chandler wrote:
> > All,
> >
> > We are getting a lot of push back on the 3 stage process of going through
> > the three compatibility modes to upgrade to Cassandra 5. This basically
> > means 3 rolling restarts of a cluster, which will be difficult for some of
> > our large multi DC clusters.
> >
> > Having researched this, it looks like, if you are not going to create large
> > TTL’s, it would be possible to go straight from C*4 to C*5 with SCM NONE.
> > This seems to be the same as it would have been going from 4.0 -> 4.1
> >
> > Is there any reason why this should not be done? Has anyone had experience
> > of upgrading in this way?
> >
> > Thanks
> >
> > Paul Chandler