We have similar issues with 3.x repairs, and run them manually as well as with Reaper. Can someone tell me: if I cannot get a table repaired because it is locking up a node, is it still possible to upgrade to 4.0?

Jeff

From: Jon Haddad <j...@rustyrazorblade.com>
Reply-To: <user@cassandra.apache.org>
Date: Tuesday, December 17, 2024 at 2:20 PM
To: <user@cassandra.apache.org>
Subject: Re: Cassandra 5 Upgrade - Storage Compatibility Modes

I strongly suggest moving to 4.0 and setting up Reaper. Managing repairs yourself is a waste of time, and you're almost certainly not doing it optimally.

Jon

On Tue, Dec 17, 2024 at 12:40 PM Miguel Santos-Lopez <mlo...@ims.tech> wrote:

We haven’t had the chance to upgrade to 4, let alone 5. Has there been a big change with respect to repairs since the old days of 3.11? :-)

In my experience the problems have been, on the one hand, a performance and latency hit, and on the other, a lack of flexibility in the tooling: I often had repairs fail, and the only option I know of with plain nodetool is to restart the repair. I ended up wrapping the call to nodetool in a bash script that allows only selected keyspaces and tables to be repaired. This way I get a clear picture of what failed and can then do a reliable “resume” with very little extra effort.

I would also add the time it takes. Afaik you don’t want to run more than two repairs at the same time. Depending on the load and the number of nodes, it easily becomes a tedious task.

My view might well be biased by running that old version on a less-than-optimal cluster (improved only a couple of weeks ago, so I still have to see how that translates to repairs).

Miguel A. Santos
Senior Platform Engineer
e mlo...@ims.tech w ims.tech t +1 226 339 8357
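
[Editor's sketch] The wrapper approach Miguel describes (repair only selected keyspaces and tables, record what failed, resume later) could look roughly like the following. This is a minimal sketch in Python rather than his bash script, which is not shown in the thread. The names `repair_tables`, `run_nodetool_repair` and the JSON state file are hypothetical; the only real command assumed is `nodetool repair <keyspace> <table>`, with no extra options.

```python
import json
import subprocess
from pathlib import Path


def run_nodetool_repair(keyspace, table):
    """Run `nodetool repair <keyspace> <table>`; True if it exits with 0."""
    result = subprocess.run(["nodetool", "repair", keyspace, table])
    return result.returncode == 0


def repair_tables(targets, state_file, runner=run_nodetool_repair):
    """Repair each (keyspace, table) pair, skipping pairs already completed.

    Completed pairs are written to state_file (hypothetical JSON list) after
    every success, so rerunning the function resumes where the last run
    stopped. Returns the list of "keyspace.table" names that failed.
    """
    path = Path(state_file)
    done = set(json.loads(path.read_text())) if path.exists() else set()
    failed = []
    for keyspace, table in targets:
        key = f"{keyspace}.{table}"
        if key in done:
            continue  # repaired in an earlier run
        if runner(keyspace, table):
            done.add(key)
            path.write_text(json.dumps(sorted(done)))
        else:
            failed.append(key)
    return failed
```

Calling `repair_tables([("ks1", "users"), ("ks1", "events")], "repair_state.json")` twice would retry only the tables that failed the first time. Repairs run one at a time here, which stays within the "no more than two at once" rule of thumb mentioned above.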

From: Josh McKenzie <jmcken...@apache.org>
Sent: Tuesday, December 17, 2024 3:11:06 PM
To: user@cassandra.apache.org <user@cassandra.apache.org>
Subject: Re: Cassandra 5 Upgrade - Storage Compatibility Modes

It's kind of a shame we don't have rolling restart functionality built into the database / sidecar. I know we've discussed that in the past.

+1 to Jon's question: clients (e.g. the Java driver) should be able to handle disconnects gracefully and route to other coordinators, leaving a blip in latency as the only application-facing symptom. Are you seeing something else more painful, or is it more a matter of not having the built-in tooling / instrumentation to make it a clean, reproducible operation?

On Tue, Dec 17, 2024, at 2:24 PM, Jon Haddad wrote:

Just curious, why is a rolling restart difficult? Is it a tooling issue, stability, just overall fear of messing with things?

You *should* be able to do a rolling restart without it being an issue. I look at this as a fundamental workflow that every C* operator should have available, and you should be able to do them without there being any concern.

Jon

On 2024/12/17 16:01:06 Paul Chandler wrote:
> All,
>
> We are getting a lot of push back on the 3 stage process of going through the
> three compatibility modes to upgrade to Cassandra 5. This basically means 3
> rolling restarts of a cluster, which will be difficult for some of our large
> multi DC clusters.
>
> Having researched this, it looks like, if you are not going to create large
> TTL’s, it would be possible to go straight from C*4 to C*5 with SCM NONE.
> This seems to be the same as it would have been going from 4.0 -> 4.1
>
> Is there any reason why this should not be done? Has anyone had experience of
> upgrading in this way?
>
> Thanks
>
> Paul Chandler
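
[Editor's sketch] To make the cost of the three-stage process concrete, here is a rough Python outline of one rolling restart per compatibility mode. It assumes the three modes are the `storage_compatibility_mode` values `CASSANDRA_4`, `UPGRADING` and `NONE`, applied in that order. The callables `set_mode`, `restart_node` and `is_healthy` are hypothetical hooks standing in for whatever tooling a site already has (config management, service restarts, health checks); nothing here is a built-in Cassandra or sidecar API.

```python
import time

# Assumed values of `storage_compatibility_mode`, in upgrade order.
SCM_STAGES = ["CASSANDRA_4", "UPGRADING", "NONE"]


def rolling_restart(nodes, restart_node, is_healthy,
                    poll_seconds=10, max_polls=60, sleep=time.sleep):
    """Restart nodes one at a time, waiting for each to report healthy."""
    for node in nodes:
        restart_node(node)
        for _ in range(max_polls):
            if is_healthy(node):
                break
            sleep(poll_seconds)
        else:
            # Never continue to the next node while one is still down.
            raise RuntimeError(f"{node} did not come back; stopping the rollout")


def staged_upgrade(nodes, set_mode, restart_node, is_healthy, **kwargs):
    """Run one full rolling restart of the cluster per compatibility mode."""
    for mode in SCM_STAGES:
        def apply_and_restart(node, mode=mode):
            set_mode(node, mode)   # hypothetical: update the node's config
            restart_node(node)     # hypothetical: restart the service
        rolling_restart(nodes, apply_and_restart, is_healthy, **kwargs)
```

With N nodes this performs 3 × N restarts in total, which is the overhead Paul is asking about. The sketch says nothing about whether skipping straight to `NONE` is safe.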