Hi Jeff,

Repair is not a prerequisite for upgrading from 3.x to 4.x (but it's always recommended to run it as a continuous process). Repair is not supported between nodes running different major versions, so it should be disabled during the upgrade. There are quite a few fixes for hung repair sessions and performance improvements in 4.x as well, so you may find that repair runs more smoothly for you after upgrading.

– Scott
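A minimal sketch of the mixed-version check that advice implies, assuming nodetool is on the PATH and that "nodetool gossipinfo" prints RELEASE_VERSION entries per endpoint; the exact output format varies by version, so verify the parsing against your own cluster before relying on it:

    #!/usr/bin/env python3
    """Sketch: refuse to re-enable scheduled repair while the cluster runs mixed major versions.
    Assumptions, not from the thread: nodetool is on PATH and 'nodetool gossipinfo' prints
    RELEASE_VERSION entries per endpoint; verify the parsing against your own output."""
    import subprocess
    import sys

    def release_versions() -> set[str]:
        out = subprocess.run(["nodetool", "gossipinfo"],
                             capture_output=True, text=True, check=True).stdout
        # Entries typically look like "  RELEASE_VERSION:44:4.0.13"; keep the last field.
        return {line.rsplit(":", 1)[-1].strip()
                for line in out.splitlines() if "RELEASE_VERSION" in line}

    if __name__ == "__main__":
        versions = release_versions()
        if not versions:
            sys.exit("No RELEASE_VERSION entries found; check the gossipinfo output format.")
        majors = {v.split(".", 1)[0] for v in versions}
        if len(majors) > 1:
            sys.exit(f"Mixed major versions {sorted(versions)}: keep repair disabled.")
        print(f"Single major version {sorted(versions)}: safe to re-enable scheduled repair.")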
On Dec 17, 2024, at 4:30 PM, Jeff Masud <jeff@deasil.works> wrote:

We have similar issues with 3.x repairs, and run manually as well as with Reaper. Can someone tell me, if I cannot get a table repaired because it is locking up a node, is it still possible to upgrade to 4.0?

Jeff

From: Jon Haddad <j...@rustyrazorblade.com>
Reply-To: <user@cassandra.apache.org>
Date: Tuesday, December 17, 2024 at 2:20 PM
To: <user@cassandra.apache.org>
Subject: Re: Cassandra 5 Upgrade - Storage Compatibility Modes

I strongly suggest moving to 4.0 and setting up Reaper. Managing repairs yourself is a waste of time, and you're almost certainly not doing it optimally.

Jon

On Tue, Dec 17, 2024 at 12:40 PM Miguel Santos-Lopez <mlo...@ims.tech> wrote:
We haven’t had the chance to upgrade to 4, let alone 5. Has there been a big change wrt repairs since the old days of 3.11? :-)

In my experience the problems have been, on one hand, a performance & latency hit, but also a lack of flexibility in the tooling: often I had repairs failing, and the only option I know of using plain nodetool is to restart the repair again. I ended up wrapping the call to nodetool in a bash script allowing only selected keyspaces and tables to be repaired. This way I get a clear picture of what failed and can then do a reliable “resume” with very little extra effort.

I would also add the time it takes. Afaik you don’t want to run more than two repairs at the same time, and depending on the load and number of nodes it easily becomes a tedious task.

My view might well be biased by running that old version on a less than optimal cluster (improved only a couple of weeks ago), so I still have to see how it translates to repairs.

Miguel A. Santos
Senior Platform Engineer
e mlo...@ims.tech  w ims.tech  t +1 226 339 8357
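A minimal Python sketch of the wrapper idea described above (the original was a bash script around nodetool); the keyspace/table names and the JSON progress file are placeholders, and -pr assumes the wrapper is run on every node in turn:

    #!/usr/bin/env python3
    """Sketch of a resumable per-table repair wrapper (the same idea, in Python rather than
    bash). Assumptions, not from the thread: nodetool is on PATH, one repair runs at a time,
    and a local JSON file is enough bookkeeping. Keyspace/table names are placeholders."""
    import json
    import subprocess
    from pathlib import Path

    STATE_FILE = Path("repair_progress.json")      # hypothetical progress file
    TABLES = [("my_keyspace", "events"),           # hypothetical keyspace/table pairs
              ("my_keyspace", "users")]

    def load_done() -> set[str]:
        return set(json.loads(STATE_FILE.read_text())) if STATE_FILE.exists() else set()

    def mark_done(done: set[str]) -> None:
        STATE_FILE.write_text(json.dumps(sorted(done)))

    def repair(keyspace: str, table: str) -> bool:
        # -pr repairs only this node's primary ranges, so run the wrapper on every node in turn.
        return subprocess.run(["nodetool", "repair", "-pr", keyspace, table]).returncode == 0

    if __name__ == "__main__":
        done = load_done()
        for ks, tbl in TABLES:
            key = f"{ks}.{tbl}"
            if key in done:
                continue                            # repaired on a previous run; skip on resume
            if repair(ks, tbl):
                done.add(key)
                mark_done(done)                     # persist after every success so a rerun resumes
            else:
                print(f"repair failed for {key}; fix the node and rerun to resume from here")
                break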
From: Josh McKenzie <jmcken...@apache.org>
Sent: Tuesday, December 17, 2024 3:11:06 PM
To: user@cassandra.apache.org <user@cassandra.apache.org>
Subject: Re: Cassandra 5 Upgrade - Storage Compatibility Modes

It's kind of a shame we don't have rolling restart functionality built into the database / sidecar. I know we've discussed that in the past.

+1 to Jon's question - clients (e.g. the Java driver) should be able to handle disconnects gracefully and route to other coordinators, leaving the application-facing symptom as a blip in latency. Are you seeing something else more painful, or is it more just not having the built-in tooling / instrumentation to make it a clean, reproducible operation?
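A sketch of the client-side behaviour being described, using the DataStax Python driver (cassandra-driver) as the example client; the Java driver exposes equivalent policies, and the contact points and datacenter name here are placeholders:

    """Sketch: client configuration that tolerates a node restarting mid-rolling-restart.
    Assumptions, not from the thread: cassandra-driver is installed, contact points and the
    datacenter name are placeholders for your own cluster."""
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.policies import (DCAwareRoundRobinPolicy, ExponentialReconnectionPolicy,
                                    TokenAwarePolicy)

    profile = ExecutionProfile(
        # Prefer replicas in the local DC; a coordinator that is restarting is simply skipped
        # in favour of the remaining live nodes.
        load_balancing_policy=TokenAwarePolicy(DCAwareRoundRobinPolicy(local_dc="dc1")),
    )

    cluster = Cluster(
        contact_points=["10.0.0.1", "10.0.0.2"],   # placeholder addresses
        execution_profiles={EXEC_PROFILE_DEFAULT: profile},
        # Keep retrying a downed host with growing backoff until it rejoins.
        reconnection_policy=ExponentialReconnectionPolicy(base_delay=1.0, max_delay=60.0),
    )
    session = cluster.connect()                    # surviving nodes keep serving during the restart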
On Tue, Dec 17, 2024, at 2:24 PM, Jon Haddad wrote:

Just curious, why is a rolling restart difficult? Is it a tooling issue, stability, or just overall fear of messing with things? You *should* be able to do a rolling restart without it being an issue. I look at this as a fundamental workflow that every C* operator should have available, and you should be able to do them without there being any concern.

Jon
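A rough sketch of that rolling-restart workflow, assuming password-less SSH to each node, a systemd-managed Cassandra service, and a local nodetool that talks to a node which stays up throughout; the addresses are placeholders and the health check is deliberately simplistic:

    #!/usr/bin/env python3
    """Sketch of a rolling restart, one node at a time. Assumptions (placeholders, not from
    the thread): password-less SSH to each node, Cassandra managed by systemd as 'cassandra',
    and a local nodetool pointed at a node that is not being restarted."""
    import subprocess
    import time

    HOSTS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]   # placeholder node addresses

    def ssh(host: str, command: str) -> None:
        subprocess.run(["ssh", host, command], check=True)

    def wait_until_normal(ip: str, timeout_s: int = 600) -> None:
        # A restarted node shows up as "UN" (Up/Normal) in nodetool status once it has rejoined.
        deadline = time.time() + timeout_s
        while time.time() < deadline:
            status = subprocess.run(["nodetool", "status"],
                                    capture_output=True, text=True).stdout
            if any(line.startswith("UN") and ip in line for line in status.splitlines()):
                return
            time.sleep(10)
        raise TimeoutError(f"{ip} did not return to UN within {timeout_s}s")

    if __name__ == "__main__":
        for ip in HOSTS:
            ssh(ip, "nodetool drain")                    # flush memtables and stop accepting traffic
            ssh(ip, "sudo systemctl restart cassandra")  # assumes systemd-managed service
            wait_until_normal(ip)                        # only move on once the node is UN again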
On 2024/12/17 16:01:06 Paul Chandler wrote:
> All,
>
> We are getting a lot of push back on the three-stage process of going through the three compatibility modes to upgrade to Cassandra 5. This basically means three rolling restarts of a cluster, which will be difficult for some of our large multi-DC clusters.
>
> Having researched this, it looks like, if you are not going to create large TTLs, it would be possible to go straight from C*4 to C*5 with SCM NONE. This seems to be the same as it would have been going from 4.0 -> 4.1.
>
> Is there any reason why this should not be done? Has anyone had experience of upgrading in this way?
>
> Thanks
>
> Paul Chandler
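For anyone weighing that approach, a small sketch that reports which mode a node's cassandra.yaml declares; it assumes PyYAML is installed and that the 5.0 setting is named storage_compatibility_mode with values CASSANDRA_4, UPGRADING, and NONE (the config path and the unset default are assumptions to verify against the 5.0 docs):

    """Sketch: report the storage compatibility mode declared in a node's cassandra.yaml.
    Assumptions: PyYAML is installed, the 5.0 setting is named storage_compatibility_mode
    (values CASSANDRA_4, UPGRADING, NONE), and CASSANDRA_4 is the default when unset."""
    import yaml

    CONF = "/etc/cassandra/cassandra.yaml"   # placeholder path; varies by packaging

    with open(CONF) as f:
        conf = yaml.safe_load(f) or {}
    print(f"{CONF}: storage_compatibility_mode = "
          f"{conf.get('storage_compatibility_mode', 'CASSANDRA_4 (assumed default)')}")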