On Wed, Dec 18, 2024 at 11:50 AM Paul Chandler wrote:
>
> This is all old history and has been fixed, so it is not really what the
> question was about; however, these old problems have a bad legacy in the
> memory of the people that matter. Hence the push back we have now.
>
I totally understand th
OK, it seems like I didn’t explain it too well, but yes, it is the rolling
restart 3 times as part of the upgrade that is causing the push back. My
message was a bit vague on the use cases because there are confidentiality
agreements in place, so I can’t share too much.
We have had problems in th
On Wed, Dec 18, 2024 at 12:26 PM Jeff Jirsa wrote:
> I think this is one of those cases where if someone tells us they’re
> feeling pain, instead of telling them it shouldn’t be painful, we try to
> learn a bit more about the pain.
>
> For example, both you and Scott expressed surprise at the con
On Wed, Dec 18, 2024 at 12:12 PM Jon Haddad wrote:
> I think we're talking about different things.
>
> > Yes, and Paul clarified that it wasn't (just) an issue of having to do
> rolling restarts, but the work involved in doing an upgrade. Were it only
> the case that the hardest part of doing a
I think this is one of those cases where if someone tells us they’re feeling
pain, instead of telling them it shouldn’t be painful, we try to learn a bit
more about the pain.
For example, both you and Scott expressed surprise at the concern of rolling
restarts (you repeatedly, Scott mentioned t
Yeah, the issue with the yaml being out of sync is consistent with any
other JMX change, such as compaction throughput / threads, etc. You'd have
to deploy the config and apply the change via JMX; otherwise you'd risk
restarting the node and running into an issue.
I think there's probably room for
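As a concrete illustration of that "change it live, then persist it" pattern, here is a minimal sketch for compaction throughput. The cassandra.yaml path is an assumption, and the exact key name changed between versions, so treat the sed line as illustrative:

    # Apply the new value on the running node (nodetool drives this over JMX).
    nodetool setcompactionthroughput 64

    # Persist the same value in cassandra.yaml so a later restart does not
    # silently revert it. Key shown is the newer-style name; older versions
    # use compaction_throughput_mb_per_sec. Assumes the key is present and
    # uncommented in the file.
    sudo sed -i 's|^compaction_throughput:.*|compaction_throughput: 64MiB/s|' /etc/cassandra/cassandra.yaml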
I think we're talking about different things.
> Yes, and Paul clarified that it wasn't (just) an issue of having to do
rolling restarts, but the work involved in doing an upgrade. Were it only
the case that the hardest part of doing an upgrade was the rolling
restart...
From several messages a
It's clear from discussion on this list that the current "storage_compatibility_mode" implementation and upgrade path for 5.0 is a source of real and legitimate user pain, and is
likely to result in many organizations slowing their adoption of the release. Would love to discuss on dev@ how we can
On Wed, Dec 18, 2024 at 11:43 AM Jon Haddad wrote:
> > We (Wikimedia) have had more (major) upgrades go wrong in some way, than
> right. Any significant upgrade is going to be weeks —if not months— in the
> making, with careful testing, a phased rollout, and a workable plan for
> rollback. We'd
> We (Wikimedia) have had more (major) upgrades go wrong in some way, than
right. Any significant upgrade is going to be weeks —if not months— in the
making, with careful testing, a phased rollout, and a workable plan for
rollback. We'd never entertain doing more than one at a time, it's just
way
On Tue, Dec 17, 2024 at 2:37 PM Paul Chandler wrote:
> It is a mixture of things really, firstly it is a legacy issue where there
> have been performance problems in the past during upgrades, these have now
> been fixed, but it is not easy to regain the trust in the process.
>
> Secondly there ar
The ability to move through the SCM via nodetool would definitely help in
this situation. I can see there being an issue if the cassandra.yaml is not
changed, as the node could revert back to an older mode if the node is
restarted.
Would there be any other potential problems with exposing
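For anyone following along, a rough sketch of why the revert risk exists: in 5.0 the mode is read from cassandra.yaml at startup, so whatever is on disk wins after a restart, and any live switch would need the same file edit to stick. The nodetool subcommand in the last comment is hypothetical, not an existing command, and the file path is an assumption:

    # Current state on a node (5.0 values are CASSANDRA_4, UPGRADING and NONE).
    grep '^storage_compatibility_mode' /etc/cassandra/cassandra.yaml

    # Persist the next mode so the node does not fall back on restart.
    sudo sed -i 's|^storage_compatibility_mode:.*|storage_compatibility_mode: UPGRADING|' /etc/cassandra/cassandra.yaml

    # Hypothetical live switch being discussed in this thread (does not exist today):
    # nodetool setstoragecompatibilitymode UPGRADING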
get a table repaired because it is locking up a node, is it
still possible to upgrade to 4.0?
Jeff
From: Jon Haddad
Date: Tuesday, December 17, 2024 at 2:20 PM
Subject: Re: Cassandra 5 Upgrade - Storage Compatibility Modes
I strongly suggest moving to 4.0 and to set up Reaper.
Man
Subject: Re: Cassandra 5 Upgrade - Storage Compatibility Modes
I strongly suggest moving to 4.0 and to set up Reaper. Managing repairs
yourself is a waste of time, and you're almost certainly not doing it
optimally.
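To give a sense of what "managing repairs yourself" tends to mean in practice, a rough sketch of the hand-rolled version; the host list and keyspace name are placeholders:

    # Primary-range repair run node by node, typically on a cron schedule that
    # finishes well inside gc_grace_seconds. Reaper automates the sequencing,
    # pacing, retries and progress tracking that this loop glosses over.
    for host in $(cat cassandra_nodes.txt); do
      ssh "$host" 'nodetool repair -pr my_keyspace'
    done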
Jon
On Tue, Dec 17, 2024 at 12:40 PM Miguel Santos-Lopez wrote:
> Secondly, there are some very large clusters involved, 1300+ nodes across
multiple physical datacenters. In this case any upgrades are only done out
of hours and only one datacenter per day, so a normal upgrade cycle will
take multiple weeks, and this one will take 3 times as long.
If you only r
Hi Jon,
It is a mixture of things really. Firstly, it is a legacy issue: there have
been performance problems in the past during upgrades. These have now been
fixed, but it is not easy to regain the trust in the process.
Secondly there are some very large clusters involved, 1300+ nodes acro
It's kind of a shame we don't have rolling restart functionality built into
the database / sidecar. I know we've discussed that in the past.
+1 to Jon's question - clients (e.g. the Java driver, etc.) should be able to handle
disconnects gracefully and route to other coordinators leaving the
applic
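In the absence of that built-in support, the workflow usually ends up as a script along these lines. Hostnames, SSH access and the systemd unit name are assumptions about the environment, so treat it as a sketch rather than a recipe:

    # One node at a time: drain, restart, then wait for the node to come back
    # before moving on. Drivers should route around the node while it is down.
    for host in $(cat cassandra_nodes.txt); do
      ssh "$host" 'nodetool drain && sudo systemctl restart cassandra'
      until ssh "$host" 'nodetool info 2>/dev/null' | grep -q 'Native Transport active.*true'; do
        sleep 10
      done
    done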
Just curious, why is a rolling restart difficult? Is it a tooling issue,
stability, or just overall fear of messing with things?
You *should* be able to do a rolling restart without it being an issue. I look
at this as a fundamental workflow that every C* operator should have available,
and you
All,
We are getting a lot of push back on the 3-stage process of going through the
three compatibility modes to upgrade to Cassandra 5. This basically means 3
rolling restarts of a cluster, which will be difficult for some of our large
multi-DC clusters.
Having researched this, it looks like,
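For readers who haven't looked at 5.0 yet, the three restarts come from walking a single cassandra.yaml setting through its three values, roughly as below (my summary of the documented path, so double-check it against the upgrade notes):

    # storage_compatibility_mode in cassandra.yaml drives the upgrade path:
    #
    # 1. Upgrade binaries to 5.0, rolling restart with:
    #      storage_compatibility_mode: CASSANDRA_4
    # 2. Once every node is on 5.0, switch and do a second rolling restart:
    #      storage_compatibility_mode: UPGRADING
    # 3. Finally enable the new on-disk behaviour with a third rolling restart:
    #      storage_compatibility_mode: NONE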