I think we're talking about different things.

> Yes, and Paul clarified that it wasn't (just) an issue of having to do
> rolling restarts, but the work involved in doing an upgrade. Were it only
> the case that the hardest part of doing an upgrade was the rolling
> restart...
From several messages ago:

> This basically means 3 rolling restarts of a cluster, which will be
> difficult for some of our large multi DC clusters.

The discussion was specifically about rolling restarts and how storage
compatibility mode requires them, which in this environment was described
as difficult. The difficulty of the rest of the process is irrelevant here,
because it's the same regardless of how you approach storage compatibility
mode.

My point is that rolling restarts should not be difficult if you have the
right automation, which you seem to agree with.

Want to discuss the difficulty of upgrading in general? I'm all for
improving it. It's just not what this thread is about.

Jon

On Wed, Dec 18, 2024 at 10:01 AM Eric Evans <john.eric.ev...@gmail.com>
wrote:

> On Wed, Dec 18, 2024 at 11:43 AM Jon Haddad <j...@rustyrazorblade.com>
> wrote:
>
>> > We (Wikimedia) have had more (major) upgrades go wrong in some way,
>> > than right. Any significant upgrade is going to be weeks —if not
>> > months— in the making, with careful testing, a phased rollout, and a
>> > workable plan for rollback. We'd never entertain doing more than one
>> > at a time, it's just way too many moving parts.
>>
>> The question wasn't about why upgrades are hard, it was about why a
>> rolling restart of the cluster is hard. They're different things.
>
> Yes, and Paul clarified that it wasn't (just) an issue of having to do
> rolling restarts, but the work involved in doing an upgrade. Were it only
> the case that the hardest part of doing an upgrade was the rolling
> restart...
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
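
For concreteness, the "right automation" referred to above is essentially a
drain-restart-verify loop over each node. The following is a minimal sketch,
not anyone's production tooling: it assumes passwordless SSH to each node,
Cassandra managed by a systemd unit named "cassandra", nodetool on each
node's PATH, and placeholder hostnames. Real tooling would add timeouts,
logging, and canary checks.

    #!/usr/bin/env python3
    """Minimal sketch of rolling-restart automation for a Cassandra cluster.

    Assumptions (illustrative only): passwordless SSH to each node, a
    systemd unit named "cassandra", nodetool on the remote PATH, and
    placeholder hostnames.
    """
    import subprocess
    import time

    # Hypothetical node list; replace with your cluster's hosts.
    HOSTS = [
        "cassandra-1.example.com",
        "cassandra-2.example.com",
        "cassandra-3.example.com",
    ]

    def ssh(host, command):
        """Run a command on a remote node and return its stdout."""
        return subprocess.run(
            ["ssh", host, command],
            check=True, capture_output=True, text=True,
        ).stdout

    def cluster_healthy(host):
        """True when every node in `nodetool status` is Up/Normal (UN)."""
        try:
            out = ssh(host, "nodetool status")
        except subprocess.CalledProcessError:
            return False  # Cassandra isn't answering yet; keep waiting.
        states = [
            line.split()[0]
            for line in out.splitlines()
            if line[:2] in ("UN", "UL", "UJ", "UM", "DN", "DL", "DJ", "DM")
        ]
        return bool(states) and all(s == "UN" for s in states)

    for host in HOSTS:
        # Flush memtables and stop accepting traffic before the restart.
        ssh(host, "nodetool drain")
        ssh(host, "sudo systemctl restart cassandra")
        # Don't touch the next node until the whole ring is Up/Normal again.
        while not cluster_healthy(host):
            time.sleep(10)
        print(f"{host} restarted, cluster healthy")

Once a loop like this exists, in whatever shape your orchestration takes
(Ansible, in-house scripts, etc.), three rolling restarts cost little more
than one.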