Generally speaking, I'd say the following is recommended:

1. Kill the primary (causing all clients to fail over to the backup).
2. Upgrade the primary.
3. Restart the primary. If fail-back has been configured then all the clients will fail over to the primary at this point.
4. Kill the backup. If fail-back has not been configured then all the clients will fail over to the primary at this point.
5. Upgrade the backup.
6. Restart the backup.
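As a rough illustration of the sequence above, here is a minimal Python sketch that tracks which broker is serving clients after each step, and whether both brokers are running. It is a toy state model, not Artemis tooling: the names `Step` and `rolling_upgrade` are hypothetical, and the `failback` flag simply stands in for whether fail-back has been configured.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    action: str      # what the operator does in this step
    live: str        # broker serving clients after the action
    redundant: bool  # True when both brokers are running


def rolling_upgrade(failback: bool) -> List[Step]:
    """Toy model of the six steps; `failback` is a hypothetical flag."""
    up = {"primary": True, "backup": True}
    live = "primary"
    steps: List[Step] = []

    def record(action: str) -> None:
        # Snapshot the state as it is right after the action.
        steps.append(Step(action, live, up["primary"] and up["backup"]))

    # Steps 1-2: primary goes down, clients move to the backup.
    up["primary"] = False
    live = "backup"
    record("kill primary")
    record("upgrade primary")

    # Step 3: primary returns; clients move back only with fail-back.
    up["primary"] = True
    if failback:
        live = "primary"
    record("restart primary")

    # Steps 4-5: backup goes down; the primary is the only broker left.
    up["backup"] = False
    live = "primary"
    record("kill backup")
    record("upgrade backup")

    # Step 6: backup returns as the standby.
    up["backup"] = True
    record("restart backup")
    return steps


if __name__ == "__main__":
    for failback in (True, False):
        print(f"fail-back configured: {failback}")
        for s in rolling_upgrade(failback):
            print(f"  {s.action:<16} live={s.live:<8} redundant={s.redundant}")
```

In this model both settings end with the primary live and both brokers running. The rows with `redundant=False` are the steps during which only one broker is up.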
Of course, there's a window here where you have a single point of failure, so if that's unacceptable then another strategy would be necessary. Regardless, it's worth noting that AMQ219010 errors are not necessarily unexpected even in an environment with HA.

Justin

On Wed, Mar 12, 2025 at 1:25 PM John Lilley <john.lil...@redpointglobal.com> wrote:

> We run Artemis in HA configuration under Kubernetes, with a primary and
> backup instance that have shared storage. Recently, we updated our pod
> definitions from version 2.30 to 2.39. And... something went wrong. I
> think what happened is one of the two pods upgraded first, and our clients
> started getting AMQ219010: Connection is destroyed errors. Restarting both
> Artemis pods made everyone happy again.
>
> It's too late to gather good post-mortem evidence, but this leads me to
> ask, what is the recommended update procedure? Clearly, we can stop our
> own services while Artemis is completely updated. But is there a
> recommended interruption-free way to do it?
>
> Thanks
> john