Hi Jean,

Thanks for your investigation. It looks like the problem is the same, and it can probably be worked around with configuration-managed store-and-forward queues.
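A minimal sketch of how such entries could be generated for a configuration repository, assuming the store-and-forward queue follows the `$.artemis.internal.sf.<cluster-name>.<node-uuid>` naming seen in the AMQ224077 log message. The helper names and the `multicast` routing type are my assumptions, and I have not verified that the broker accepts a pre-declared internal queue this way, so treat it as an illustration only:

```python
from xml.etree import ElementTree as ET

# Prefix taken from the log message in this thread.
SF_PREFIX = "$.artemis.internal.sf"


def sf_queue_name(cluster_name: str, node_uuid: str) -> str:
    """Build the store-and-forward queue name for one remote cluster node."""
    return f"{SF_PREFIX}.{cluster_name}.{node_uuid}"


def sf_address_xml(cluster_name: str, node_uuid: str,
                   routing_type: str = "multicast") -> str:
    """Render an <address> fragment for the <addresses> section of broker.xml.

    routing_type is an assumption: check it against the routing type of the
    queue the broker creates itself before using this.
    """
    name = sf_queue_name(cluster_name, node_uuid)
    address = ET.Element("address", {"name": name})
    routing = ET.SubElement(address, routing_type)
    ET.SubElement(routing, "queue", {"name": name})
    ET.indent(address)  # pretty-print; requires Python 3.9+
    return ET.tostring(address, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical cluster name and node UUID, for illustration only.
    print(sf_address_xml("my-cluster", "00000000-0000-0000-0000-000000000000"))
```

The node UUID of the remote broker has to be known in advance, so the fragment would need to be regenerated whenever a node is rebuilt with a new identity.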
By the way, we were also running one big central Artemis cluster for all applications. Three years ago an IBM expert told us that one big central message broker is an anti-pattern, and I did not believe it. Now I am beginning to understand. We had some problems with the 6-node Artemis cluster topology and decided to downsize it and split it into several smaller clusters. Supporting one central cluster looks easier, but it isn't: when something happens, it affects all clients. Because Artemis is not that stable, we need to limit the impact of each failure to a smaller area.

Several months ago it was a 6-node cluster. After we removed nodes 5-6, performance did not suffer. The notification messages and the redistribution of messages between nodes generate excessive traffic, and this overhead has decreased from 90-110% to just 55-70%. We also had many other problems which can be eliminated by downsizing the cluster to 1 primary / 1 backup. We plan to move forward and also remove nodes 3-4.

We also created several dedicated clusters for some high-load topics and mission-critical applications. All clusters are configuration-managed. Each cluster has its own git repository with an Ansible inventory and settings, which makes it easier to support. Currently I'm writing a script which generates a repository for a new cluster from a template; the servers are created using Terraform, and everything is deployed to the servers by a pipeline.

> Hi Alexander,
>
> I am currently investigating the exact same issue.
> If you are interested, I have created an Artemis issue about it where you
> can find my analysis of the problem:
>
> https://issues.apache.org/jira/browse/ARTEMIS-5086
>
> I'm also curious to know if it is possible to pre-create cluster sf queues
> as a workaround for this issue, it could be a good idea.
>
>
> Regards
>
> Jean-Pascal
>
>
> On Thu, Nov 21, 2024 at 10:41 AM Alexander Milovidov <milovid...@gmail.com>
> wrote:
>
> > Hi All!
> >
> > We have Artemis cluster with two primary / backups, and it worked normally
> > before. Suddenly, the cluster queue was undeployed on one of the cluster
> > nodes during reload of the broker configuration. There was a log message
> > with event id AMQ224077 Undeploying queue
> > $.artemis.internal.sf.cluster-name.cluster-node-uuid.
> >
> > After this queue was undeployed, the messages which were routed to other
> > cluster node were unrouted and discarded.
> >
> > There are no address settings like autoDeleteQueues,
> > autoDeleteCreatedQueues, configDeleteQueues etc. I wonder how could this
> > happen.
> > The cluster queue was recreated after restart of the cluster connector.
> >
> > I don't know the root cause of the problem and we would like to prevent
> > this situation in the future because it leads to message loss. Is it ok to
> > make cluster addresses and queues to be configuration-managed on both
> > cluster nodes?
> >
> > ActiveMQ Artemis version is 2.37.0.
> >
> > --
> > Regards,
> > Alexander
> >

--
Regards,
Alexander Milovidov