Hi Jean,

Thanks for your investigation. It looks like the problem is the same, and
it can probably be worked around by making the store-and-forward queues
configuration-managed.
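
To make the workaround concrete, here is a minimal sketch of how the
address/queue entry for a store-and-forward queue could be generated. The
queue name pattern is taken from the AMQ224077 log message quoted below;
the helper names, the sample cluster name and node UUID, and the multicast
routing type are assumptions for illustration. Whether the broker accepts
such an entry for an internal address is exactly what still has to be
verified.

```python
import xml.etree.ElementTree as ET

# Prefix as it appears in the AMQ224077 log message.
SF_PREFIX = "$.artemis.internal.sf"


def sf_queue_name(cluster_name: str, node_uuid: str) -> str:
    """Build the store-and-forward queue name for one remote cluster node."""
    return f"{SF_PREFIX}.{cluster_name}.{node_uuid}"


def sf_address_element(cluster_name: str, node_uuid: str,
                       routing_type: str = "multicast") -> ET.Element:
    """Return an <address> element meant for the <addresses> section of broker.xml.

    Assumption: the address and the queue share the same name, and the
    routing type is multicast. Adjust if your broker shows otherwise.
    """
    name = sf_queue_name(cluster_name, node_uuid)
    address = ET.Element("address", {"name": name})
    routing = ET.SubElement(address, routing_type)
    ET.SubElement(routing, "queue", {"name": name})
    return address


def render_fragment(cluster_name: str, node_uuid: str) -> str:
    """Serialize the <address> element to an XML string."""
    return ET.tostring(sf_address_element(cluster_name, node_uuid),
                       encoding="unicode")


if __name__ == "__main__":
    # Hypothetical values; use the real cluster connection name and node UUID.
    print(render_fragment("my-cluster", "1234-abcd"))
```

One fragment would be needed per remote node, on each node of the cluster.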

By the way, we were also running one big central Artemis cluster for all
applications. I did not believe it when an IBM expert told us three years
ago that one big central message broker is an anti-pattern. Now I am
beginning to understand. We had some problems with the 6-node Artemis
cluster topology and decided to downsize it and split it into several
smaller clusters. It looks like one central cluster is easier to support,
but it isn't: when something happens, it affects all clients. Because
Artemis is not that stable, we need to confine the impact of a failure to
a smaller area.

Several months ago it was a 6-node cluster. After removing nodes 5-6,
performance did not suffer. The notification messages and the
redistribution of messages between nodes generate excessive traffic, and
this overhead has decreased from 90-110% to just 55-70%. We also had many
other problems which can be eliminated by downsizing the cluster to
1 primary / 1 backup, so we plan to move forward and remove nodes 3-4 as
well. We also created several dedicated clusters for some high-load topics
and mission-critical applications. All clusters are configuration-managed.
Each cluster has its own git repository with an Ansible inventory and
settings, which makes it easier to support. Currently I'm writing a script
which generates a repository for a new cluster from a template; the servers
are created using Terraform, and everything is deployed on the servers by a
pipeline.
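
The repository generation step can be sketched roughly like this. It is
only an illustration, not the actual script: the function name, the
`${...}` placeholder syntax, and the variable names in the usage note are
assumptions.

```python
from pathlib import Path
from string import Template


def render_cluster_repo(template_dir: Path, target_dir: Path,
                        variables: dict[str, str]) -> list[Path]:
    """Copy a template tree into target_dir, filling ${name} placeholders.

    safe_substitute() leaves unknown placeholders untouched, so strings such
    as "$.artemis.internal.sf" in the templates are not treated as errors.
    Files are assumed to be UTF-8 text.
    """
    written = []
    for src in sorted(template_dir.rglob("*")):
        if not src.is_file():
            continue
        dst = target_dir / src.relative_to(template_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        text = Template(src.read_text(encoding="utf-8")).safe_substitute(variables)
        dst.write_text(text, encoding="utf-8")
        written.append(dst)
    return written
```

For example, `render_cluster_repo(Path("template"), Path("cluster-orders"),
{"cluster_name": "orders"})` would produce a new repository directory with
the cluster name filled into the inventory and settings files.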


> Hi Alexander,
>
> I am currently investigating the exact same issue.
> If you are interested, I have created an Artemis issue about it where you
> can find my analysis of the problem:
>
> https://issues.apache.org/jira/browse/ARTEMIS-5086
>
> I'm also curious to know whether it is possible to pre-create the cluster
> sf queues as a workaround for this issue; it could be a good idea.
>
>
> Regards
>
> Jean-Pascal
>
>
> On Thu, Nov 21, 2024 at 10:41 AM Alexander Milovidov <milovid...@gmail.com>
> wrote:
>
> > Hi All!
> >
> > We have an Artemis cluster with two primary / backup pairs, and it
> > worked normally before. Suddenly, the cluster queue was undeployed on one
> > of the cluster nodes during a reload of the broker configuration. There
> > was a log message with event id AMQ224077 Undeploying queue
> > $.artemis.internal.sf.cluster-name.cluster-node-uuid.
> >
> > After this queue was undeployed, the messages that should have been
> > routed to the other cluster node were not routed and were discarded.
> >
> > There are no address settings like autoDeleteQueues,
> > autoDeleteCreatedQueues, configDeleteQueues etc. I wonder how this could
> > happen.
> > The cluster queue was recreated after a restart of the cluster connector.
> >
> > I don't know the root cause of the problem, and we would like to prevent
> > this situation in the future because it leads to message loss. Is it OK
> > to make the cluster addresses and queues configuration-managed on both
> > cluster nodes?
> >
> > ActiveMQ Artemis version is 2.37.0.
> >
> > --
> > Regards,
> > Alexander
> >
>
-- 
Regards,
Alexander Milovidov
