I think there are a handful of different ways to address this... In general, it seems like your consumption isn't keeping up with your production; otherwise you wouldn't have such a large build-up of messages on one of the brokers. It's a good idea to balance message production with adequate consumption so the number of messages on the broker stays as low as possible. Obviously this can't always be done, which is why solutions like paging exist, but the ideal situation is low message accumulation in the broker. With that in mind, I recommend you explore flow control for your producers. If there isn't such a large build-up of messages, then redistribution will be a much smaller problem (if a problem at all).
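For example, one broker-side way to apply back-pressure to producers is an address-setting that blocks (rather than pages) once an address hits a size limit. This is just a sketch, not a drop-in config; the "#" match and the size limit are placeholders you'd tune for your own addresses:

  <address-settings>
     <!-- "#" matches every address; narrow the match for real use -->
     <address-setting match="#">
        <!-- once roughly 100 MiB of messages accumulate on an address... -->
        <max-size-bytes>104857600</max-size-bytes>
        <!-- ...block producers instead of paging to disk -->
        <address-full-policy>BLOCK</address-full-policy>
     </address-setting>
  </address-settings>

Client-side, the core/JMS client also supports window-based producer flow control (e.g. the producerWindowSize setting on the connection factory or URL), which limits how much data a producer can send before it receives more credits from the broker.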
Another option would be to have master/slave pairs rather than individual cluster nodes. That way, when a node fails, its consumers fail over to the corresponding slave and stay relatively balanced across the two nodes rather than all piling up on a single node. You could even go as far as setting message-load-balancing to STRICT to avoid redistribution altogether. You could also increase the size of your cluster so that redistribution spreads across two nodes rather than just one; theoretically that cuts the relative burden on each remaining node in half with 3 nodes vs. 2. (I've appended a rough broker.xml sketch of these settings below your quoted message.)

Justin

On Tue, Jun 18, 2019 at 4:22 PM Dan Langford <danlangf...@gmail.com> wrote:

> We are using Artemis 2.8.1 and we have 2 nodes in a cluster (JGroups, TCP ping, load balancing = On Demand). We build each queue and address on each node and put address settings and security settings on each node (via the Jolokia HTTP API). The two nodes are behind a single VIP, so each incoming connection doesn't know which node it will be assigned to.
>
> A producer can connect to NodeA and send a fair number of messages, maybe 24 million. If NodeA goes down for whatever reason (memory or disk problems, or scheduled OS patching), the consumers on NodeA will be disconnected. As they try to reconnect, the VIP will direct them all to the other available node, NodeB. When NodeA comes back online it notices all the consumers over on NodeB and redistributes all the messages in their queues.
>
> That can cause NodeA to take a long time and a lot of memory to start. It also causes the cluster/redistribution queue to become very deep, and it can take many hours for the messages to all get redistributed over to NodeB. If NodeB has any problems as a result of the onslaught of messages and becomes unavailable or goes down, then all the consumers will be disconnected, they will reconnect to NodeA, and the problem starts all over.
>
> What advice would you have for us? Is there a better cluster/HA design we could go with that would allow messages to redistribute across a cluster but also not bottleneck the cluster/redistribution queue on startup? We considered at one time using backups that would become live and serve those messages immediately, but ran into a lot of problems with the once-stopped nodes failing to come up in a clean state. I can expound on that more if that's the direction I should be exploring.
>
> Any insight you have is very much appreciated.
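To illustrate the master/slave and STRICT suggestions above, here's a rough broker.xml sketch. The cluster name, connector, and discovery group names are placeholders and the HA policy is trimmed down, so treat it as a starting point rather than a tested configuration:

  <!-- on the live/master node -->
  <ha-policy>
     <replication>
        <master/>
     </replication>
  </ha-policy>

  <!-- on the backup node you'd use <slave/> instead of <master/> -->

  <cluster-connections>
     <cluster-connection name="my-cluster">
        <connector-ref>netty-connector</connector-ref>
        <!-- STRICT round-robins messages across the cluster when they are
             sent and does not redistribute them afterwards -->
        <message-load-balancing>STRICT</message-load-balancing>
        <max-hops>1</max-hops>
        <discovery-group-ref discovery-group-name="my-discovery-group"/>
     </cluster-connection>
  </cluster-connections>

With shared-store HA you'd use <shared-store> instead of <replication>; either way, the point is that a failed node's consumers land on its own backup rather than piling onto the other live node.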