Re: Potential message loss seen with HA topology in Artemis 2.6.2 on failback

Clebert Suconic Tue, 17 Jul 2018 14:50:26 -0700

HA is about preserving the journals between failures.

When you read and send messages you may still have a failure during the
reading.  I would need to understand what your consumer and producer do
in case of a failure.


Retries on send and duplicate detection are key for your case.
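
To illustrate why the two go together, here is a minimal Java sketch. It
does not use the Artemis client at all: FakeBroker, sendWithRetry and the
failure counter are hypothetical stand-ins that only mimic a broker which
remembers duplicate IDs and a connection that drops before the
acknowledgement arrives. With a real Artemis JMS client, the duplicate ID
would instead be set as the string property _AMQ_DUPL_ID on the message,
and the broker would keep the ID cache.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

public class DuplicateSafeRetry {

    /** Hypothetical in-memory stand-in for a broker with a duplicate-ID cache. */
    static class FakeBroker {
        private final Set<String> seenIds = new HashSet<>();
        private final List<String> queue = new ArrayList<>();
        private int failuresLeft;

        FakeBroker(int failuresLeft) {
            this.failuresLeft = failuresLeft;
        }

        /** Stores the message unless its ID was seen before, then may fail before "acking". */
        void send(String duplicateId, String body) {
            if (seenIds.add(duplicateId)) {
                queue.add(body); // first time this ID is seen: keep the message
            }
            if (failuresLeft > 0) {
                failuresLeft--;
                // Simulates a failover: the message was stored but the sender never learns it.
                throw new IllegalStateException("connection lost before ack");
            }
        }

        List<String> messages() {
            return queue;
        }
    }

    /** Retries the send with the SAME duplicate ID, so a resend cannot create a second copy. */
    static int sendWithRetry(FakeBroker broker, String body, int maxAttempts) {
        String duplicateId = UUID.randomUUID().toString(); // chosen once per logical message
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                broker.send(duplicateId, body);
                return attempt; // acknowledged
            } catch (IllegalStateException e) {
                // Outcome unknown: the broker may or may not hold the message, so resend.
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        FakeBroker broker = new FakeBroker(2); // first two sends lose their ack
        int attempts = sendWithRetry(broker, "order-1", 5);
        System.out.println("attempts=" + attempts + " stored=" + broker.messages());
        // prints: attempts=3 stored=[order-1]
    }
}
```

The point of the sketch is that the ID is generated once per logical
message, not once per attempt. Without the retry the message can be lost,
and without the stable ID the retry can duplicate it.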

You could also play with XA and a transaction manager.

On Tue, Jul 17, 2018 at 5:01 PM Neha Sareen <neha.sar...@oracle.com> wrote:

> Hi,
>
>
>
> We are setting up a cluster of 6 brokers using Artemis 2.6.2.
>
>
>
> The cluster has 3 groups.
>
> - Each group has one master, and one slave broker pair.
>
> - The HA uses replication.
>
> - Each master broker configuration has the flag 'check-for-live-server'
> set to true.
>
> - Each slave broker configuration has the flag 'allow-failback' set to
> true.
>
> - We use static connectors for allowing cluster topology discovery.
>
> - Each broker's static connector list includes the connectors to the other
> 5 servers in the cluster.
>
> - Each broker declares its acceptor.
>
> - Each broker exports its own connector information via the
> 'connector-ref' configuration element.
>
> - The acceptor and the connector URLs for each broker are identical with
> respect to the host and port information
>
>
>
> We have a standalone test application that creates producers and
> consumers to write and receive messages, respectively, using a
> transacted JMS session.
>
>
>
> We are trying to execute an automatic failover test case followed by
> failback as follows:
>
> Test Case - 1
>
> Step1: Master & Standby Alive
>
> Step2: Producer Send Message, say 9 messages
>
> Step3: Kill Master
>
> Step4: Producer Send Message, say another 9 messages
>
> Step5: Kill Standby
>
> Step6: Start Master
>
> Step7: Start Standby.
>
> What we see is that the standby syncs with the master, discarding its
> internal state, and we are able to consume only 9 messages, leading to a
> loss of 9 messages.
>
>
>
>
>
> Test Case - 2
>
> Step1: Master & Standby Alive
>
> Step2: Producer Send Message
>
> Step3: Kill Master
>
> Step4: Producer Send Message
>
> Step5: Kill Standby
>
> Step6: Start Standby ( it waits for Master )
>
> Step7: Start Master (Question: does it wait for the slave?)
>
> Step8: Consume Message
>
>
>
> Can someone provide any insights here regarding the potential message loss?
>
> Also are there alternatives to a different topology we may use here to get
> around this issue?
>
>
>
> Thanks
>
> Neha
>
>
>
>
> --
Clebert Suconic
