HA is about preserving the journals between failures. While you are reading and sending messages, you may still hit a failure in the middle of a read. I would need to understand what your consumer and producer do when a failure occurs.
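To make the producer side concrete: when a send fails during failover, the producer usually cannot tell whether the broker stored the message before the connection dropped. Below is a minimal sketch of retrying with a stable duplicate ID so that a resend is not stored twice. Note that `FakeBroker`, `sendWithRetry` and the injected failures are hypothetical stand-ins written for this illustration, not Artemis classes. With Artemis itself, I believe the ID would go on the JMS message as the `_AMQ_DUPL_ID` string property, but please check the duplicate detection documentation for your version.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

public class RetrySendSketch {

    /** Hypothetical in-memory stand-in for a broker with duplicate detection. */
    static class FakeBroker {
        private final Set<String> seenIds = new HashSet<>();
        private final List<String> queue = new ArrayList<>();
        private int failuresToInject;

        FakeBroker(int failuresToInject) {
            this.failuresToInject = failuresToInject;
        }

        // Stores the message unless its ID was already seen, then may "lose"
        // the acknowledgement. That is the case where the producer cannot
        // know whether the send actually went through.
        void send(String duplicateId, String body) {
            if (seenIds.add(duplicateId)) {
                queue.add(body);
            }
            if (failuresToInject > 0) {
                failuresToInject--;
                throw new IllegalStateException("connection lost before ack");
            }
        }

        List<String> messages() {
            return queue;
        }
    }

    // Retries the send with the SAME duplicate ID, so a message that was
    // already stored is ignored by the broker instead of being stored again.
    // Returns the number of attempts that were needed.
    static int sendWithRetry(FakeBroker broker, String duplicateId,
                             String body, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                broker.send(duplicateId, body);
                return attempt;
            } catch (IllegalStateException e) {
                // Outcome unknown: loop and resend under the same ID.
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        FakeBroker broker = new FakeBroker(2); // first two acks are lost
        for (int i = 0; i < 9; i++) {
            // One ID per logical message, generated once and reused on retry.
            String id = UUID.randomUUID().toString();
            sendWithRetry(broker, id, "message-" + i, 5);
        }
        System.out.println("stored messages: " + broker.messages().size()); // prints 9
    }
}
```

The point of the design is that the ID is generated once per logical message, outside the retry loop. If a new ID were generated on each attempt, the broker could not recognise the resend and you would trade message loss for duplicates.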
Retries on send and duplicate detection are key for your case. You could also play with XA and a transaction manager.

On Tue, Jul 17, 2018 at 5:01 PM Neha Sareen <neha.sar...@oracle.com> wrote:

> Hi,
>
> We are setting up a cluster of 6 brokers using Artemis 2.6.2.
>
> The cluster has 3 groups.
> - Each group has one master, and one slave broker pair.
> - The HA uses replication.
> - Each master broker configuration has the flag 'check-for-live-server' set to true.
> - Each slave broker configuration has the flag 'allow-failback' set to true.
> - We use static connectors for allowing cluster topology discovery.
> - Each broker's static connector list includes the connectors to the other 5 servers in the cluster.
> - Each broker declares its acceptor.
> - Each broker exports its own connector information via the 'connector-ref' configuration element.
> - The acceptor and the connector URLs for each broker are identical with respect to the host and port information
>
> We have a standalone test application that creates producers and
> consumers to write messages and receive messages respectively using a
> transacted JMS session.
>
> We are trying to execute an automatic failover test case followed by
> failback as follows:
>
> TestCase -1
> Step1: Master & Standby Alive
> Step2: Producer Send Message , say 9 messages
> Step3: Kill Master
> Step4: Producer Send Message , say another 9 messages
> Step5: Kill Standby
> Step6: Start Master
> Step7: Start Standby.
>
> What we see is that it sync with Master discarding its internal state ,
> and we are able to consume only 9 messages, leading to a loss of 9 messages
>
> Test Case - 2
> Step1: Master & Standby Alive
> Step2: Producer Send Message
> Step3: Kill Master
> Step4: Producer Send Message
> Step5: Kill Standby
> Step6: Start Standby ( it waits for Master )
> Step7: Start Master (Question does it wait for slave ??)
> Step8: Consume Message
>
> Can someone provide any insights here regarding the potential message loss?
> Also are there alternatives to a different topology we may use here to get
> around this issue?
>
> Thanks
>
> Neha

--
Clebert Suconic