Re: Message Loss after Failover

Justin Bertram Fri, 11 Jul 2025 12:55:01 -0700

Out of curiosity, does the behavior change if you just have a single broker
(i.e. eliminate clustering)? Typically it's best to validate behavior on
the simplest configuration and then add complexity in stages.



Justin

On Thu, Jul 10, 2025 at 10:05 AM <maximilian.rie...@systema.com> wrote:

> Hello Community,
> In a failover configuration, we observed an issue where, after a temporary
> unavailability of all brokers (e.g., due to a short network interruption),
> the consumer failed to properly resume message consumption. Although the
> brokers became fully available again, the consumer sometimes either
> received only every second message for a period of time or, in some cases,
> stopped receiving messages altogether.
> *Setup*
>
> broker/client version: 2.41
>
> Java 21
>
> Two-broker cluster with no persistence (broker.xml files are provided in
> the git repo), configured for high availability.
>
>
> *Performed Test*
>
>    - Setup Artemis brokers as cluster with no persistence on hostA and
>    hostB
>    - Run SimpleListenerCount.main() on hostC and
>    SimplePublisherCount.main() on hostD.
>    - The publisher will publish incrementing integers on the topic
>    count.topic and the listener will compare each received counter with
>    the last one received. It prints a message if any were lost (a number
>    is missing).
>    - cut the connection from hostC (listener) to both brokers (hostA and
>    hostB) while the publisher (hostD) still publishes messages
>    - resume the network connection
>    - check the missing messages
>
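For readers following the thread, the listener-side check described in the list above can be sketched as below. This is only an illustration under assumptions: the class name `CounterGapTracker` and its methods are hypothetical and are not taken from the linked repository, and the sketch assumes the publisher sends non-negative counters that increase by one per message.

```java
/** Hypothetical helper for illustration; not the code from the linked repository. */
final class CounterGapTracker {
    private long last = -1;        // -1 means no message has been received yet
    private long totalMissing = 0; // running total of skipped counters

    /** Records one received counter and returns how many values were skipped before it. */
    long onMessage(long counter) {
        long missing = 0;
        if (last >= 0 && counter > last + 1) {
            missing = counter - last - 1;
        }
        // A counter at or below the previous one (duplicate or publisher restart)
        // is treated as "no gap" and simply becomes the new reference point.
        totalMissing += missing;
        last = counter;
        return missing;
    }

    long totalMissing() {
        return totalMissing;
    }
}
```

In a JMS `MessageListener`, `onMessage` here would be called with the integer parsed from each message body. A steady return value of 1 would correspond to the "only every second message" symptom, while no calls at all would correspond to the stalled consumer.
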
> *recreate information*
>
>    - build two jars (jar1 with main=SimpleListener.main() and jar2 with
>    main=SimplePublisher.main())
>    - setup of brokers and clients as described
>    - we simulated the connection loss by disabling WLAN on hostC
>    - Set a broker URL either in your IDE in SimpleListenerCount.brokerUrl
>    and SimplePublisherCount.brokerUrl or as the first command line argument.
>       - In this case I used:
>       (tcp://hostA:6666,tcp://hostB:6666)?failoverAttempts=-1
>
> *expected result*
>
>    - as we do not have persistence enabled, we expect message loss while
>    the network is not reachable
>    - after the brokers are reachable again, we expect the client to
>    reconnect to the brokers and receive messages without message loss
>
> *actual result*
>
>    - message loss while not reachable as expected
>    - reconnected as expected
>    - in some cases the listener received only every second message for
>    some time, then no further message loss
>    - in some cases no messages were received anymore
>
> *question*
>
>    - is there an error in my configuration / code?
>    - did we expect the wrong results?
>
> I hope I described our issue understandably.
>
> I also put together a git repo with code example and configuration files:
>
> https://github.com/MaximilianRieder/ArtemisFailoverMessageLoss/tree/main
>
> Kind regards
>
> Maximilian
> ------------------------------
>
> *Maximilian Rieder*
> Software Engineer
>
> Phone: +49 941 / 7 83 92 84
> maximilian.rie...@systema.com
>
> www.systema.com
>
>
> SYSTEMA
> Systementwicklung Dipl.-Inf. Manfred Austen GmbH
>
> Manfred-von-Ardenne-Ring 6 | 01099 Dresden
> HRB 11256 Amtsgericht Dresden | USt.-ID DE 159 607 786
> Geschäftsführer: Manfred Austen, CEO und Dr. Ulf Martin, COO
>
> Please check whether a printout of this e-mail is really necessary.
>
