Out of curiosity, does the behavior change if you just have a single broker (i.e. eliminate clustering)? Typically it's best to validate behavior on the simplest configuration and then add complexity in stages.

Justin

On Thu, Jul 10, 2025 at 10:05 AM <maximilian.rie...@systema.com> wrote:

> Hello Community,
>
> In a failover configuration, we observed an issue where, after a temporary
> unavailability of all brokers (e.g., due to a short network interruption),
> the consumer failed to properly resume message consumption. Although the
> brokers became fully available again, the consumer sometimes either
> received only every second message for a period of time or, in some cases,
> stopped receiving messages altogether.
>
> *Setup*
>
> broker/client version: 2.41
>
> Java 21
>
> Two-broker cluster with no persistence (broker.xml files are
> provided in git repo) configured for high availability.
>
> *Performed Test*
>
> - Set up the Artemis brokers as a cluster with no persistence on hostA and
>   hostB
> - Run SimpleListenerCount.main() on hostC and
>   SimplePublisherCount.main() on hostD.
> - The publisher publishes incrementing integers on the topic
>   count.topic, and the listener compares each received counter with the
>   last one received. It prints a message if any messages were lost (a
>   number is missing)
> - cut the connection from hostC (listener) to both brokers (hostA and
>   hostB) while the publisher (hostD) still publishes messages
> - resume the network connection
> - check the missing messages
>
> *recreate information*
>
> - build two jars (jar1 with main=SimpleListener.main() and jar2 with
>   main=SimplePublisher.main())
> - set up the brokers and clients as described
> - we simulated the connection loss by disabling WLAN on hostC
> - Set a broker URL either in your IDE in SimpleListenerCount.brokerUrl
>   and SimplePublisherCount.brokerUrl or as the first command line argument.
> - In this case I used:
>   (tcp://hostA:6666,tcp://hostB:6666)?failoverAttempts=-1
>
> *expected result*
>
> - as we do not have persistence enabled, we expect message loss while the
>   network is not reachable
> - after the brokers are reachable again, we expect the clients to
>   reconnect to the brokers and receive messages without message loss
>
> *actual result*
>
> - message loss while not reachable, as expected
> - reconnected as expected
> - in some cases the listener received only every second message for
>   some time, then no more message loss
> - in some cases no messages were received anymore
>
> *question*
>
> - is there an error in my configuration / code?
> - did we expect the wrong results?
>
> I hope I described our issue understandably.
>
> I also put together a git repo with code example and configuration files:
>
> https://github.com/MaximilianRieder/ArtemisFailoverMessageLoss/tree/main
>
> Kind regards
>
> Maximilian
> ------------------------------
>
> *Maximilian Rieder*
> Software Engineer
> maximilian.rie...@systema.com
>
> SYSTEMA
> Systementwicklung Dipl.-Inf. Manfred Austen GmbH
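
To make results comparable between a single-broker run and the clustered run, here is a minimal sketch of the gap check the listener is described as doing. It is not the code from the linked repository: it is plain Java with no JMS dependency, and the names GapTracker, onCounter and looksLikeEverySecond are hypothetical. It records every gap in the counter sequence and flags the "every second message" pattern, assuming the publisher's counter only ever increases by one.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: tracks an incrementing counter stream and records gaps. */
public class GapTracker {

    /** One detected gap: counters first..last (inclusive) never arrived. */
    public static final class Gap {
        public final long first;
        public final long last;

        Gap(long first, long last) {
            this.first = first;
            this.last = last;
        }

        public long size() {
            return last - first + 1;
        }
    }

    private long lastSeen = -1; // -1 means nothing has been received yet
    private final List<Gap> gaps = new ArrayList<>();

    /** Feed one received counter; returns the gap it revealed, or null if none. */
    public Gap onCounter(long counter) {
        Gap gap = null;
        if (lastSeen >= 0 && counter > lastSeen + 1) {
            gap = new Gap(lastSeen + 1, counter - 1);
            gaps.add(gap);
        }
        // Duplicates or older counters never move the high-water mark back.
        lastSeen = Math.max(lastSeen, counter);
        return gap;
    }

    public List<Gap> gaps() {
        return List.copyOf(gaps);
    }

    /** True if the last n gaps are single messages spaced exactly 2 apart. */
    public boolean looksLikeEverySecond(int n) {
        if (n <= 0 || gaps.size() < n) {
            return false;
        }
        List<Gap> tail = gaps.subList(gaps.size() - n, gaps.size());
        for (int i = 0; i < tail.size(); i++) {
            if (tail.get(i).size() != 1) {
                return false;
            }
            if (i > 0 && tail.get(i).first - tail.get(i - 1).first != 2) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        GapTracker tracker = new GapTracker();
        // Simulated stream: an outage (4..9 lost), then alternating loss.
        long[] received = {1, 2, 3, 10, 12, 14, 16, 17, 18};
        for (long counter : received) {
            Gap gap = tracker.onCounter(counter);
            if (gap != null) {
                System.out.println("missing " + gap.first + ".." + gap.last);
            }
        }
        System.out.println("alternating loss: " + tracker.looksLikeEverySecond(3));
    }
}
```

In a real listener, onCounter would be called from the message callback with the integer parsed from each message. Logging the gaps with timestamps in both the single-broker and the two-broker run should show whether the alternating loss only appears once clustering is involved.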