Hello,

We have a problem keeping connections up in a network of three brokers running in an active-active duplex setup. Whenever all three nodes are connected, the connection between the brokers drops on one of the nodes, and we see odd behaviour in both the ActiveMQ Web Console and the ActiveMQ logs.

Every time I refresh the web console, the number of connected brokers changes. Sometimes we see two connected brokers (assuming we are on broker A, I see connections to brokers B and C). At other times there is only one connection (to broker B). Sometimes, according to the web console, there are zero connections between the brokers. It can change with every refresh of the web console.

Each broker has a network connector configured like this:

    <amq:networkConnectors>
      <amq:networkConnector name="${amq.connector.name}"
                            userName="${amq.username}"
                            password="${amq.password}"
                            uri="${amq.broker.network.connector.uri}"
                            networkTTL="2"
                            duplex="true"/>
    </amq:networkConnectors>

As you can see, each broker has its own connector, meaning that:

    Broker A has amq.broker.network.connector.uri=static:(tcp://brokerB:61616,tcp://brokerC:61616)
    Broker B has amq.broker.network.connector.uri=static:(tcp://brokerA:61616,tcp://brokerC:61616)
    Broker C has amq.broker.network.connector.uri=static:(tcp://brokerA:61616,tcp://brokerB:61616)

Do you think it might be a problem that duplex=true is set on both sides of the connector?

KahaDB configuration:

    <amq:kahaDB directory="${amq.database.dir}"
                journalMaxFileLength="${amq.journal.max.file.length}"
                checksumJournalFiles="true"
                checkForCorruptJournalFiles="true"
                cleanupInterval="5000"
                checkpointInterval="1000"
                useLock="false">
    </amq:kahaDB>

Connector configuration:

    <amq:transportConnectors>
      <amq:transportConnector name="Connector" uri="${amq.broker.connector.uri}"/>
    </amq:transportConnectors>

where amq.broker.connector.uri = tcp://0.0.0.0:61616

At first, I noticed that we had a bug in the broker configuration: all brokers had the same name (something like "broker"). I have already changed this so that each name is suffixed with the node number, so the brokers are now named broker0, broker1, broker2. Is that a correct approach?

Moreover, below are some interesting exceptions that we see in the logs:

2019-09-13 09:02:45,804 ERROR [ActiveMQ BrokerService[brokerA] Task-3587] o.a.a.n.DemandForwardingBridgeSupport - Exception: org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://172.18.0.54:35706 on duplex forward of: ActiveMQTextMessage...

2019-09-13 09:03:16,840 TRACE [ActiveMQ Transport: tcp:///172.18.0.53:33396@61616] o.a.a.n.DemandForwardingBridgeSupport - serviceLocalException: disposed true ex
org.apache.activemq.transport.TransportDisposedIOException: Disposed due to prior exception
    at org.apache.activemq.transport.ResponseCorrelator.onException(ResponseCorrelator.java:125)
Caused by: java.io.EOFException: null
    at java.io.DataInputStream.readInt(DataInputStream.java:392)

2019-09-13 09:02:45,783 INFO [ActiveMQ BrokerService[brokerA] Task-3578] o.a.a.n.DemandForwardingBridgeSupport - Network connection between vm://brokerA#36640 and tcp:///172.18.0.54:35706@61616 shutdown due to a local error: {}
java.net.SocketException: Connection reset
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:115)

Sometimes we also see normal log messages like:

2019-09-13 09:02:45,146 INFO [triggerStartAsyncNetworkBridgeCreation: remoteBroker=tcp:///172.18.0.54:35706@61616, localBroker= vm://brokerA#36640] o.a.a.n.DemandForwardingBridgeSupport - Network connection between vm://brokerA#36640 and tcp:///172.18.0.54:35706@61616 (brokerC) has been established.

2019-09-13 09:02:46,711 INFO [triggerStartAsyncNetworkBridgeCreation: remoteBroker=tcp:///172.18.0.53:33252@61616, localBroker= vm://brokerA#36644] o.a.a.n.DemandForwardingBridgeSupport - Network connection between vm://brokerA#36644 and tcp:///172.18.0.53:33252@61616 (brokerB) has been established.

172.18.0.53 is the IP address of Broker B
172.18.0.54 is the IP address of Broker C

Also, we have noticed that the number of messages in the DLQ sometimes drops below zero. (We don't use any policy to clean up the messages; we remove them through our consumer, which prints the messages to the logs.)

What is more, the environment works properly with 2 nodes. Adding the third one makes it unstable.

We use ActiveMQ 5.15.0 with Java 8.

My question is: do you have any tips on how to approach this issue?
Maybe you have encountered a similar problem in the past. I'd be glad for any tips that you can provide. Please also let me know if you need further details or clarifications.
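
P.S. To make the topology concrete, here is a minimal Python sketch that counts how many network connectors get initiated between each pair of brokers, given the static URIs listed above. This is plain bookkeeping on our configuration, not ActiveMQ code: the names `STATIC_URIS`, `outbound_connectors` and `connectors_per_pair` are made up for illustration, and the sketch does not model how ActiveMQ actually handles the duplex flag.

```python
from collections import Counter

# Static URIs as listed in this post: broker -> remote brokers it dials.
# (Hypothetical structure for illustration; not an ActiveMQ API.)
STATIC_URIS = {
    "brokerA": ["brokerB", "brokerC"],
    "brokerB": ["brokerA", "brokerC"],
    "brokerC": ["brokerA", "brokerB"],
}


def outbound_connectors(static_uris):
    """Return one (initiator, remote) pair per static URI entry."""
    return [(src, dst) for src, remotes in static_uris.items() for dst in remotes]


def connectors_per_pair(static_uris):
    """Count initiating connectors for each unordered pair of brokers."""
    return Counter(frozenset(pair) for pair in outbound_connectors(static_uris))


if __name__ == "__main__":
    counts = connectors_per_pair(STATIC_URIS)
    for pair in sorted(counts, key=sorted):
        print(" <-> ".join(sorted(pair)), counts[pair])
```

With the URIs above, every pair of brokers ends up with two initiating connectors, one from each side, and each of them is configured with duplex="true". That is what prompts my question about duplex being set on both ends.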