Hello,

We have a problem establishing connections in a network of 3 brokers that
run in an active-active duplex setup. Whenever all three nodes are
connected, the connection between the brokers drops from one of the nodes,
with some odd behaviour visible in the ActiveMQ Web Console and in the
ActiveMQ logs.
Every time I refresh the web console, the number of connected brokers
changes.
Sometimes we see two connected brokers (on broker A, I see connections to
brokers B and C).
At other times there is only one connection (to broker B). Sometimes,
according to the web console, there are zero connections between the
brokers. This can change with every refresh of the web console.

Each broker has a network connector with configuration like:
<amq:networkConnectors>
    <amq:networkConnector name="${amq.connector.name}"
         userName="${amq.username}" password="${amq.password}"
         uri="${amq.broker.network.connector.uri}" networkTTL="2"
         duplex="true"/>
</amq:networkConnectors>

As you can see, each broker has its own connector, meaning that:
Broker A has amq.broker.network.connector.uri=static:(tcp://brokerB:61616,tcp://brokerC:61616)
Broker B has amq.broker.network.connector.uri=static:(tcp://brokerA:61616,tcp://brokerC:61616)
Broker C has amq.broker.network.connector.uri=static:(tcp://brokerA:61616,tcp://brokerB:61616)
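
To make the resulting topology explicit, here is a small Python sketch that
counts how many network connectors are started between each pair of brokers
with the URIs above. It is only an illustration, not part of our setup: the
names static_uris and bridges_per_pair are made up for this example, and it
assumes that every URI in a static: list results in one outgoing connection.

```python
from collections import Counter

# Static URIs configured on each broker, as listed above.
static_uris = {
    "brokerA": ["brokerB", "brokerC"],
    "brokerB": ["brokerA", "brokerC"],
    "brokerC": ["brokerA", "brokerB"],
}


def bridges_per_pair(uris):
    """Count outgoing network connectors per unordered pair of brokers."""
    counts = Counter()
    for local, remotes in uris.items():
        for remote in remotes:
            counts[frozenset((local, remote))] += 1
    return {tuple(sorted(pair)): n for pair, n in counts.items()}


for pair, n in sorted(bridges_per_pair(static_uris).items()):
    print(pair, n)
```

Under that assumption, every pair of brokers ends up with two connectors,
one started from each side, and each of them has duplex="true".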

Do you think it might be a problem that duplex=true is set on both sides of
each connection?

KahaDB configuration:
<amq:kahaDB directory="${amq.database.dir}"
        journalMaxFileLength="${amq.journal.max.file.length}"
        checksumJournalFiles="true"
        checkForCorruptJournalFiles="true"
        cleanupInterval="5000"
        checkpointInterval="1000"
        useLock="false">
</amq:kahaDB>

Connector configuration:
<amq:transportConnectors>
    <amq:transportConnector name="Connector" uri="${amq.broker.connector.uri}"/>
</amq:transportConnectors>

where
amq.broker.connector.uri = tcp://0.0.0.0:61616

At first, I noticed that we had a bug in the broker configuration: all
brokers had the same name ("broker"). I have already changed this so that
each name is suffixed with the node number, so the brokers are now named
broker0, broker1 and broker2. Is that the correct approach?

Moreover, below you can see some interesting exceptions that we can see in
the logs:

2019-09-13 09:02:45,804 ERROR [ActiveMQ BrokerService[brokerA] Task-3587] o.a.a.n.DemandForwardingBridgeSupport - Exception:
org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://172.18.0.54:35706 on duplex forward of: ActiveMQTextMessage...

2019-09-13 09:03:16,840 TRACE [ActiveMQ Transport: tcp:///172.18.0.53:33396@61616] o.a.a.n.DemandForwardingBridgeSupport - serviceLocalException: disposed true ex
org.apache.activemq.transport.TransportDisposedIOException: Disposed due to prior exception
        at org.apache.activemq.transport.ResponseCorrelator.onException(ResponseCorrelator.java:125)
Caused by: java.io.EOFException: null
        at java.io.DataInputStream.readInt(DataInputStream.java:392)

2019-09-13 09:02:45,783 INFO [ActiveMQ BrokerService[brokerA] Task-3578] o.a.a.n.DemandForwardingBridgeSupport - Network connection between vm://brokerA#36640 and tcp:///172.18.0.54:35706@61616 shutdown due to a local error: {}
java.net.SocketException: Connection reset
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:115)

Sometimes we can also see correct log messages like:

2019-09-13 09:02:45,146 INFO [triggerStartAsyncNetworkBridgeCreation: remoteBroker=tcp:///172.18.0.54:35706@61616, localBroker= vm://brokerA#36640] o.a.a.n.DemandForwardingBridgeSupport - Network connection between vm://brokerA#36640 and tcp:///172.18.0.54:35706@61616 (brokerC) has been established.

2019-09-13 09:02:46,711 INFO [triggerStartAsyncNetworkBridgeCreation: remoteBroker=tcp:///172.18.0.53:33252@61616, localBroker= vm://brokerA#36644] o.a.a.n.DemandForwardingBridgeSupport - Network connection between vm://brokerA#36644 and tcp:///172.18.0.53:33252@61616 (brokeB) has been established.

172.18.0.53 is the IP address of Broker B
172.18.0.54 is the IP address of Broker C
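
To see how often each bridge goes up and down, the messages above could be
tallied per remote IP with a short Python sketch like the one below. This is
only a rough illustration: the pattern is based solely on the
DemandForwardingBridgeSupport messages quoted in this mail, it assumes that
each log entry sits on a single line, and the names tally and sample are
made up for the example.

```python
import re
from collections import Counter

# Matches the remote side of "Network connection between vm://... and tcp:///IP:PORT@PORT".
REMOTE = re.compile(r"and tcp:///(\d+\.\d+\.\d+\.\d+):\d+@\d+")
ESTABLISHED = "has been established"
SHUTDOWN = "shutdown due to"


def tally(lines):
    """Return {remote_ip: Counter} with 'established' and 'shutdown' counts."""
    result = {}
    for line in lines:
        match = REMOTE.search(line)
        if not match:
            continue  # not a bridge lifecycle message
        events = result.setdefault(match.group(1), Counter())
        if ESTABLISHED in line:
            events["established"] += 1
        elif SHUTDOWN in line:
            events["shutdown"] += 1
    return result


# Message parts taken from the log excerpts quoted above.
sample = [
    "Network connection between vm://brokerA#36640 and "
    "tcp:///172.18.0.54:35706@61616 (brokerC) has been established.",
    "Network connection between vm://brokerA#36640 and "
    "tcp:///172.18.0.54:35706@61616 shutdown due to a local error: {}",
    "Network connection between vm://brokerA#36644 and "
    "tcp:///172.18.0.53:33252@61616 (brokeB) has been established.",
]

for ip, events in sorted(tally(sample).items()):
    print(ip, dict(events))
```

Running it over the full log file (for example by passing the open file
object to tally) should show which remote broker loses its bridge most often.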

Also, we have noticed that the number of messages in the DLQ sometimes
drops below zero. (We don't use any policy to clean up the messages; we
remove them through our consumer, which prints the messages to the logs.)

What is more, the environment works properly with 2 nodes. Adding the third
one makes it unstable.
We use ActiveMQ 5.15.0 with Java 8.
My question is: do you have any tips on how to approach this issue? Maybe
you have encountered a similar problem in the past. I'd be glad for any
tips that you can provide.
Let me know if you need further details or clarifications.
