I'm able to reproduce this issue in a much simpler way. Instead of starting
up brokers from within my application, I am using a clustered ActiveMQ setup
with 2 instances as follows. Both instances share the same data directory so
that only one broker is active at a time.

broker-1 has the following transport connector in activemq.xml:
<transportConnectors>
<transportConnector name="openwire" discoveryUri="multicast://default"
updateClusterClients="true" uri="tcp://0.0.0.0:61616"/>
</transportConnectors>

broker-2 has the following transport connector in activemq.xml:
<transportConnectors>
<transportConnector name="openwire" discoveryUri="multicast://default"
updateClusterClients="true" uri="tcp://0.0.0.0:61617"/>
</transportConnectors>

My J2EE application creates a connection with a broker url of
failover:(tcp://<host>:61616,tcp://<host>:61617)
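
For reference, here is a minimal sketch of how a broker URL of that shape can be assembled. The class name FailoverUrls and the host broker.example.com are placeholders for illustration only; they are not part of my application or of ActiveMQ.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper: builds a failover URL listing one tcp:// endpoint per port.
final class FailoverUrls {
    private FailoverUrls() {}

    // Returns e.g. failover:(tcp://host:61616,tcp://host:61617)
    static String build(String host, List<Integer> ports) {
        StringJoiner brokers = new StringJoiner(",", "failover:(", ")");
        for (int port : ports) {
            brokers.add("tcp://" + host + ":" + port);
        }
        return brokers.toString();
    }

    public static void main(String[] args) {
        // Placeholder host; substitute the machine that runs both brokers.
        System.out.println(build("broker.example.com", List.of(61616, 61617)));
    }
}
```

The resulting string is what would be handed to the connection factory as the broker URL.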

The use case that fails is as follows:
1) I start broker-1 then broker-2. broker-1 is the master broker and holds
the lock on the data directory. I start my J2EE application and it creates a
connection to broker-1. I send 2 messages using that connection.
2) While the messages are being processed, I shut down broker-1.
3) broker-2 becomes the new master broker. Through failover transport, my
connection is now connected to broker-2 at port 61617.
4) The messages continue to be consumed after the connection has switched
over to broker-2, but when the transactions are being committed,
TransactionContext#end hangs indefinitely.

Debugging this, I found that end leads to ResponseCorrelator#request, which
sends a TransactionInfo command. The TransactionInfo command is consumed and
creates a response command, which is sent correctly. The problem seems to be
that this response command is never read in TcpTransport#doRun. Because of
this, ResponseCorrelator#request blocks when trying to return
response.getResult().
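
To make the suspected mechanism concrete, here is a simplified model of request/response correlation. It is only a sketch and is not ActiveMQ's actual code: CorrelatorSketch and its methods are hypothetical names, and the bounded wait is something I added for the demo. The point it illustrates is that the caller can only return once a reader thread delivers the matching response, so if that response is never read, an unbounded wait never returns.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a response correlator; not the ActiveMQ implementation.
final class CorrelatorSketch {
    private final Map<Integer, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    // Registers a pending request and returns its correlation id.
    // A real transport would also write the command to the socket here.
    int send(String command) {
        int id = nextId.incrementAndGet();
        pending.put(id, new CompletableFuture<>());
        return id;
    }

    // Called by the reader thread when a response carrying this id arrives.
    void onResponse(int id, String result) {
        CompletableFuture<String> future = pending.get(id);
        if (future != null) {
            future.complete(result);
        }
    }

    // Waits up to timeoutMillis for the response; empty means nothing was read in time.
    Optional<String> await(int id, long timeoutMillis) {
        CompletableFuture<String> future = pending.get(id);
        if (future == null) {
            throw new IllegalArgumentException("unknown request id " + id);
        }
        try {
            return Optional.of(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("request failed", e.getCause());
        } finally {
            pending.remove(id);
        }
    }

    public static void main(String[] args) {
        CorrelatorSketch correlator = new CorrelatorSketch();

        // Case 1: a reader thread delivers the response, so the wait completes.
        int delivered = correlator.send("commit");
        new Thread(() -> correlator.onResponse(delivered, "ok")).start();
        System.out.println("delivered: " + correlator.await(delivered, 1000));

        // Case 2: no one ever reads the response; without the timeout this would block forever.
        int lost = correlator.send("commit");
        System.out.println("lost: " + correlator.await(lost, 200));
    }
}
```

The second case in main corresponds to what I see after failover: the response exists on the broker side, but nothing on the client completes the pending request.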

Does anyone have any insight into this? This issue is blocking me from
using master/slave with a shared data directory, which I need in order to
ensure high availability and immediate failover.


