Our problem with Oracle is this. If the active (hot) instance becomes
disconnected, and the changes we made to Oracle succeed in timing out its
connection and releasing the lock on the database, a secondary (standby)
instance begins processing. All is well until the previous instance returns
to the network. What we are finding is that it creates a new session with
Oracle and begins processing in parallel without attempting to acquire the
lock on the DB. Now we have two instances of ActiveMQ running at the same
time.

Any advice on the best method? 

I see there have been some problems with persistence store corruption with
NFS as well.
http://old.nabble.com/Failover-and-Fail-BACK-td28198179.html#a28222719

Is ActiveMQ not ready for production enterprise networks or is there a
better method of implementing H.A.?


For Oracle, the master instance of ActiveMQ obtains a lock on the database
using a "select for update" SQL statement.
It appears that when you pull the plug, the data store does not detect the
stale connection quickly enough for your requirements.
You can shorten the time needed to detect the stale connection by tuning the
TCP keepalive parameters (OS specific) to meet your uptime requirements.
When using Oracle, setting 'ENABLE=BROKEN' in the connect descriptor in
tnsnames.ora enables use of the keepalive packets.
Oracle also allows the server to ping the client at regular intervals, set by
sqlnet.expire_time (in minutes!).
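
As a rough sketch of where the two Oracle settings mentioned above usually
live: the alias, host name, port and service name below are placeholders I
made up, and the one-minute value is only an example, not a recommendation.
If the broker connects through the JDBC thin driver rather than a TNS alias,
my understanding is that the same DESCRIPTION clause can be given directly in
the JDBC URL, but please verify that against your driver version.

```
# tnsnames.ora on the ActiveMQ host (client side); names are placeholders
AMQDB =
  (DESCRIPTION =
    (ENABLE = BROKEN)
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = orcl))
  )

# sqlnet.ora on the database server; value is in minutes
SQLNET.EXPIRE_TIME = 1
```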


As always, do your testing in an environment that mimics your production
environment first. You may have to use trial and error to find the right
settings for your OS and data store.
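
To pick starting values for that trial and error, it can help to estimate how
long TCP keepalive takes to give up on a dead peer. The sketch below is only
an approximation: it assumes the usual three knobs (idle time before the first
probe, interval between probes, number of unanswered probes), and the function
name is my own. The first call uses the stock Linux defaults; the "tuned"
numbers are illustrative, not a recommendation.

```python
def worst_case_detection_seconds(idle_s, interval_s, probes):
    """Approximate time for TCP keepalive to declare a silent peer dead.

    idle_s     -- seconds of idle time before the first probe is sent
    interval_s -- seconds between successive probes
    probes     -- number of unanswered probes before the connection is dropped
    """
    return idle_s + interval_s * probes


# Stock Linux defaults: tcp_keepalive_time=7200, tcp_keepalive_intvl=75,
# tcp_keepalive_probes=9
linux_default = worst_case_detection_seconds(7200, 75, 9)

# Illustrative tuned values (assumption, adjust to your uptime requirements)
tuned = worst_case_detection_seconds(60, 10, 3)

print(linux_default)  # 7875 seconds, a bit over two hours
print(tuned)          # 90 seconds
```

With the defaults, a pulled cable can leave the database lock held for more
than two hours, which is why the standby never takes over in a useful time
frame unless these values are lowered.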

-- 
View this message in context: 
http://old.nabble.com/Noob-Questions---Fail-over---Redundancy-Help.-tp29057308p29090284.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
