Currently I am set up with ActiveMQ in a JDBC-backed Master/Slave configuration. My MySQL databases are set up in a Master/Master configuration. The application servers (using embedded ActiveMQ brokers) are set to use a single MySQL instance and fail over to the replicated master instance in case of failure.

Also note that I am running on EC2 instances, so my configuration needs to work in an environment where I can't guarantee the stability or lifetime of any one server. For example, I might bring up multiple application servers with embedded brokers to handle load over a few hours, then turn them down again. Instances that run databases or application servers could die for various reasons at any time and be replaced with new systems. Through all of this, I have to be able to guarantee that any messages passed to ActiveMQ get delivered (i.e., I can't lose messages due to instance failures, though delays could be acceptable in some cases).

The problem I am running into is that when the MySQL instance that the application servers are using goes down for some reason (i.e., I stop it), our JDBC/c3p0 pools fail over to the other MySQL instance. ActiveMQ, though, appears to start flaking out at this point and doesn't handle this fail-over cleanly.

Logs on first application server:

2011-02-04 18:17:04,968 INFO - JDBCPersistenceAdapter - No longer able to keep the exclusive lock so giving up being a master
2011-02-04 18:17:04,969 INFO - BrokerService - ActiveMQ Message Broker (localhost, ID:appserver1-43453-1296511712331-2:1) is shutting down
2011-02-04 18:17:05,978 WARN - FailoverTransport - Transport (/10.10.10.158:61616) failed to tcp://10.10.10.158:61616 , attempting to automatically reconnect due to: java.io.EOFException
2011-02-04 18:17:06,120 INFO - TransportConnector - Connector tcp://appserver1:61616 Stopped
2011-02-04 18:17:06,120 INFO - FailoverTransport - Successfully reconnected to tcp://10.10.10.28:61616
2011-02-04 18:17:06,143 INFO - PListStore - PListStore:activemq-data/localhost/tmp_storage stopped
2011-02-04 18:17:06,160 INFO - BrokerService - ActiveMQ JMS Message Broker (localhost, ID:appserver1-43453-1296511712331-2:1) stopped

Logs on second application server:

2011-02-04 18:17:04,839 INFO - ansportServerThreadSupport - Listening for connections at: tcp://appserver2:61616
2011-02-04 18:17:04,839 INFO - TransportConnector - Connector tcp://appserver2:61616 Started
2011-02-04 18:17:04,840 INFO - BrokerService - ActiveMQ JMS Message Broker (localhost, ID:appserver2-41346-1296511718756-2:2) started
2011-02-04 18:17:04,840 INFO - AsyncBrokerStarter - ActiveMQ Broker restarted.
2011-02-04 18:17:06,036 WARN - FailoverTransport - Transport (/10.10.10.158:61616) failed to tcp://10.10.10.158:61616 , attempting to automatically reconnect due to: java.io.EOFException
2011-02-04 18:17:06,050 INFO - FailoverTransport - Successfully reconnected to tcp://10.10.10.28:61616
2011-02-04 18:17:07,501 WARN - JDBCPersistenceAdapter - Error while closing connection: Duplicate entry '1131306' for key 'PRIMARY'
java.sql.BatchUpdateException: Duplicate entry '1131306' for key 'PRIMARY'
    at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2018)
    at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1449)
    at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeBatch(NewProxyPreparedStatement.java:1723)
    at org.apache.activemq.store.jdbc.TransactionContext.executeBatch(TransactionContext.java:98)
    at org.apache.activemq.store.jdbc.TransactionContext.executeBatch(TransactionContext.java:76)
    at org.apache.activemq.store.jdbc.TransactionContext.close(TransactionContext.java:124)
    at org.apache.activemq.store.jdbc.JDBCMessageStore.addMessage(JDBCMessageStore.java:91)
    at org.apache.activemq.store.memory.MemoryTransactionStore.addMessage(MemoryTransactionStore.java:281)
    at org.apache.activemq.store.memory.MemoryTransactionStore$1.asyncAddQueueMessage(MemoryTransactionStore.java:138)
    at org.apache.activemq.broker.region.Queue.doMessageSend(Queue.java:670)
    at org.apache.activemq.broker.region.Queue.send(Queue.java:644)
    at org.apache.activemq.broker.region.AbstractRegion.send(AbstractRegion.java:365)
    at org.apache.activemq.broker.region.RegionBroker.send(RegionBroker.java:518)
    at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129)
    at org.apache.activemq.broker.CompositeDestinationBroker.send(CompositeDestinationBroker.java:96)
    at org.apache.activemq.broker.TransactionBroker.send(TransactionBroker.java:227)
    at org.apache.activemq.broker.MutableBrokerFilter.send(MutableBrokerFilter.java:135)
    at org.apache.activemq.broker.TransportConnection.processMessage(TransportConnection.java:462)
    at org.apache.activemq.command.ActiveMQMessage.visit(ActiveMQMessage.java:677)
    at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:311)
    at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:185)
    at org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.java:69)
    at org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:113)
    at org.apache.activemq.transport.InactivityMonitor.onCommand(InactivityMonitor.java:228)
    at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
    at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:220)
    at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
    at java.lang.Thread.run(Thread.java:662)

I have also seen both brokers end up in a state where neither is master, and the only remedy is to kick the brokers (we are not using JMX at the moment, so this currently means kicking Tomcat).

It feels as though the brokers fail over to the secondary database server without knowing that they are indeed on a new database where the table lock doesn't exist, and so there is some contention around who is really the master broker. Is ActiveMQ expected to work in this scenario, where the JDBC store supports fail-over between multiple read/write databases, or am I playing with fire here?

So, with that said, I thought perhaps a network of brokers might be a better way to go.
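
For reference, the kind of network-of-brokers setup I have in mind would look roughly like the following on one of the application servers. This is only an untested sketch: it assumes each broker keeps its own local message store (no shared JDBC persistence adapter), the bean id and brokerName are made up for illustration, and the two addresses are simply the broker addresses from my current failover URL.

```xml
<!-- Sketch only, untested: embedded broker that listens on 10.10.10.28 and
     forwards messages to the peer broker on 10.10.10.158.
     Assumes a local, per-broker store instead of the shared JDBC store.
     The id "amqNetworkedBroker" and brokerName "broker-a" are hypothetical. -->
<bean id="amqNetworkedBroker" class="org.apache.activemq.broker.BrokerService">
  <!-- each broker in a network needs a unique name -->
  <property name="brokerName" value="broker-a"/>
  <property name="useJmx" value="false"/>
  <property name="persistent" value="true"/>
  <property name="transportConnectorURIs">
    <list>
      <value>tcp://10.10.10.28:61616</value>
    </list>
  </property>
  <!-- static discovery: a fixed list of peer brokers to bridge to -->
  <property name="networkConnectorURIs">
    <list>
      <value>static:(tcp://10.10.10.158:61616)</value>
    </list>
  </property>
</bean>
```

The second server would mirror this with the two addresses swapped and a different brokerName.
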
My problem here is that I can't guarantee that any one application server will be online forever and, as I understand it, messages in the queue are handed to a broker that then "owns" that message; if the broker were to go away forever, that message would be lost.

From the docs:

    Note though that a store and forward network is not a solution for message HA; if a broker fails in a Store and Forward network, the messages owned by that broker remain inside the broker's persistent store until the broker comes back online. If you need HA of messages then you need to use Master/Slave described above.

Configuration currently being used for Master/Slave JDBC:

<bean id="dataSource" class="com.mchange.v2.c3p0.ComboPooledDataSource">
  <property name="driverClass" value="com.mysql.jdbc.Driver"/>
  <property name="jdbcUrl" value="jdbc:mysql://10.10.10.178,10.10.10.225/database?loadBalanceBlacklistTimeout=5000&amp;loadBalanceStrategy=bestResponseTime&amp;autoReconnectForPools=true&amp;failOverReadOnly=false"/>
  <property name="user" value="username"/>
  <property name="password" value="password"/>
  <property name="initialPoolSize" value="12"/>
  <property name="minPoolSize" value="12"/>
  <property name="maxPoolSize" value="128"/>
  <property name="checkoutTimeout" value="20000"/>
  <property name="maxIdleTime" value="10800"/>
  <property name="acquireIncrement" value="6"/>
  <property name="acquireRetryAttempts" value="12"/>
  <property name="automaticTestTable" value="jdbc_pool_check"/>
  <property name="idleConnectionTestPeriod" value="3600"/>
  <property name="testConnectionOnCheckin" value="false"/>
</bean>

<!-- create an embedded ActiveMQ Broker -->
<bean id="jdbcPersistenceAdapter" class="org.apache.activemq.store.jdbc.JDBCPersistenceAdapter">
  <property name="dataSource" ref="dataSource"/>
  <property name="transactionIsolation" value="4"/>
</bean>

<bean id="amqBroker" class="org.apache.activemq.broker.BrokerService" scope="prototype">
  <property name="useJmx" value="false"/>
  <property name="persistent" value="true"/>
  <property name="persistenceAdapter" ref="jdbcPersistenceAdapter"/>
  <property name="transportConnectorURIs">
    <list>
      <value>tcp://10.10.10.28:61616</value>
    </list>
  </property>
  <property name="destinationPolicy">
    <bean class="org.apache.activemq.broker.region.policy.PolicyMap">
      <property name="policyEntries">
        <list>
          <bean class="org.apache.activemq.broker.region.policy.PolicyEntry">
            <property name="queue" value=">"/>
            <property name="deadLetterStrategy">
              <bean class="org.apache.activemq.broker.region.policy.IndividualDeadLetterStrategy">
                <property name="queuePrefix" value="DLQ."/>
              </bean>
            </property>
          </bean>
        </list>
      </property>
    </bean>
  </property>
</bean>

<bean id="jmsConnectionFactory" class="org.apache.activemq.pool.PooledConnectionFactory" destroy-method="stop">
  <property name="idleTimeout" value="43200000"/>
  <property name="connectionFactory">
    <bean class="org.apache.activemq.ActiveMQConnectionFactory">
      <property name="brokerURL" value="failover:(tcp://10.10.10.28:61616,tcp://10.10.10.158:61616)?randomize=false"/>
      <property name="redeliveryPolicy">
        <bean class="org.apache.activemq.RedeliveryPolicy">
          <property name="maximumRedeliveries" value="20"/>
          <property name="initialRedeliveryDelay" value="2000"/>
          <property name="useExponentialBackOff" value="true"/>
          <property name="backOffMultiplier" value="2"/>
        </bean>
      </property>
    </bean>
  </property>
</bean>

So basically I am looking for suggestions as to how best to do this. As mentioned above, we are working in a cloud environment where we can't guarantee the reliability of any one instance, so we need to take into account instances disappearing at any time. We also need to be able to scale out as necessary and not worry about losing messages as we turn down transient instances.

Thanks in advance,
-- Mike