Hi,
I've recently set up an ActiveMQ installation that uses ZooKeeper as its
cluster manager.

Our setup is: 
12 contributors writing to two Virtual topics
3 Queues (each with 12 consumers)
Approx. 120 messages per second

The activemq.xml is mostly default; here are the changes:
   <broker xmlns="http://activemq.apache.org/schema/core"
brokerName="localhost" dataDirectory="${activemq.data}"
schedulePeriodForDestinationPurge="10000"> (to remove old queues)

                <policyEntry queue=">" gcInactiveDestinations="true"
inactiveTimoutBeforeGC="30000"/>

    <replicatedLevelDB
          directory="/opt/activemq/activemq-data"
          replicas="3"
          bind="tcp://0.0.0.0:0"
          zkAddress="amqserver1:2181,amqserver2:2181,amqserver3:2181"
          zkPassword="password"
          zkPath="/activemq/leveldb-stores"
          hostname="amqserver1"
          />
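
For reference, the quorum arithmetic behind replicas="3" can be sketched as
below. This is only an illustration of the documented rule that at least
(replicas/2)+1 nodes must be online; the function name quorum_size is made up
for this example and is not part of ActiveMQ.

```python
def quorum_size(replicas: int) -> int:
    """Minimum number of nodes that must be online: (replicas / 2) + 1."""
    return replicas // 2 + 1


replicas = 3  # taken from replicas="3" in the store config above
needed = quorum_size(replicas)
tolerated = replicas - needed  # nodes that can be lost before an outage

print(f"replicas={replicas}: quorum={needed}, tolerated failures={tolerated}")
```

With three replicas, losing contact with a second node (even briefly) is
enough to drop below quorum.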

zoo.cfg has:

# tickTime was increased because, in our testing with the default value,
# network congestion could trigger a failover event.
tickTime=8000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=amqserver1:2888:3888
server.2=amqserver2:2888:3888
server.3=amqserver3:2888:3888
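
To make the effect of the larger tickTime concrete, here is a rough sketch of
the time windows these values imply. It relies on ZooKeeper's documented
behaviour that initLimit and syncLimit are counted in ticks and that the
minimum and maximum session timeouts default to 2 and 20 times tickTime. The
2000 ms requested session timeout is an assumption: it is the documented
default of zkSessionTimeout for the replicated LevelDB store, which the config
above does not override.

```python
tick_time_ms = 8000  # tickTime from zoo.cfg above
init_limit = 10      # initLimit, in ticks
sync_limit = 5       # syncLimit, in ticks

init_window_ms = init_limit * tick_time_ms  # time a follower has to connect and sync
sync_window_ms = sync_limit * tick_time_ms  # how far a follower may lag the leader

# ZooKeeper defaults when minSessionTimeout/maxSessionTimeout are not set.
min_session_ms = 2 * tick_time_ms
max_session_ms = 20 * tick_time_ms


def negotiated_session_timeout(requested_ms: int) -> int:
    """Clamp a client's requested session timeout into the server's range."""
    return max(min_session_ms, min(max_session_ms, requested_ms))


# ASSUMPTION: the broker requests 2000 ms (store default, not set in the post).
requested_ms = 2000

print("initLimit window (ms):", init_window_ms)
print("syncLimit window (ms):", sync_window_ms)
print("session timeout range (ms):", min_session_ms, "to", max_session_ms)
print("negotiated session timeout (ms):", negotiated_session_timeout(requested_ms))
```

If that assumption holds, the session timeout the brokers actually get is
decided by tickTime rather than by the store setting, which may be worth
keeping in mind when reading the timestamps in the log below.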


The setup worked fine for about a week, but then the system started failing
over from one cluster member to another, repeatedly, every few minutes. After
we shut down some of the consumers (leaving just one Virtual Queue), the
system regained stability. Neither CPU load nor I/O activity was high before
or during the failover events.

2015-12-12 01:33:25,784 | INFO  | Attaching... Downloaded 3887.83/3887.85 kb
and 3/4 files | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore |
hawtdispatch-DEFAULT-1
2015-12-12 01:33:25,786 | INFO  | Attaching... Downloaded 3887.85/3887.85 kb
and 4/4 files | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore |
hawtdispatch-DEFAULT-1
2015-12-12 01:33:25,788 | INFO  | Attached |
org.apache.activemq.leveldb.replicated.SlaveLevelDBStore |
hawtdispatch-DEFAULT-1
2015-12-12 02:07:32,737 | INFO  | Not enough cluster members have reported
their update positions yet. |
org.apache.activemq.leveldb.replicated.MasterElector | main-EventThread
2015-12-12 02:07:32,823 | INFO  | Slave stopped |
org.apache.activemq.leveldb.replicated.MasterElector | ActiveMQ
BrokerService[localhost] Task-4
2015-12-12 02:07:32,825 | INFO  | Not enough cluster members have reported
their update positions yet. |
org.apache.activemq.leveldb.replicated.MasterElector | ActiveMQ
BrokerService[localhost] Task-4
2015-12-12 02:07:32,832 | INFO  | Not enough cluster members have reported
their update positions yet. |
org.apache.activemq.leveldb.replicated.MasterElector | main-EventThread
2015-12-12 02:07:32,871 | INFO  | Promoted to master |
org.apache.activemq.leveldb.replicated.MasterElector | main-EventThread
2015-12-12 02:07:32,909 | INFO  | Using the pure java LevelDB
implementation. | org.apache.activemq.leveldb.LevelDBClient | ActiveMQ
BrokerService[localhost] Task-4
2015-12-12 02:07:36,060 | INFO  | Master started: tcp://amqserver2:55655 |
org.apache.activemq.leveldb.replicated.MasterElector | ActiveMQ
BrokerService[localhost] Task-5
2015-12-12 02:07:36,423 | INFO  | Slave has connected:
675aa794-3d5c-48f4-83ff-602777b8a53b |
org.apache.activemq.leveldb.replicated.MasterLevelDBStore |
hawtdispatch-DEFAULT-2
2015-12-12 02:07:36,486 | INFO  | Slave has connected:
9bc02001-fc26-458a-8385-ac73f3ace8f0 |
org.apache.activemq.leveldb.replicated.MasterLevelDBStore |
hawtdispatch-DEFAULT-2
2015-12-12 02:07:36,932 | INFO  | Slave has now caught up:
675aa794-3d5c-48f4-83ff-602777b8a53b |
org.apache.activemq.leveldb.replicated.MasterLevelDBStore |
hawtdispatch-DEFAULT-2
2015-12-12 02:07:37,022 | INFO  | Slave has now caught up:
9bc02001-fc26-458a-8385-ac73f3ace8f0 |
org.apache.activemq.leveldb.replicated.MasterLevelDBStore |
hawtdispatch-DEFAULT-2
2015-12-12 02:07:37,096 | INFO  | Installing Discarding Dead Letter Queue
broker plugin[dropAll=true; dropTemporaryTopics=true;
dropTemporaryQueues=true; dropOnly=null; reportInterval=1000] |
org.apache.activemq.plugin.DiscardingDLQBrokerPlugin | main
2015-12-12 02:07:37,310 | INFO  | Apache ActiveMQ 5.12.0 (localhost,
ID:amqserver2.emea.kuoni.int-44654-1449886056958-0:1) is starting |
org.apache.activemq.broker.BrokerService | main
2015-12-12 02:07:37,331 | INFO  | Listening for connections at:
tcp://amqserver2.emea.kuoni.int:61616?maximumConnections=1000&wireFormat.maxFrameSize=104857600
| org.apache.activemq.transport.TransportServerThreadSupport | main
2015-12-12 02:07:37,332 | INFO  | Connector openwire started |
org.apache.activemq.broker.TransportConnector | main
2015-12-12 02:07:37,335 | INFO  | Listening for connections at:
amqp://amqserver2.emea.kuoni.int:5672?maximumConnections=1000&wireFormat.maxFrameSize=104857600
| org.apache.activemq.transport.TransportServerThreadSupport | main
2015-12-12 02:07:37,336 | INFO  | Connector amqp started |
org.apache.activemq.broker.TransportConnector | main
2015-12-12 02:07:37,340 | INFO  | Listening for connections at:
stomp://amqserver2.emea.kuoni.int:61613?maximumConnections=1000&wireFormat.maxFrameSize=104857600
| org.apache.activemq.transport.TransportServerThreadSupport | main
2015-12-12 02:07:37,341 | INFO  | Connector stomp started |
org.apache.activemq.broker.TransportConnector | main
2015-12-12 02:07:37,344 | INFO  | Listening for connections at:
mqtt://amqserver2.emea.kuoni.int:1883?maximumConnections=1000&wireFormat.maxFrameSize=104857600
| org.apache.activemq.transport.TransportServerThreadSupport | main
2015-12-12 02:07:37,345 | INFO  | Connector mqtt started |
org.apache.activemq.broker.TransportConnector | main
2015-12-12 02:07:37,432 | INFO  | Listening for connections at
ws://amqserver2.emea.kuoni.int:61614?maximumConnections=1000&wireFormat.maxFrameSize=104857600
| org.apache.activemq.transport.ws.WSTransportServer | main
2015-12-12 02:07:37,438 | INFO  | Connector ws started |
org.apache.activemq.broker.TransportConnector | main
2015-12-12 02:07:37,439 | INFO  | Apache ActiveMQ 5.12.0 (localhost,
ID:amqserver2.emea.kuoni.int-44654-1449886056958-0:1) started |
org.apache.activemq.broker.BrokerService | main
2015-12-12 02:07:37,440 | INFO  | For help or more information please see:
http://activemq.apache.org | org.apache.activemq.broker.BrokerService | main
2015-12-12 02:07:37,774 | INFO  | ActiveMQ WebConsole available at
http://0.0.0.0:8161/ | org.apache.activemq.web.WebConsoleStarter | main
2015-12-12 02:07:37,775 | INFO  | ActiveMQ Jolokia REST API available at
http://0.0.0.0:8161/api/jolokia/ | org.apache.activemq.web.WebConsoleStarter
| main
2015-12-12 02:07:37,814 | INFO  | Initializing Spring FrameworkServlet
'dispatcher' | /admin | main
2015-12-12 02:07:38,054 | INFO  | jolokia-agent: No access restrictor found
at classpath:/jolokia-access.xml, access to all MBeans is allowed | /api |
main
2015-12-12 02:08:20,184 | INFO  | Stopping BrokerService[localhost] due to
exception, java.io.IOException |
org.apache.activemq.util.DefaultIOExceptionHandler | LevelDB IOException
handler.
java.io.IOException
        at
org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:39)[activemq-client-5.12.0.jar:5.12.0]
        at
org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:552)[activemq-leveldb-store-5.12.0.jar:5.12.0]
        at
org.apache.activemq.leveldb.LevelDBClient.might_fail_using_index(LevelDBClient.scala:1044)[activemq-leveldb-store-5.12.0.jar:5.12.0]
        at
org.apache.activemq.leveldb.LevelDBClient.store(LevelDBClient.scala:1390)[activemq-leveldb-store-5.12.0.jar:5.12.0]
        at
org.apache.activemq.leveldb.DBManager$$anonfun$drainFlushes$1.apply$mcV$sp(DBManager.scala:627)[activemq-leveldb-store-5.12.0.jar:5.12.0]
        at
org.fusesource.hawtdispatch.package$$anon$4.run(hawtdispatch.scala:330)[hawtdispatch-scala-2.11-1.21.jar:1.21]
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_51]
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_51]
        at java.lang.Thread.run(Thread.java:744)[:1.7.0_51]
2015-12-12 02:08:20,189 | INFO  | Apache ActiveMQ 5.12.0 (localhost,
ID:amqserver2.emea.kuoni.int-44654-1449886056958-0:1) is shutting down |
org.apache.activemq.broker.BrokerService | IOExceptionHandler: stopping
BrokerService[localhost]
2015-12-12 02:08:20,238 | WARN  | Transport Connection to:
tcp://10.241.163.73:52172 failed: java.io.IOException: Unexpected error
occurred: org.apache.activemq.broker.BrokerStoppedException: Broker
BrokerService[localhost] is being stopped |
org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
Transport: tcp:///10.241.163.73:52172@61616
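
In case it helps anyone compare against their own logs, below is a minimal
sketch of how the failover events can be pulled out of activemq.log to see how
long each master survives. The marker strings are copied from the log above;
the function name failover_events and the regular expression are made up for
this example, and the sample lines are the two relevant entries from the log
above, rejoined onto single lines as they would appear in the actual log file.

```python
import re
from datetime import datetime

# Matches "<timestamp> | <LEVEL> | <message> | <logger> | <thread>" lines.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \| \w+\s*\| (.*)$")

# Message prefixes that mark a change of role or a broker stop.
EVENTS = ("Promoted to master", "Slave stopped", "Stopping BrokerService")


def failover_events(lines):
    """Return (timestamp, marker) pairs for every failover-related log line."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # stack-trace lines and other continuation lines
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        for marker in EVENTS:
            if match.group(2).startswith(marker):
                events.append((stamp, marker))
    return events


sample = [
    "2015-12-12 02:07:32,871 | INFO  | Promoted to master | "
    "org.apache.activemq.leveldb.replicated.MasterElector | main-EventThread",
    "2015-12-12 02:08:20,184 | INFO  | Stopping BrokerService[localhost] due to "
    "exception, java.io.IOException | "
    "org.apache.activemq.util.DefaultIOExceptionHandler | LevelDB IOException handler.",
]

events = failover_events(sample)
uptime = (events[1][0] - events[0][0]).total_seconds()
print(f"master lasted {uptime:.3f} s between promotion and stop")
```

For the excerpt above, that is roughly 47 seconds between "Promoted to master"
and the IOException that stops the broker.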



Does this look like a bug? Or a misconfiguration somewhere? Or are the
servers under-resourced?

All three AMQ servers have the same spec: VMware, CentOS 6, 2 x vCPU, 8GB
RAM

Thanks, 

Damian



--
View this message in context: 
http://activemq.2283324.n4.nabble.com/ActiveMQ-Zookeeper-cluster-ping-ponging-tp4704920.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
