Hi all,

We have an environment as follows:
  ActiveMQ 5.12.0 on 3 nodes, using ZooKeeper for master election
  ZooKeeper 3.4.6 on the same 3 nodes
  Java 1.8
  RHEL Server 7.1

We can start up and verify that ActiveMQ failover is working by sending and
consuming messages from different machines while taking ActiveMQ nodes up
and down, and everything looks fine.
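
For context, our brokers use the replicated LevelDB store, configured roughly like this (hostnames, ports, and paths below are placeholders, not our exact values):

```xml
<persistenceAdapter>
  <replicatedLevelDB
      directory="${activemq.data}/leveldb"
      replicas="3"
      bind="tcp://0.0.0.0:61619"
      zkAddress="zk1:2181,zk2:2181,zk3:2181"
      zkPath="/activemq/leveldb-stores"
      hostname="broker1"/>
</persistenceAdapter>
```

The test clients connect with the standard failover transport, e.g. failover:(tcp://broker1:61616,tcp://broker2:61616,tcp://broker3:61616).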

Then, after some indeterminate amount of time, things stop working and
jstack turns up this:

Found one Java-level deadlock:
=============================
"ActiveMQ BrokerService[activeMqBroker] Task-26":
  waiting to lock monitor 0x00007f4520004e68 (object 0x00000000d5cbfe80, a org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup),
  which is held by "ZooKeeper state change dispatcher thread"
"ZooKeeper state change dispatcher thread":
  waiting to lock monitor 0x00007f451c00ee38 (object 0x00000000d5cf1e80, a org.apache.activemq.leveldb.replicated.MasterElector),
  which is held by "ActiveMQ BrokerService[activeMqBroker] Task-25"
"ActiveMQ BrokerService[activeMqBroker] Task-25":
  waiting to lock monitor 0x00007f4520004e68 (object 0x00000000d5cbfe80, a org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup),
  which is held by "ZooKeeper state change dispatcher thread"

Java stack information for the threads listed above:
===================================================
"ActiveMQ BrokerService[activeMqBroker] Task-26":
        at
org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup.close(ZooKeeperGroup.scala:100)
        - waiting to lock <0x00000000d5cbfe80> (a
org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup)
        at
org.apache.activemq.leveldb.replicated.ElectingLevelDBStore.doStop(ElectingLevelDBStore.scala:282)
        at org.apache.activemq.util.ServiceSupport.stop(ServiceSupport.java:71)
        at org.apache.activemq.util.ServiceStopper.stop(ServiceStopper.java:41)
        at org.apache.activemq.broker.BrokerService.stop(BrokerService.java:806)
        at
org.apache.activemq.xbean.XBeanBrokerService.stop(XBeanBrokerService.java:122)
        at
org.apache.activemq.leveldb.replicated.ElectingLevelDBStore$$anonfun$stop_master$2.apply$mcV$sp(ElectingLevelDBStore.scala:259)
        at 
org.fusesource.hawtdispatch.package$$anon$4.run(hawtdispatch.scala:330)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
"ZooKeeper state change dispatcher thread":
        at
org.apache.activemq.leveldb.replicated.groups.ClusteredSingletonWatcher.changed_decoded(ClusteredSingleton.scala:155)
        - waiting to lock <0x00000000d5cf1e80> (a
org.apache.activemq.leveldb.replicated.MasterElector)
        at
org.apache.activemq.leveldb.replicated.groups.ClusteredSingletonWatcher$$anon$2.changed(ClusteredSingleton.scala:108)
        at
org.apache.activemq.leveldb.replicated.groups.ChangeListenerSupport$$anonfun$fireChanged$1$$anonfun$apply$mcV$sp$3.apply(ChangeListener.scala:89)
        at
org.apache.activemq.leveldb.replicated.groups.ChangeListenerSupport$$anonfun$fireChanged$1$$anonfun$apply$mcV$sp$3.apply(ChangeListener.scala:88)
        at scala.collection.immutable.List.foreach(List.scala:383)
        at
org.apache.activemq.leveldb.replicated.groups.ChangeListenerSupport$$anonfun$fireChanged$1.apply$mcV$sp(ChangeListener.scala:88)
        at
org.apache.activemq.leveldb.replicated.groups.ChangeListenerSupport$$anonfun$fireChanged$1.apply(ChangeListener.scala:88)
        at
org.apache.activemq.leveldb.replicated.groups.ChangeListenerSupport$$anonfun$fireChanged$1.apply(ChangeListener.scala:88)
        at
org.apache.activemq.leveldb.replicated.groups.ChangeListenerSupport$class.check_elapsed_time(ChangeListener.scala:97)
        at
org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup.check_elapsed_time(ZooKeeperGroup.scala:73)
        at
org.apache.activemq.leveldb.replicated.groups.ChangeListenerSupport$class.fireChanged(ChangeListener.scala:87)
        at
org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup.fireChanged(ZooKeeperGroup.scala:73)
        at
org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup.org$apache$activemq$leveldb$replicated$groups$ZooKeeperGroup$$fire_cluster_change(ZooKeeperGroup.scala:182)
        at
org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup$$anon$1.onEvents(ZooKeeperGroup.scala:90)
        at
org.linkedin.zookeeper.tracker.ZooKeeperTreeTracker.raiseEvents(ZooKeeperTreeTracker.java:402)
        at
org.linkedin.zookeeper.tracker.ZooKeeperTreeTracker.track(ZooKeeperTreeTracker.java:240)
        at
org.linkedin.zookeeper.tracker.ZooKeeperTreeTracker.track(ZooKeeperTreeTracker.java:228)
        at
org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup.onConnected(ZooKeeperGroup.scala:124)
        - locked <0x00000000d5cbfe80> (a
org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup)
        at
org.apache.activemq.leveldb.replicated.groups.ZKClient.callListeners(ZKClient.java:385)
        at
org.apache.activemq.leveldb.replicated.groups.ZKClient$StateChangeDispatcher.run(ZKClient.java:354)
"ActiveMQ BrokerService[activeMqBroker] Task-25":
        at
org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup.update(ZooKeeperGroup.scala:143)
        - waiting to lock <0x00000000d5cbfe80> (a
org.apache.activemq.leveldb.replicated.groups.ZooKeeperGroup)
        at
org.apache.activemq.leveldb.replicated.groups.ClusteredSingleton.join(ClusteredSingleton.scala:212)
        - locked <0x00000000d5cf1e80> (a
org.apache.activemq.leveldb.replicated.MasterElector)
        at
org.apache.activemq.leveldb.replicated.MasterElector.update(MasterElector.scala:90)
        - locked <0x00000000d5cf1e80> (a
org.apache.activemq.leveldb.replicated.MasterElector)
        at
org.apache.activemq.leveldb.replicated.MasterElector$change_listener$.changed(MasterElector.scala:243)
        - locked <0x00000000d5cf1e80> (a
org.apache.activemq.leveldb.replicated.MasterElector)
        at
org.apache.activemq.leveldb.replicated.MasterElector$change_listener$$anonfun$changed$1.apply$mcV$sp(MasterElector.scala:191)
        - locked <0x00000000d5cf1e80> (a
org.apache.activemq.leveldb.replicated.MasterElector)
        at
org.apache.activemq.leveldb.replicated.ElectingLevelDBStore$$anonfun$stop_master$1.apply$mcV$sp(ElectingLevelDBStore.scala:252)
        at 
org.fusesource.hawtdispatch.package$$anon$4.run(hawtdispatch.scala:330)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Found 1 deadlock.

For what it's worth, we're not sending a huge amount of data around.
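
As far as I can tell from the trace, it's a classic lock-ordering cycle: the broker task holds the MasterElector monitor and waits for the ZooKeeperGroup monitor, while the ZooKeeper state change dispatcher holds the ZooKeeperGroup monitor (taken in onConnected) and waits for the MasterElector. Here's a minimal stand-alone sketch of that pattern; the class and thread names are stand-ins for the ActiveMQ internals, not the real code:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Two threads take the same two monitors in opposite orders, reproducing
// the cycle jstack reports above. "group" plays ZooKeeperGroup, "elector"
// plays MasterElector; neither is the actual ActiveMQ class.
public class DeadlockDemo {
    private static final Object group = new Object();
    private static final Object elector = new Object();

    static Thread locker(String name, Object first, Object second) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                try { Thread.sleep(200); } catch (InterruptedException ignored) {}
                synchronized (second) { /* never reached once both hold their first lock */ }
            }
        }, name);
        t.setDaemon(true); // let the JVM exit even though these threads never finish
        return t;
    }

    /** Starts both threads and returns how many the JVM reports as deadlocked. */
    public static int deadlockedCount() throws InterruptedException {
        Thread dispatcher = locker("state-change-dispatcher", group, elector);
        Thread brokerTask = locker("broker-task", elector, group);
        dispatcher.start();
        brokerTask.start();
        Thread.sleep(1000); // give both threads time to grab their first monitor
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // same detection jstack performs
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + deadlockedCount());
    }
}
```

On a typical run this should print "deadlocked threads: 2". In our case both sides are inside ActiveMQ, so I don't see anything we can change on the client side to avoid it.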

--
View this message in context: http://activemq.2283324.n4.nabble.com/ActiveMQ-failover-Java-level-deadlock-tp4705128.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
