Is there any consistency in which sensor brokers succeed and which fail, or is it random every time?
If you're able to download the source code for your version of ActiveMQ and attach a debugger to one of your sensor brokers, you could set breakpoints and watch the subscription request from broker X be processed. Topic.addSubscription() would be a good place for your first breakpoint when you're failing to add the subscription, and TopicRegion.doCleanup() would be a good place for a breakpoint when you're failing to close offline durable subscribers.

Tim

On Thu, Sep 18, 2014 at 8:42 AM, michael.hart <michael.h...@arcticwolf.com> wrote:

> We are running ActiveMQ 5.10.0 with a network of brokers. For one topic
> called 'results' we have one subscriber running on a single broker 'X' and
> many (9 in staging, 24 in production and growing) brokers (let's call them
> sensors) connected in with full duplex connections, each broker with a
> producer connected and sending messages to that topic. Consumers and
> producers are all connected using stomp. brokers are connected using
> openwire.
>
> There are other queues and topics on the brokers, but let's focus on just
> this one.
>
> Until yesterday, we were using non-durable consumers, and non-persistent
> messaging. Yesterday we enabled durable consumers by setting the client-id
> on the CONNECT header, and setting activemq.subscriptionName on the
> SUBSCRIBE header. At first things seem to work, but very quickly the
> brokers on the sensors stopped forwarding messages and went into producer
> flow control.
>
> Under normal runtime, when I run "/opt/activemq/bin/activemq-admin query
> -QTopic=sensor-results" I should see one network connector section per
> sensor broker. However once we enabled durable subscribers, only a few of
> the sensor brokers would show up, even though they were actually connected
> (tcp connection was active). No amount of restarts of anything fixed it.
> The number of sensor brokers connected/subscribed was transient, between 2
> and 8 would be connected, usually around 4.
>
> Then it got worse. We rolled back our code change and went back to
> non-durable subscribers, but things did not get better. We had
> offlineDurableSubscriberTimeout and offlineDurableSubscriberTaskSchedule
> set to one minute, but the durable subscriptions did not clear out. We
> eventually shut down the entire network of brokers, deleted the kahadb
> directory on every broker, and restarted everything.
>
> It appears to us that the subscription from single broker 'X' did not make
> it to the sensor brokers, or only did sometimes, but then refused to clear
> out. We don't know how to further debug this, and would appreciate any help
> and suggestions.
>
> thanks
> mike
>
> --
> View this message in context:
> http://activemq.2283324.n4.nabble.com/durable-subscribers-causing-message-flow-to-halt-tp4685694.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
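
For anyone trying to reproduce this, the durable-subscription change described in the quoted message (client-id on CONNECT, activemq.subscriptionName on SUBSCRIBE) would look roughly like the frames built below. This is only a sketch: the client id, subscription name, host, STOMP version and ack mode are made-up placeholders, and the destination assumes the 'results' topic maps to /topic/results. It only builds the frame text and does not open a connection.

```python
def stomp_frame(command, headers, body=""):
    """Serialize one STOMP frame: command, header lines, blank line, body, NUL."""
    lines = [command] + [f"{key}:{value}" for key, value in headers.items()]
    return "\n".join(lines) + "\n\n" + body + "\x00"


# Hypothetical values, not taken from the original post.
CLIENT_ID = "broker-x-results-consumer"
SUBSCRIPTION_NAME = "results-durable"

# CONNECT carries the client-id that identifies the durable subscriber.
connect = stomp_frame("CONNECT", {
    "accept-version": "1.1",   # assumed protocol version
    "host": "localhost",       # assumed virtual host name
    "client-id": CLIENT_ID,
})

# SUBSCRIBE carries activemq.subscriptionName, which makes the topic
# subscription durable on the broker.
subscribe = stomp_frame("SUBSCRIBE", {
    "id": "0",
    "destination": "/topic/results",  # assumed mapping of the 'results' topic
    "ack": "auto",                    # assumed ack mode
    "activemq.subscriptionName": SUBSCRIPTION_NAME,
})

if __name__ == "__main__":
    # Show the frames with the trailing NUL made visible.
    print(connect.replace("\x00", "^@"))
    print(subscribe.replace("\x00", "^@"))
```

Dropping the client-id header and the activemq.subscriptionName header gives the non-durable setup that was in place before the change, which makes it easy to compare what the sensor brokers see in each case.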