Hi Clark,

If you can reproduce the problem, that would be great. For now, the only actions I can see to fix or narrow down the problem are the following:
- try running our platform on only one node (of the 4 available) and see what happens
- try upgrading to ActiveMQ 5.3.2
- try changing the topology of our NoB
- collect as much information as possible... which can be quite huge ;-) (the problem occurs after 3 days...)
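For the topology change, one candidate is the dynamicOnly setting you suggested earlier in this thread. A sketch of what the BROKER1 network connector would look like with it, untested on our side and otherwise identical to our current configuration:

```xml
<networkConnectors>
  <!-- dynamicOnly="true" is the change suggested earlier in this thread; not yet tested here -->
  <networkConnector name="NoB" networkTTL="1" dynamicOnly="true"
      uri="static://(tcp://BROKER2:61616,tcp://BROKER3:61616,tcp://BROKER4:61616)"/>
</networkConnectors>
```

The same attribute would have to be added on BROKER2, BROKER3 and BROKER4, each keeping its own static list.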

Attached to this email is a periodic thread dump of our application. I put some comments in it. This file reflects the problem we are facing.

Concerning wireFormat.maxInactivityDuration, what do you recommend? Increasing it? Did we set too small a value?

Thanks for your help
denis

-> the attached file: http://old.nabble.com/file/p29136727/dm_debug_2010_07_02_08_28_19_light.txt periodic_thread_dump.txt

cobrien wrote:
>
> denis,
> Your assumptions are correct when everything is functioning as expected.
> It appears to me that some external anomalous behavior causes a
> wireFormat.maxInactivityDuration to time out or perhaps an issue with the
> Subscriber. I am going to try to reproduce the issue on my servers.
>
> -Clark
>
> www.ttmsolutions.com
> ActiveMQ reference guide at
> http://bit.ly/AMQRefGuide
>
>
> dbrondy wrote:
>>
>> Hi Cobrien, and thanks for your answer,
>>
>> In my understanding, I set networkTTL equal to 1 because we need only one
>> hop to forward a produced message to a consumer connected anywhere.
>> Let's say PubA sends on the TEST topic using BROKER1: if Sub1, Sub2, Sub3 are
>> connected to the TEST topic using BROKER2, BROKER3, BROKER4 respectively, the
>> 3 subscribers will receive it in only one hop.
>> Perhaps I'm wrong...
>>
>> I saw the option dynamicOnly=true but I haven't tested it yet. In fact,
>> to be precise, it is our application that runs into OOM after 3 or 4
>> days of normal operation.
>>
>> In my first analysis, I thought our application was misbehaving but I
>> couldn't find any clue pointing to the problem.
>>
>> The fact is that prior to the crash, the ActiveMQTransport of my
>> application gets reconnected periodically to other nodes... This is
>> strange behavior... And the ActiveMQ node logs don't say they faced a
>> network problem...
>>
>> Could it be possible to get a significant back flood of messages while
>> switching the transport from one element to another one?
>> In such a case, our application would receive a large flow of messages...
>>
>> Thanks for your feedback
>> denis
>>
>>
>> cobrien wrote:
>>>
>>> denis,
>>> if networkTTL="1", without a consumer consuming messages on BROKERn
>>> messages will just accumulate. I would try after setting
>>> dynamicOnly=true on the network connectors.
>>>
>>> -Clark
>>> PS
>>> Which broker received the OOM error?
>>>
>>> www.ttmsolutions.com
>>> ActiveMQ reference guide at
>>> http://bit.ly/AMQRefGuide
>>>
>>>
>>> dbrondy wrote:
>>>>
>>>> Hi everybody,
>>>>
>>>> We are currently using ActiveMQ 5.2 in our project and we
>>>> are glad to use this great app.
>>>>
>>>> One of our Java applications is misbehaving while receiving and
>>>> producing messages and I don't have many clues to troubleshoot the
>>>> problem.
>>>>
>>>> In fact, we are using 4 computers to run ActiveMQ brokers. The following
>>>> configuration has been implemented:
>>>>
>>>> BROKER1 :
>>>> <broker useJmx="true" persistent="false" dataDirectory="data"
>>>>     brokerName="activemq" xmlns="http://activemq.apache.org/schema/core">
>>>>   <networkConnectors>
>>>>     <networkConnector name="NoB" networkTTL="1"
>>>>         uri="static://(tcp://BROKER2:61616,tcp://BROKER3:61616,tcp://BROKER4:61616)"/>
>>>>   </networkConnectors>
>>>>   <transportConnectors>
>>>>     <transportConnector uri="tcp://BROKER1:61616"/>
>>>>   </transportConnectors>
>>>>   <managementContext>
>>>>     <managementContext connectorPort="1399"
>>>>         jmxDomainName="org.apache.activemq"/>
>>>>   </managementContext>
>>>> </broker>
>>>>
>>>> BROKER2 :
>>>> <broker useJmx="true" persistent="false" dataDirectory="data"
>>>>     brokerName="activemq" xmlns="http://activemq.apache.org/schema/core">
>>>>   <networkConnectors>
>>>>     <networkConnector name="NoB" networkTTL="1"
>>>>         uri="static://(tcp://BROKER1:61616,tcp://BROKER3:61616,tcp://BROKER4:61616)"/>
>>>>   </networkConnectors>
>>>>   <transportConnectors>
>>>>     <transportConnector uri="tcp://BROKER2:61616"/>
>>>>   </transportConnectors>
>>>>   <managementContext>
>>>>     <managementContext connectorPort="1399"
>>>>         jmxDomainName="org.apache.activemq"/>
>>>>   </managementContext>
>>>> </broker>
>>>>
>>>> BROKER3 :
>>>> <broker useJmx="true" persistent="false" dataDirectory="data"
>>>>     brokerName="activemq" xmlns="http://activemq.apache.org/schema/core">
>>>>   <networkConnectors>
>>>>     <networkConnector name="NoB" networkTTL="1"
>>>>         uri="static://(tcp://BROKER1:61616,tcp://BROKER2:61616,tcp://BROKER4:61616)"/>
>>>>   </networkConnectors>
>>>>   <transportConnectors>
>>>>     <transportConnector uri="tcp://BROKER3:61616"/>
>>>>   </transportConnectors>
>>>>   <managementContext>
>>>>     <managementContext connectorPort="1399"
>>>>         jmxDomainName="org.apache.activemq"/>
>>>>   </managementContext>
>>>> </broker>
>>>>
>>>> BROKER4 :
>>>> <broker useJmx="true" persistent="false" dataDirectory="data"
>>>>     brokerName="activemq" xmlns="http://activemq.apache.org/schema/core">
>>>>   <networkConnectors>
>>>>     <networkConnector name="NoB" networkTTL="1"
>>>>         uri="static://(tcp://BROKER1:61616,tcp://BROKER2:61616,tcp://BROKER3:61616)"/>
>>>>   </networkConnectors>
>>>>   <transportConnectors>
>>>>     <transportConnector uri="tcp://BROKER4:61616"/>
>>>>   </transportConnectors>
>>>>   <managementContext>
>>>>     <managementContext connectorPort="1399"
>>>>         jmxDomainName="org.apache.activemq"/>
>>>>   </managementContext>
>>>> </broker>
>>>>
>>>> Following is the illustrated topology :
>>>> http://old.nabble.com/file/p29107245/topology.jpg
>>>>
>>>> All our applications use Topic.
>>>> They publish and subscribe to messages
>>>> using a TopicConnectionFactory defined as follows:
>>>> failover:(tcp://BROKER1:61616?connectionTimeout=2000&soTimeout=2000&wireFormat.maxInactivityDuration=2000,tcp://BROKER2:61616?connectionTimeout=2000&soTimeout=2000&wireFormat.maxInactivityDuration=2000,tcp://BROKER3:61616?connectionTimeout=2000&soTimeout=2000&wireFormat.maxInactivityDuration=2000,tcp://BROKER4:61616?connectionTimeout=2000&soTimeout=2000&wireFormat.maxInactivityDuration=2000)?jms.useAsyncSend=true&maxReconnectDelay=2000&backup=false&useExponentialBackOff=false&maxReconnectAttempts=2
>>>>
>>>> The application causing trouble subscribes to bytes messages on Topic A,
>>>> performs dedicated processing internally and publishes object
>>>> messages on Topic B. The bytes messages posted on Topic A are created
>>>> by 6 or 7 publishers. Object messages published on Topic B are also
>>>> received by different consumers. All messages used for now are NON
>>>> PERSISTENT messages and all subscriptions are non-durable with
>>>> AUTO-ACKNOWLEDGE mode.
>>>>
>>>> After 3 or 4 days of normal work, the transport thread called "ActiveMQ
>>>> Transport: tcp://BROKER1/10.160.14.31:61616" starts being recreated and
>>>> connected to another broker. More precisely, for 3 days the
>>>> application used BROKER1 for publish/subscribe and, at a given time,
>>>> the transport thread gets recreated, passing randomly from one broker to
>>>> any of the others every 5 minutes (more or less). Remark: it
>>>> never came back to BROKER1. After a couple of switches, I see
>>>> our application accumulating a large amount of incoming messages
>>>> that it is unable to process in a timely manner. If we don't do anything,
>>>> the application will fail with a Java heap space error.
>>>>
>>>> Could it be possible to get a kind of duplicated message flooding at
>>>> the time the transport gets reconnected?
>>>> Is our BROKER configuration
>>>> suitable (topology and network connector definition)? Has anyone seen
>>>> this problem already?
>>>>
>>>> I would really appreciate any clues, ideas or recommendations.
>>>> Thanks in advance, and thanks for all the great work you do.
>>>> denis
>>>>
>>>
>>
>

--
View this message in context: http://old.nabble.com/Strange-behavior-using-failover-and-network-of-broker-tp29107245p29136727.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.