Hi Clark,

It would be great if you can reproduce the problem.
For now, the only actions I can see to fix or narrow down the problem
are the following:
- try running our platform on only one node (of the 4 available) and
see what happens
- try upgrading to ActiveMQ 5.3.2
- try changing the topology of our NoB
- collect as much information as possible... which can be quite a lot ;-)
(the problem only occurs after 3 days...)

Attached to this email is a periodic thread dump of our application. I put
some comments in it. This file reflects the problem we are facing.

Concerning wireFormat.maxInactivityDuration, what do you recommend?
Should we increase it? Did we set too small a value?
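
To make the question concrete, here is a rough sketch of what a larger value
could look like on the client side. It assumes the connection factory is
declared as a Spring bean (the bean id is made up for illustration), and
30000 ms is only ActiveMQ's documented default for
wireFormat.maxInactivityDuration, not a tested recommendation. Only two of
the four brokers are listed to keep it short.

```xml
<!-- Hypothetical Spring bean; the id is illustrative only. -->
<!-- '&' is written as '&amp;' because the URI sits inside an XML attribute. -->
<bean id="topicConnectionFactory"
      class="org.apache.activemq.ActiveMQConnectionFactory">
  <property name="brokerURL"
            value="failover:(tcp://BROKER1:61616?connectionTimeout=2000&amp;soTimeout=2000&amp;wireFormat.maxInactivityDuration=30000,tcp://BROKER2:61616?connectionTimeout=2000&amp;soTimeout=2000&amp;wireFormat.maxInactivityDuration=30000)?jms.useAsyncSend=true&amp;maxReconnectDelay=2000"/>
</bean>
```
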
Thanks for your help
denis

-> the attached file (periodic_thread_dump.txt):
http://old.nabble.com/file/p29136727/dm_debug_2010_07_02_08_28_19_light.txt




cobrien wrote:
> 
> denis,
> Your assumptions are correct when everything is functioning as expected.
> It appears to me that some external anomalous behavior causes
> wireFormat.maxInactivityDuration to time out, or perhaps there is an
> issue with the Subscriber. I am going to try to reproduce the issue on
> my servers.
> 
>  -Clark 
> 
> 
> www.ttmsolutions.com 
> ActiveMQ reference guide at 
> http://bit.ly/AMQRefGuide  
> 
> 
> 
> 
> dbrondy wrote:
>> 
>> 
>> Hi Cobrien, and thanks for your answer,
>> 
>> As I understand it, I set networkTTL to 1 because we need only one
>> hop to forward a produced message to a consumer connected anywhere.
>> Let's say PubA sends on the TEST topic using BROKER1: if Sub1, Sub2, Sub3
>> are connected to the TEST topic using BROKER2, BROKER3, BROKER4
>> respectively, the 3 subscribers will receive it in only one hop.
>> Perhaps I'm wrong...
>> 
>> I have seen the dynamicOnly=true option but I have not tested it yet. In
>> fact, to be precise, it is our application that runs into OOM after 3 or
>> 4 days of normal operation.
>> 
>> At first, I thought our application was misbehaving, but I
>> couldn't find any clue that would pinpoint the problem.
>> 
>> The fact is that, prior to the crash, the ActiveMQ Transport of my
>> application gets reconnected periodically to other nodes... This is
>> strange behavior. And the logs of the ActiveMQ nodes do not report any
>> network problem.
>> 
>> Is it possible that a large backlog of messages is flooded to the client
>> when the transport switches from one broker to another? In that case,
>> our application would receive a large flow of messages...
>> 
>> Thanks for your feedback
>> denis
>> 
>> 
>> 
>> cobrien wrote:
>>> 
>>> denis, 
>>> if networkTTL="1", then without a consumer consuming messages on BROKERn,
>>> messages will just accumulate. I would try again after setting
>>> dynamicOnly=true on the network connectors.
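>>> 
>>> A sketch of that change, applied to the BROKER1 connector from the
>>> configuration quoted below (same name, TTL and URI; only the
>>> dynamicOnly attribute is added, and the result is untested):
>>> 
>>> ```xml
>>> <!-- dynamicOnly="true" is the only change from the original connector -->
>>> <networkConnector name="NoB" networkTTL="1" dynamicOnly="true"
>>>     uri="static://(tcp://BROKER2:61616,tcp://BROKER3:61616,tcp://BROKER4:61616)"/>
>>> ```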
>>> 
>>>  -Clark 
>>> PS
>>> Which broker received the OOM error?
>>> 
>>> www.ttmsolutions.com 
>>> ActiveMQ reference guide at 
>>> http://bit.ly/AMQRefGuide  
>>> 
>>> 
>>> 
>>> 
>>> dbrondy wrote:
>>>> 
>>>> Hi everybody,
>>>> 
>>>> We are currently using ActiveMQ 5.2 in our project and we
>>>> are glad to use this great application.
>>>> 
>>>> One of our Java applications is misbehaving while receiving and
>>>> producing messages, and I don't have many clues to troubleshoot the
>>>> problem.
>>>> 
>>>> We are using 4 computers to run ActiveMQ brokers. The following
>>>> configuration has been implemented:
>>>> 
>>>> BROKER1 : 
>>>> <broker useJmx="true" persistent="false" dataDirectory="data"
>>>> brokerName="activemq" xmlns="http://activemq.apache.org/schema/core">
>>>> <networkConnectors>
>>>> <networkConnector name="NoB" networkTTL="1"
>>>> uri="static://(tcp://BROKER2:61616,tcp://BROKER3:61616,tcp://BROKER4:61616)"/>
>>>> </networkConnectors>
>>>> <transportConnectors>
>>>> <transportConnector uri="tcp://BROKER1:61616"/>
>>>> </transportConnectors>
>>>> <managementContext>
>>>> <managementContext connectorPort="1399"
>>>> jmxDomainName="org.apache.activemq"/>
>>>> </managementContext>
>>>> </broker>
>>>> 
>>>> BROKER2 : 
>>>> <broker useJmx="true" persistent="false" dataDirectory="data"
>>>> brokerName="activemq" xmlns="http://activemq.apache.org/schema/core">
>>>> <networkConnectors>
>>>> <networkConnector name="NoB" networkTTL="1"
>>>> uri="static://(tcp://BROKER1:61616,tcp://BROKER3:61616,tcp://BROKER4:61616)"/>
>>>> </networkConnectors>
>>>> <transportConnectors>
>>>> <transportConnector uri="tcp://BROKER2:61616"/>
>>>> </transportConnectors>
>>>> <managementContext>
>>>> <managementContext connectorPort="1399"
>>>> jmxDomainName="org.apache.activemq"/>
>>>> </managementContext>
>>>> </broker>
>>>> 
>>>> BROKER3 : 
>>>> <broker useJmx="true" persistent="false" dataDirectory="data"
>>>> brokerName="activemq" xmlns="http://activemq.apache.org/schema/core">
>>>> <networkConnectors>
>>>> <networkConnector name="NoB" networkTTL="1"
>>>> uri="static://(tcp://BROKER1:61616,tcp://BROKER2:61616,tcp://BROKER4:61616)"/>
>>>> </networkConnectors>
>>>> <transportConnectors>
>>>> <transportConnector uri="tcp://BROKER3:61616"/>
>>>> </transportConnectors>
>>>> <managementContext>
>>>> <managementContext connectorPort="1399"
>>>> jmxDomainName="org.apache.activemq"/>
>>>> </managementContext>
>>>> </broker>
>>>> 
>>>> BROKER4 : 
>>>> <broker useJmx="true" persistent="false" dataDirectory="data"
>>>> brokerName="activemq" xmlns="http://activemq.apache.org/schema/core">
>>>> <networkConnectors>
>>>> <networkConnector name="NoB" networkTTL="1"
>>>> uri="static://(tcp://BROKER1:61616,tcp://BROKER2:61616,tcp://BROKER3:61616)"/>
>>>> </networkConnectors>
>>>> <transportConnectors>
>>>> <transportConnector uri="tcp://BROKER4:61616"/>
>>>> </transportConnectors>
>>>> <managementContext>
>>>> <managementContext connectorPort="1399"
>>>> jmxDomainName="org.apache.activemq"/>
>>>> </managementContext>
>>>> </broker>
>>>> 
>>>> The topology is illustrated here:
>>>>  http://old.nabble.com/file/p29107245/topology.jpg 
>>>> 
>>>> All our applications use topics. They publish and subscribe to messages
>>>> using a TopicConnectionFactory defined as follows:
>>>> failover:(tcp://BROKER1:61616?connectionTimeout=2000&soTimeout=2000&wireFormat.maxInactivityDuration=2000,tcp://BROKER2:61616?connectionTimeout=2000&soTimeout=2000&wireFormat.maxInactivityDuration=2000,tcp://BROKER3:61616?connectionTimeout=2000&soTimeout=2000&wireFormat.maxInactivityDuration=2000,tcp://BROKER4:61616?connectionTimeout=2000&soTimeout=2000&wireFormat.maxInactivityDuration=2000)?jms.useAsyncSend=true&maxReconnectDelay=2000&backup=false&useExponentialBackOff=false&maxReconnectAttempts=2
>>>> 
>>>> The application causing trouble subscribes to bytes messages on Topic A,
>>>> performs dedicated processing internally and publishes object
>>>> messages on Topic B. The bytes messages posted on Topic A are created
>>>> by 6 or 7 publishers. Object messages published on Topic B are also
>>>> received by different consumers. All messages used for now are
>>>> non-persistent and all subscriptions are non-durable with
>>>> AUTO_ACKNOWLEDGE mode.
>>>> 
>>>> After 3 or 4 days of normal operation, the transport thread called
>>>> "ActiveMQ Transport: tcp://BROKER1/10.160.14.31:61616" starts being
>>>> recreated and connected to another broker. More precisely, for 3 days
>>>> the application uses BROKER1 for publish/subscribe and then, at some
>>>> point, the transport thread gets recreated, moving randomly from one
>>>> broker to another among the other brokers roughly every 5 minutes.
>>>> Note: it never reconnects to BROKER1. After a couple of switches, I see
>>>> our application retaining a large number of incoming messages
>>>> which cannot be processed in a timely manner. If we don't do anything,
>>>> the application fails with a Java heap space error.
>>>> 
>>>> Is it possible to get some kind of duplicate message flooding at
>>>> the time the transport gets reconnected? Is our broker configuration
>>>> suitable (topology and network connector definition)? Has anyone seen
>>>> this problem before?
>>>> 
>>>> I would really appreciate any clues, ideas or recommendations.
>>>> Thanks in advance, and thanks for all the great work you do.
>>>> denis
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Strange-behavior-using-failover-and-network-of-broker-tp29107245p29136727.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
