https://activemq.apache.org/artemis/docs/latest/network-isolation.html
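
A minimal sketch of the network check settings that page describes, to go
inside the <core> element of broker.xml on each broker. The address 10.0.0.1
is a placeholder and does not come from this thread; use a device both brokers
should always be able to reach (a gateway, for example). The period and
timeout values shown are what I believe the documented defaults to be, so
check the element names and values against the docs for your Artemis version.

```xml
<core>
   <!-- Hypothetical address: a device this broker must be able to ping
        in order to consider its own network connection healthy. -->
   <network-check-list>10.0.0.1</network-check-list>
   <!-- How often to run the check, in milliseconds (assumed default). -->
   <network-check-period>10000</network-check-period>
   <!-- How long to wait for each ping, in milliseconds (assumed default). -->
   <network-check-timeout>1000</network-check-timeout>
</core>
```

With this in place, a broker that cannot reach the listed address shuts itself
down until the network is back, rather than staying or becoming live while
isolated.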

> On 22 Sep 2017, at 19:41, Michael André Pearce <michael.andre.pea...@me.com> wrote:
> 
> I am assuming you had a temporary network fault, which meant the slave and
> master could not talk to each other.
> 
> Have you configured the network pinger? When network issues could cause a
> split brain (master and slave cannot talk to each other), each node also
> pings another device on the network. The idea is that one of the nodes
> would fail that ping, which helps avoid the split-brain scenario.
> 
> 
> Cheers
> Mike 
> 
>> On 22 Sep 2017, at 17:49, boris_snp <boris.godu...@spglobal.com> wrote:
>> 
>> I have to restart my two-broker cluster on a daily basis due to the
>> following sequence of events:
>> -----------------------------------------------------------------------------------------------
>> master
>> 04:51:14,501    AMQ212037: Connection failure has been detected: AMQ119014: Did
>> not receive data from /10.202.147.99:58739 within the 60,000ms connection
>> TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
>> 04:51:14,510    AMQ222092: Connection to the backup node failed, removing
>> replication now:
>> ActiveMQConnectionTimedOutException[errorType=CONNECTION_TIMEDOUT
>> message=AMQ119014: Did not receive data from /10.202.147.99:58739 within the
>> 60,000ms connection TTL. The connection will now be closed.]
>> 04:51:24,517    AMQ212041: Timed out waiting for netty channel to close
>> 04:51:24,517    AMQ212037: Connection failure has been detected: AMQ119014: Did
>> not receive data from /10.202.147.99:58738 within the 60,000ms connection
>> TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
>> -----------------------------------------------------------------------------------------------
>> slave
>> 04:51:42,306    AMQ212037: Connection failure has been detected: AMQ119011: Did not receive
>> data from server for
>> org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnection@1c54a4bc[local=
>> /10.202.147.99:58738, remote=nj09mhf0681/10.202.147.99:41410]
>> [code=CONNECTION_TIMEDOUT]
>> 04:51:42,316    AMQ212037: Connection failure has been detected: AMQ119011: Did not receive
>> data from server for
>> org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnection@65ace922[local=
>> /10.202.147.99:58739, remote=nj09mhf0681/10.202.147.99:41410]
>> [code=CONNECTION_TIMEDOUT]
>> 04:51:46,955    AMQ221037:
>> ActiveMQServerImpl::serverUUID=7ffa29a0-7c48-11e7-9784-e83935127b09 to
>> become 'live'
>> 04:51:59,360    AMQ221014: 40% loaded
>> 04:52:01,854    AMQ221014: 81% loaded
>> 04:52:03,037    AMQ222028: Could not find page cache for page PagePositionImpl
>> [pageNr=8, messageNr=-1, recordID=8662153341] removing it from the journal
>> 04:52:03,051    AMQ222028: Could not find page cache for page PagePositionImpl
>> [pageNr=13, messageNr=-1, recordID=8662204094] removing it from the journal
>> 04:52:03,208    AMQ221003: Deploying queue jms.queue.DLQ
>> 04:52:03,281    AMQ221003: Deploying queue jms.queue.ExpiryQueue
>> 04:52:03,827    AMQ212034: There are more than one servers on the network
>> broadcasting the same node id.
>> -----------------------------------------------------------------------------------------------
>> master
>> 04:52:03,827    AMQ212034: There are more than one servers on the network
>> broadcasting the same node id.
>> -----------------------------------------------------------------------------------------------
>> slave
>> 04:52:03,910    AMQ221007: Server is now live
>> 04:52:04,003    AMQ221020: Started Acceptor at nj09mhf0681:41411 for protocols
>> [CORE,MQTT,AMQP,STOMP,HORNETQ,OPENWIRE]
>> 04:52:11,949    AMQ212034: There are more than one servers on the network
>> broadcasting the same node id.
>> -----------------------------------------------------------------------------------------------
>> I understand that at some point the master (currently live) loses the
>> slave and closes the connection to it.
>> The slave (currently the backup) in turn detects that the master is not
>> present and becomes live. Now both brokers are live, and they do not
>> recover to normal until restarted.
>> How can I avoid this? I would appreciate any help.
>> Thank you.
>> 
>> 
>> 
>> --
>> Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
