Hello,

I have a replicating master-slave -pair of Artemis 2.5.0. It seems to mainly 
work as it should, but it started logging replication failures after having run 
for a week. I have included my broker configuration in [1]. There are also 
exceptions on the client side after the replication problems surfaced [2].  Why 
would replication suddenly start failing without an apparent reason? So far I 
have encountered issues with every HA strategy I have tried.

I started both nodes at 2018-05-08 and yesterday, 2018-05-14 node a, the 
master, logs (multiple times):

09:08:47,318 WARN  [org.apache.activemq.artemis.core.server] Error during 
message dispatch: java.lang.RuntimeException: 
ActiveMQIllegalStateException[errorType=ILLEGAL_STATE message=File 
126743097.msg has a null channel]
                             at 
org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl.getReadOnlyBodyBuffer(LargeServerMessageImpl.java:216)
 [artemis-server-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.protocol.openwire.OpenWireMessageConverter.toAMQMessage(OpenWireMessageConverter.java:516)
 [artemis-openwire-protocol-2.5.0.jar:]
                             at 
org.apache.activemq.artemis.core.protocol.openwire.OpenWireMessageConverter.createMessageDispatch(OpenWireMessageConverter.java:492)
 [artemis-openwire-protocol-2.5.0.jar:]
                             at 
org.apache.activemq.artemis.core.protocol.openwire.amq.AMQConsumer.handleDeliver(AMQConsumer.java:227)
 [artemis-openwire-protocol-2.5.0.jar:]
                             at 
org.apache.activemq.artemis.core.protocol.openwire.amq.AMQSession.sendMessage(AMQSession.java:312)
 [artemis-openwire-protocol-2.5.0.jar:]
                             at 
org.apache.activemq.artemis.core.server.impl.ServerConsumerImpl.deliverStandardMessage(ServerConsumerImpl.java:1099)
 [artemis-server-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.server.impl.ServerConsumerImpl.proceedDeliver(ServerConsumerImpl.java:462)
 [artemis-server-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.server.impl.QueueImpl.proceedDeliver(QueueImpl.java:2900)
 [artemis-server-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.server.impl.QueueImpl.deliver(QueueImpl.java:2374)
 [artemis-server-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.server.impl.QueueImpl.access$2000(QueueImpl.java:105)
 [artemis-server-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.server.impl.QueueImpl$DeliverRunner.run(QueueImpl.java:3173)
 [artemis-server-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
 [artemis-commons-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
 [artemis-commons-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
 [artemis-commons-2.5.0.jar:2.5.0]
                             at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[rt.jar:1.8.0_162]
                             at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[rt.jar:1.8.0_162]
                             at 
org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
 [artemis-commons-2.5.0.jar:2.5.0]
Caused by: ActiveMQIllegalStateException[errorType=ILLEGAL_STATE message=File 
126743097.msg has a null channel]
                             at 
org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.read(NIOSequentialFile.java:163)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.read(NIOSequentialFile.java:155)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl.getReadOnlyBodyBuffer(LargeServerMessageImpl.java:213)
 [artemis-server-2.5.0.jar:2.5.0]
                             ... 16 more


And node b. the slave, logs (multiple times):

14:07:43,549 WARN  [org.apache.activemq.artemis.journal] AMQ142011: Error on 
reading compacting for JournalFileImpl: (activemq-data-115.amq id = 141, 
recordID = 141)
14:07:43,572 ERROR [org.apache.activemq.artemis.journal] AMQ144003: Error 
compacting: java.lang.Exception: Error on reading compacting for 
JournalFileImpl: (activemq-data-115.amq id = 141, recordID = 141)
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl.compact(JournalImpl.java:1569)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl$14.run(JournalImpl.java:2111)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
 [artemis-commons-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
 [artemis-commons-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
 [artemis-commons-2.5.0.jar:2.5.0]
                             at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[rt.jar:1.8.0_152]
                             at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[rt.jar:1.8.0_152]
                             at 
org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
 [artemis-commons-2.5.0.jar:2.5.0]
Caused by: java.lang.Exception: Inconsistency during compacting: RollbackRecord 
ID = 126646250 for an already rolled back transaction during compacting
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl.readJournalFile(JournalImpl.java:761)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl.compact(JournalImpl.java:1566)
 [artemis-journal-2.5.0.jar:2.5.0]
                             ... 7 more
Caused by: java.lang.IllegalStateException: Inconsistency during compacting: 
RollbackRecord ID = 126646250 for an already rolled back transaction during 
compacting
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalCompactor.onReadRollbackRecord(JournalCompactor.java:349)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl.readJournalFile(JournalImpl.java:733)
 [artemis-journal-2.5.0.jar:2.5.0]
                             ... 8 more

Then when I restart node b, the master is unable to start replication at first:

08:58:28,580 WARN  [org.apache.activemq.artemis.journal] AMQ142011: Error on 
reading compacting for JournalFileImpl: (activemq-data-155.amq id = 155, 
recordID = 155)
08:58:28,607 ERROR [org.apache.activemq.artemis.journal] AMQ144003: Error 
compacting: java.lang.Exception: Error on reading compacting for 
JournalFileImpl: (activemq-data-155.amq id = 155, recordID = 155)
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl.compact(JournalImpl.java:1569)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl$12.run(JournalImpl.java:1466)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
 [artemis-commons-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
 [artemis-commons-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
 [artemis-commons-2.5.0.jar:2.5.0]
                             at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[rt.jar:1.8.0_162]
                             at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[rt.jar:1.8.0_162]
                             at 
org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
 [artemis-commons-2.5.0.jar:2.5.0]
Caused by: java.lang.Exception: Inconsistency during compacting: RollbackRecord 
ID = 133826263 for an already rolled back transaction during compacting
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl.readJournalFile(JournalImpl.java:761)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl.compact(JournalImpl.java:1566)
 [artemis-journal-2.5.0.jar:2.5.0]
                             ... 7 more
Caused by: java.lang.IllegalStateException: Inconsistency during compacting: 
RollbackRecord ID = 133826263 for an already rolled back transaction during 
compacting
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalCompactor.onReadRollbackRecord(JournalCompactor.java:349)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl.readJournalFile(JournalImpl.java:733)
 [artemis-journal-2.5.0.jar:2.5.0]
                             ... 8 more

08:58:28,608 WARN  [org.apache.activemq.artemis.core.server] AMQ222013: Error 
when trying to start replication: java.lang.RuntimeException: Error during 
compact, look at the logs
                             at 
org.apache.activemq.artemis.core.journal.impl.JournalImpl.scheduleCompactAndBlock(JournalImpl.java:1481)
 [artemis-journal-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager.startReplication(JournalStorageManager.java:553)
 [artemis-server-2.5.0.jar:2.5.0]
                             at 
org.apache.activemq.artemis.core.server.impl.SharedNothingLiveActivation$2.run(SharedNothingLiveActivation.java:178)
 [artemis-server-2.5.0.jar:2.5.0]
                             at java.lang.Thread.run(Thread.java:748) 
[rt.jar:1.8.0_162]

The replication continues normally after I restart node b a second time.

[1] Sample broker configurations 
https://github.com/ilkkavi/activemq-artemis/blob/scaledown-issue/issues/IssueExample/src/main/resources/replication/
[2] Sample of client side errors (some details omitted):

23:59:23.939 [container-245] WARN  com.atomikos.jms.AbstractJmsProxy - Error 
delegating call to createQueue on JMS driver
javax.jms.JMSException: AMQ119014: Timed out after waiting 30,000 ms for 
response when sending packet 45
                             at 
org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:399)
                             at 
org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:318)
                             at 
org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext.queueQuery(ActiveMQSessionContext.java:254)
                             at 
org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.queueQuery(ClientSessionImpl.java:344)
                             at 
org.apache.activemq.artemis.jms.client.ActiveMQSession.lookupQueue(ActiveMQSession.java:1064)
                             at 
org.apache.activemq.artemis.jms.client.ActiveMQSession.createQueue(ActiveMQSession.java:358)
...
Caused by: 
org.apache.activemq.artemis.api.core.ActiveMQConnectionTimedOutException: 
AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 45
                             ... 78 common frames omitted
23:59:23.939 [container-245] ERROR o.s.j.l.a.MessageListenerAdapter - Listener 
execution failed
org.springframework.jms.listener.adapter.ListenerExecutionFailedException: 
Listener method 'handleMessage' threw exception; nested exception is 
org.springframework.jms.UncategorizedJmsException: Uncategorized exception 
occurred during JMS processing; nested exception is javax.jms.JMSException: 
AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 45
                             at 
org.springframework.jms.listener.adapter.MessageListenerAdapter.invokeListenerMethod(MessageListenerAdapter.java:322)
                             at 
org.springframework.jms.listener.adapter.MessageListenerAdapter.onMessage(MessageListenerAdapter.java:243)
                             at java.lang.Thread.run(Thread.java:748)
Caused by: org.springframework.jms.UncategorizedJmsException: Uncategorized 
exception occurred during JMS processing; nested exception is 
javax.jms.JMSException: AMQ119014: Timed out after waiting 30,000 ms for 
response when sending packet 45

                             at 
org.springframework.jms.core.JmsTemplate.convertAndSend(JmsTemplate.java:658)
                             ...
                             ... 12 common frames omitted
Caused by: javax.jms.JMSException: AMQ119014: Timed out after waiting 30,000 ms 
for response when sending packet 45
                             at 
org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:399)
                             at 
org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:318)
                             at 
org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext.queueQuery(ActiveMQSessionContext.java:254)
                             at 
org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.queueQuery(ClientSessionImpl.java:344)
                             at 
org.apache.activemq.artemis.jms.client.ActiveMQSession.lookupQueue(ActiveMQSession.java:1064)
                             at 
org.apache.activemq.artemis.jms.client.ActiveMQSession.createQueue(ActiveMQSession.java:358)
                             at 
sun.reflect.GeneratedMethodAccessor114.invoke(Unknown Source)
                             at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                             at java.lang.reflect.Method.invoke(Method.java:498)
                             at 
com.atomikos.jms.AtomikosJmsXaSessionProxy.invoke(AtomikosJmsXaSessionProxy.java:174)
                             at com.sun.proxy.$Proxy185.createQueue(Unknown 
Source)
                             at 
org.springframework.jms.support.destination.DynamicDestinationResolver.resolveQueue(DynamicDestinationResolver.java:84)
                             at 
org.springframework.jms.support.destination.DynamicDestinationResolver.resolveDestinationName(DynamicDestinationResolver.java:58)
                             at 
org.springframework.jms.support.destination.JmsDestinationAccessor.resolveDestinationName(JmsDestinationAccessor.java:114)
                             at 
org.springframework.jms.core.JmsTemplate.access$200(JmsTemplate.java:90)
                             at 
org.springframework.jms.core.JmsTemplate$4.doInJms(JmsTemplate.java:573)
                             at 
org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:484)
                             ... 61 common frames omitted
Caused by: 
org.apache.activemq.artemis.api.core.ActiveMQConnectionTimedOutException: 
AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 45
                             ... 78 common frames omitted
23:59:58.166 [Atomikos:2596]   WARN  o.a.activemq.artemis.core.client - 
AMQ212052: Packet PACKET(SessionQueueQueryResponseMessage_V2)[type=-7, 
channelID=12, packetObject=SessionQueueQueryResponseMessage_V2, 
address=jms.queue.queue.name, name=queue.name, consumerCount=1, 
filterString=null, durable=true, exists=true, temporary=false, messageCount=0, 
autoCreationEnabled=true] was answered out of sequence due to a previous server 
timeout and it's being ignored
java.lang.Exception: trace
                             at 
org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:384)
                             at 
org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:318)
                             at 
org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext.xaEnd(ActiveMQSessionContext.java:383)
                             at 
org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.end(ClientSessionImpl.java:1173)
                             at 
com.atomikos.datasource.xa.XAResourceTransaction.suspend(XAResourceTransaction.java:253)
                             at 
com.atomikos.datasource.xa.XAResourceTransaction.rollback(XAResourceTransaction.java:448)
                             at 
com.atomikos.icatch.imp.RollbackMessage.send(RollbackMessage.java:47)
                             at 
com.atomikos.icatch.imp.RollbackMessage.send(RollbackMessage.java:20)
                             at 
com.atomikos.icatch.imp.PropagationMessage.submit(PropagationMessage.java:67)
                             at 
com.atomikos.icatch.imp.Propagator$PropagatorThread.run(Propagator.java:63)
                             at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                             at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                             at java.lang.Thread.run(Thread.java:748)

Reply via email to