After some more digging, this seems to occur only when a producer sends
two or more messages.
We have many pageviews that yield just one message in total. But a
fairly large portion of the pageviews send messages to two or more
queues (and sometimes multiple messages to those queues). When those
messages arrive at the localhost broker, all is well. But when they're
forwarded to the central broker, they are sometimes not in the same
order as they arrived. The messages that arrived last are then
suppressed.

It appears that removing auditNetworkProducers=true solves this issue.
But is it really a true solution? Or are we now masking other potential
issues?

Both the original addition of auditNetworkProducers and its removal (in
an update) are discussed here:
http://tmielke.blogspot.nl/2012/03/i-have-messages-on-queue-but-they-dont.html

Best regards,

Arjen

On 2-10-2016 13:37, Arjen van der Meijden wrote:
> Hi,
>
> We have a network of brokers using the multicast protocol. Some messages
> on that network disappear.
>
> Our application is a large php website, where we use Stomp+activemq to
> offload some of the work to allow asynchronous processing. Among those
> are notifications similar to Facebook's, but also other queues. Those
> queues vary from a few messages to over three million messages a day.
>
> In the network are 11 servers which we implicitly give certain roles:
> - Store and forward
> Each of our 9 webservers has an activemq instance. This allows saving
> some network overhead for the non-reused connections of PHP's
> stomp-client (i.e. connects to 'localhost') and provides some redundancy
> and buffering in case the central node has a problem.
>
> - Central node
> We have 2 of these of which one is 'active'. We simply connect the
> consumers using the failover protocol to these nodes, with the 'active'
> one being tried first.
>
> The nodes all run ActiveMQ 5.13.3 and use multicast for transport discovery.
>
> The flow for the messages in question is:
> 1. 'Something' happens on the website (i.e.
> a user's post is quoted)
> 2. The php-code produces a Stomp-message
> 3. The message is sent to the activemq on 'localhost' of that webserver
> 4. (Since there is no consumer there) that activemq forwards it to the
> central node
> 5. Consumed from that central node using a long running php process that
> consumes and acks each message as it arrives (i.e. does not buffer)
>
> Since users started to report bugs about this, we added logging at many
> levels. After a lot of digging it turns out those missing messages
> coincide with log messages like this on the central node's activemq.log:
>
> 2016-09-28 17:47:23,637 | WARN | suppressing duplicate message send
> [ID:panda-41468-1473163942165-83:18887261:-1:1:1] from network producer
> with producerSequence [1] less than last stored: 2 |
> org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport:
> tcp:///172.29.249.161:50026@61616
> ...
> 2016-09-28 22:14:55,310 | WARN | suppressing duplicate message send
> [ID:phobos-44763-1473074816679-26:18380536:-1:1:3] from network producer
> with producerSequence [3] less than last stored: 5 |
> org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport:
> tcp:///172.29.249.33:59068@61616
>
> In total, this seems to happen about 3000 times a day.
>
> So if those are (considered) duplicate, where are the initial ones? And
> if they're false positives, how do we prevent false positives while
> keeping support for true positives?
>
> Below is a somewhat stripped down version of our <broker>-section (no
> comments and debug stuff). All 11 servers have the same config (apart
> from the brokerName).
>
> Best regards,
>
> Arjen
>
>
> <broker xmlns="http://activemq.apache.org/schema/core"
>     brokerName="nestor" dataDirectory="${activemq.data}"
>     schedulerSupport="true">
>   <destinationPolicy>
>     <policyMap>
>       <policyEntries>
>         <policyEntry topic=">">
>           <pendingMessageLimitStrategy>
>             <constantPendingMessageLimitStrategy limit="1000"/>
>           </pendingMessageLimitStrategy>
>         </policyEntry>
>         <policyEntry queue=">" producerFlowControl="true"
>             memoryLimit="128mb" enableAudit="false">
>           <networkBridgeFilterFactory>
>             <conditionalNetworkBridgeFilterFactory
>                 replayWhenNoConsumers="true"/>
>           </networkBridgeFilterFactory>
>         </policyEntry>
>       </policyEntries>
>     </policyMap>
>   </destinationPolicy>
>
>   <networkConnectors>
>     <networkConnector
>         uri="multicast://default?group=tweakersActiveMQProduction&amp;prefetchSize=1"
>     />
>   </networkConnectors>
>
>   <persistenceAdapter>
>     <kahaDB directory="${activemq.data}/kahadb"/>
>   </persistenceAdapter>
>
>   <transportConnectors>
>     <transportConnector name="openwire"
>         uri="tcp://0.0.0.0:61616?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"
>         discoveryUri="multicast://default?group=tweakersActiveMQProduction"
>         auditNetworkProducers="true"/>
>     <transportConnector name="stomp"
>         uri="stomp://0.0.0.0:61613?transport.closeAsync=false&amp;maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
>   </transportConnectors>
> </broker>
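
For reference, the change described at the top of this message amounts to dropping one attribute from the openwire connector in the quoted config. A minimal sketch, assuming everything else stays as it is and that the audit is off when the attribute is omitted (an assumption worth checking against the ActiveMQ documentation for your version). Ampersands in the URIs are written as &amp;, as XML requires:

```xml
<transportConnectors>
  <!-- Same openwire connector as in the quoted config, but without
       auditNetworkProducers="true". Assumption: omitting the attribute
       leaves the network-producer audit disabled. -->
  <transportConnector name="openwire"
      uri="tcp://0.0.0.0:61616?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"
      discoveryUri="multicast://default?group=tweakersActiveMQProduction"/>
  <!-- The stomp connector is unchanged. -->
  <transportConnector name="stomp"
      uri="stomp://0.0.0.0:61613?transport.closeAsync=false&amp;maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
</transportConnectors>
```

Whether this is safe depends on whether real duplicates can still reach the central broker by some other path. The fragment only shows the config difference and does not answer that question.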