On 28 April 2015 at 14:09, Tim Bain <tb...@alumni.duke.edu> wrote:

> On Apr 28, 2015 3:21 AM, "James Green" <james.mk.gr...@gmail.com> wrote:
> >
> > So to re-work my understanding, the pre-fetch buffer in the connected
> > client is filled with messages and the broker's queue counters react as if
> > this was not the case (else we'd see 1,000 drop from the pending queue
> > extremely quickly, and slowly get dequeued, which we do not see).
>
> Pretty much right, though I wouldn't say the broker's counters react as if
> this was not the case; rather, the broker's dispatch counter increases
> immediately but the dequeue counter won't increase until the broker removes
> the message, and that won't happen until the consumer acks it. Until that
> happens, the message exists in both places and the counters reflect that.
>
> It sounds like you're observing the broker through the web console; there's
> WAY more information available through the JMX beans and you'll understand
> this better by watching them instead of the web console. So I highly
> recommend firing up JConsole and looking at the JMX beans.
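(For reference, the counters Tim describes are attributes on the queue's MBean.
A rough, untested sketch of reading them remotely over plain JMX follows; the
broker name "localhost", the host/port in the URL, and the queue name ORDERS.IN
are placeholders, and the object-name layout assumes ActiveMQ 5.8 or later.)

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class QueueCounters {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port; a default ActiveMQ JMX connector listens on 1099.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker-host:1099/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // brokerName and destinationName are placeholders.
            ObjectName queue = new ObjectName(
                    "org.apache.activemq:type=Broker,brokerName=localhost,"
                    + "destinationType=Queue,destinationName=ORDERS.IN");
            // DispatchCount rises as soon as messages land in a consumer's
            // prefetch buffer; DequeueCount only rises once the consumer acks.
            System.out.println("QueueSize     = " + mbs.getAttribute(queue, "QueueSize"));
            System.out.println("DispatchCount = " + mbs.getAttribute(queue, "DispatchCount"));
            System.out.println("InFlightCount = " + mbs.getAttribute(queue, "InFlightCount"));
            System.out.println("DequeueCount  = " + mbs.getAttribute(queue, "DequeueCount"));
        } finally {
            connector.close();
        }
    }
}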
We're going through a firewall, which pretty much rules JMX out - we've had
plenty of bad experiences with that scenario. We tried Hawt.io, but it does not
work properly against a remote broker: the queues appear but cannot be browsed,
and the health tab is empty. Others report the same.

> I agree, I don't understand that, particularly because even if the broker
> was so loaded down that you were hitting that timeout, I don't see how that
> would result in a failed delivery attempt. Your receive() call would just
> return null and Camel would just call receive() again and everything would
> be fine. (This is exactly what happens when there aren't any messages on
> the queue, and nothing bad happens then.) So my gut reaction is that the
> timeout is a red herring and something else is going on. Have you switched
> that setting between the two values while playing identical messages
> (either generated or recorded) to be sure that that setting really is the
> cause of this behavior?

We have not, mainly because we haven't spent the time recording the individual
messages for individual playback.

> Also, when messages are failing, do all of them fail? If it's only some of
> them, what's the common thread?

We get bursts into the DLQ, which we've put down to possible machine loading
issues at the time. Nothing has been recorded there since the time-out change.
The messages that were in the DLQ were consumed fine when we reconfigured the
app to read from there, so it's not that individual messages will always fail,
either.

James
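(For completeness, the Camel pieces under discussion look roughly like the
sketch below. The ORDERS.IN queue and the orderHandler bean are placeholders,
not anything taken from the thread; receiveTimeout is the Camel JMS polling
timeout mentioned above, and ActiveMQ.DLQ is the broker's default dead-letter
queue.)

import org.apache.camel.builder.RouteBuilder;

public class OrderRoutes extends RouteBuilder {
    @Override
    public void configure() {
        // Normal consumer. receiveTimeout is the polling timeout: when it
        // expires with no message, receive() returns null and the consumer
        // simply polls again - it is not a failed delivery attempt.
        from("activemq:queue:ORDERS.IN?receiveTimeout=1000")
            .to("bean:orderHandler");

        // Temporary replay route of the kind described above: point the same
        // handler at the broker's default dead-letter queue to re-consume the
        // messages that previously went there.
        // from("activemq:queue:ActiveMQ.DLQ")
        //     .to("bean:orderHandler");
    }
}

Flipping receiveTimeout between the two values while replaying a recorded batch
of messages, as Tim suggests, would confirm whether that setting has any bearing
on the DLQ bursts.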