On 04/24/2015 04:50 PM, Kevin Burton wrote:
> I’ve been working 15-hour days for the last 2-3 weeks trying to resolve
> this so if this is somewhat incoherent it’s probably due to lack of sleep
> :-P
>
> I think we’re experiencing a bug in ActiveMQ which is VERY hard to
> reproduce but happens regularly in our production setup.
>
> I can’t reproduce it in my test setup because it seems to require real
> world data. Every time I try to do so everything works fine.
>
> It seems you have to have the following:
>
> - a large number of queues which need servicing (>1000)
> - a fairly large number of connections (>2000)
> - message selectors
> - a queue that has a large number of messages (5000).
>
> I have my test code now reproducing it…
>
> Everything works FINE if we have just a few messages. The problems arise
> once the queue size grows, at which point selectors don’t work.
>
> It seems like *early* connections win. If I create a connection to
> ActiveMQ early, and keep it open, it will work. But new connections don’t
> work. Eventually, the existing connections will fail too.
>
> Basically, it works JUST FINE without message selectors.
>
> I KNOW it’s not my code because I’ve written a basic/simple consumer which
> is literally just raw JMS and is < 50 lines of code.
>
> I also know my message selectors should match. First, they do match some
> percentage of the time. Second, when I consume without the message
> selectors, it works. I have it print the message headers and I can confirm
> that they should match.
>
> This also seems to get worse over time. The larger the queue, the less
> chance messages will be serviced; eventually it will just lock up entirely.
>
> There are no obvious errors in the ActiveMQ log, just messages regarding
> queue GC.
>
> The box still has about 40% memory free. So I don’t think it has any issue
> with memory. No OutOfMemoryErrors being logged.
>
> I think another way to debug this could be to restart ActiveMQ itself with
> message tracing. Then try to get the queue to this state again, and try to
> consume messages and see what’s being logged while it’s failing.
>
> What’s frustrating here is that this is the 3rd ActiveMQ workaround I’ve
> had to implement.
>
> The first was because LevelDB was very slow… (artificially slow it seems),
> so then I decided to just use the memory store. But the memory store
> doesn’t support priority, so instead, I implemented priority through JMS
> selectors. But now JMS selectors don’t work.
>
> :-/
>

This sounds a lot like the standard issue with a deep queue: the message
selector cannot match because the maxPageSize value limits what the message
cursor will page in, so messages beyond that window are never evaluated
against the selector. Have you tried raising the maxPageSize option?

See: https://issues.apache.org/jira/browse/AMQ-2217
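
To make the effect concrete, here is a rough sketch in plain Java. It is a
simplified model, not ActiveMQ's actual dispatch code: the class
PageWindowDemo, its methods, and the numbers (5000 low-priority messages
followed by 10 high-priority ones, page sizes of 200 and 6000) are all
made up for illustration. The only assumption it encodes is the one
described above, that the selector is evaluated solely against the messages
the cursor has paged in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class PageWindowDemo {

    /**
     * Simplified model (an assumption, not broker code): only the first
     * maxPageSize messages at the head of the queue are paged in, and the
     * consumer's selector is tested against that window alone.
     * Each message is reduced to a single integer "priority" header.
     */
    public static List<Integer> dispatchable(List<Integer> queue,
                                             int maxPageSize,
                                             Predicate<Integer> selector) {
        List<Integer> matched = new ArrayList<>();
        int window = Math.min(maxPageSize, queue.size());
        for (Integer priority : queue.subList(0, window)) {
            if (selector.test(priority)) {
                matched.add(priority);
            }
        }
        return matched;
    }

    /** Builds a deep queue: lowCount priority-1 messages, then highCount priority-9 messages. */
    public static List<Integer> deepQueue(int lowCount, int highCount) {
        List<Integer> queue = new ArrayList<>();
        for (int i = 0; i < lowCount; i++) {
            queue.add(1);
        }
        for (int i = 0; i < highCount; i++) {
            queue.add(9);
        }
        return queue;
    }

    public static void main(String[] args) {
        List<Integer> queue = deepQueue(5000, 10);
        // Stand-in for a JMS selector such as "priority = 9" (hypothetical header name).
        Predicate<Integer> highOnly = priority -> priority == 9;

        // Small page window: the matching messages sit behind 5000 others.
        System.out.println(dispatchable(queue, 200, highOnly).size());   // prints 0

        // Window larger than the queue depth: every message is visible.
        System.out.println(dispatchable(queue, 6000, highOnly).size());  // prints 10
    }
}
```

In this model the consumer with the selector sees nothing until the page
window is large enough to reach the matching messages, which would be
consistent with selectors working on a shallow queue and stalling as the
queue grows. In a real broker the corresponding knob is the maxPageSize
attribute on a destination policy entry.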
--
Tim Bish
Sr Software Engineer | RedHat Inc.
tim.b...@redhat.com | www.redhat.com
twitter: @tabish121
blog: http://timbish.blogspot.com/