Jerry,

Thanks again for demonstrating this. It's been fixed in the trunk.
https://issues.apache.org/jira/browse/AMQ-4222

Thanks,
Christian

On Thu, Dec 13, 2012 at 11:17 AM, Christian Posta <christian.po...@gmail.com> wrote:

> Awesome, thanks for coming back and explaining what happened there. I was
> super curious about this one.
> I still think there's room for a unit test to document this and show that
> the reference in the ProducerBrokerExchange still holds onto the "region"
> destinations. This reference could be cleared at the end of a send, since
> it's never used again; clearing it would allow the broker to reclaim the
> destination completely.
>
> I'll write the test for that and change that reference!
> Tracked here:
>
> https://issues.apache.org/jira/browse/AMQ-4222
>
> On Thu, Dec 13, 2012 at 10:17 AM, Jerry Cwiklik <cwik...@us.ibm.com> wrote:
>
>> Christian, after much pain and suffering I finally figured out what is
>> going on. Our system is quite complicated and involves many producers
>> that send large messages (600K-1.5M) to relatively few multi-threaded
>> consumers (services) which run "forever". The producers are transient
>> and can be killed by our custom job scheduler at any time via kill -9
>> to make room for other producers. We run the broker with a 10G heap.
>>
>> The consumer is coded to group and cache Sessions with a Connection
>> which has an inactivity timer associated with it. Every time a message
>> is sent, the timer is restarted. If the timer pops (default 30 minutes),
>> the Sessions, MessageProducers and the Connection are closed due to
>> inactivity.
>>
>> This worked perfectly fine until about 4 weeks ago, when we started
>> experiencing broker OOM problems. While the broker was running we could
>> see a steady (fast) rise in heap usage in JConsole. After a couple of
>> days the broker's JVM would OOM.
>>
>> The problem started happening when we introduced pingers for the
>> Consumers. Every minute a pinger sends a message to a Consumer to make
>> sure it's alive. The Consumer replies to the pinger request and restarts
>> the inactivity timer. It took me a while to see the bug in our
>> application, but eventually I determined that our timer behaves
>> incorrectly because it is associated with a Connection, not with
>> individual Sessions. The Sessions go stale due to producers getting
>> killed, and any messages in the broker referenced by the
>> ProducerExchange object are retained indefinitely, causing a leak in the
>> broker. As you explained it to me, the broker takes a lazy approach to
>> cleanup, meaning it cleans up only when a new message arrives from the
>> Producer. In our case, the Producer never sends anything and thus no
>> cleanup is ever done.
>>
>> The fix for this is to record, for every Session, a timestamp of when
>> it was last used to send a message to the broker. At fixed intervals a
>> Session Reaper thread wakes up and checks the timestamp of every Session
>> to determine whether it has been inactive for longer than the maximum
>> allowed time and, if so, closes it.
>>
>> So the problem was caused by an application bug and the fact that the
>> broker takes a lazy approach to cleanup. As a side note, under the
>> described scenario, I've noticed that the broker memory usage (shown in
>> JConsole) indicated 0 even though there were a ton of messages in the
>> heap with valid references (held by ProducerExchange).
>>
>> Thanks, Christian, for your help.
>>
>> -Jerry C
>>
>> --
>> View this message in context:
>> http://activemq.2283324.n4.nabble.com/Broker-Leak-tp4660437p4660618.html
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>
> --
> *Christian Posta*
> http://www.christianposta.com/blog
> twitter: @christianposta

--
*Christian Posta*
http://www.christianposta.com/blog
twitter: @christianposta
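
For anyone reading this thread in the archive, the Session Reaper fix Jerry describes (a last-used timestamp per Session plus a thread that periodically closes idle ones) might look roughly like the sketch below. This is not Jerry's actual code. The class name `SessionReaper` and its methods are hypothetical, and `AutoCloseable` stands in for the JMS `Session` so the example compiles without a JMS jar on the classpath.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: tracks when each session was last used and closes idle ones. */
final class SessionReaper implements AutoCloseable {
    // Last time (epoch millis) each tracked session was used to send a message.
    private final Map<AutoCloseable, Long> lastUsed = new ConcurrentHashMap<>();
    private final long maxIdleMillis;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "session-reaper");
                t.setDaemon(true); // do not keep the JVM alive just for the reaper
                return t;
            });

    SessionReaper(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Call on every send: records that the session was used just now. */
    void touch(AutoCloseable session) {
        touch(session, System.currentTimeMillis());
    }

    /** Same as above with an explicit clock value, which keeps the logic testable. */
    void touch(AutoCloseable session, long nowMillis) {
        lastUsed.put(session, nowMillis);
    }

    /** Starts the periodic sweep at a fixed interval. */
    void start(long intervalMillis) {
        scheduler.scheduleWithFixedDelay(
                () -> reap(System.currentTimeMillis()),
                intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** One sweep: closes every session idle for longer than the limit; returns how many. */
    int reap(long nowMillis) {
        int closed = 0;
        for (Map.Entry<AutoCloseable, Long> e : lastUsed.entrySet()) {
            Long seen = e.getValue();
            boolean idle = nowMillis - seen > maxIdleMillis;
            // Conditional remove: skip the session if touch() refreshed it in the meantime.
            if (idle && lastUsed.remove(e.getKey(), seen)) {
                try {
                    e.getKey().close();
                } catch (Exception ex) {
                    // A failed close must not stop the sweep; real code would log this.
                }
                closed++;
            }
        }
        return closed;
    }

    /** Number of sessions currently being tracked. */
    int trackedCount() {
        return lastUsed.size();
    }

    /** Stops the reaper thread; tracked sessions are left open. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In an application, the sending code would call `touch(session)` before each send and `start(...)` once at startup. The map key must be the same object on every call, so with a JMS 1.1 `Session` (which is not `AutoCloseable`) one wrapper per session would have to be created once and reused. The sketch also does not guard against a sender using a session at the exact moment the reaper closes it; real code would need to handle that.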