Awesome, thanks for coming back and explaining what happened there. I was
super curious about this one.
I still think there's room for a unit test to document this and show that
the reference in the ProducerBrokerExchange still holds onto the "region"
destinations. This reference could be cleared at the end of a send, since
it's never used again; clearing it would allow the broker to reclaim the
destination completely.
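
A toy model of that idea, for illustration only. ExchangeSketch and its
methods are hypothetical stand-ins, not the actual ProducerBrokerExchange
code; the only point is that the cached destination is dropped in a finally
block once the send completes:

```java
/** Hypothetical stand-in for a per-producer exchange; NOT the ActiveMQ class. */
class ExchangeSketch {
    // Destination cached for the send currently in progress (assumed field).
    private Object regionDestination;

    Object getRegionDestination() {
        return regionDestination;
    }

    /** Caches the destination for the duration of the send, then clears it. */
    void send(Object destination, Runnable dispatch) {
        regionDestination = destination;
        try {
            dispatch.run();
        } finally {
            // Nothing reads the reference after the send, so drop it here
            // instead of waiting for the producer's next message.
            regionDestination = null;
        }
    }
}
```

With this shape, a producer that goes quiet (or is killed) no longer pins
the destination through its exchange.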

I'll write the test for that and change that reference!!
Tracked here:

https://issues.apache.org/jira/browse/AMQ-4222



On Thu, Dec 13, 2012 at 10:17 AM, Jerry Cwiklik <cwik...@us.ibm.com> wrote:

> Christian, after much pain and suffering I finally figured out what is
> going on. Our system is quite complicated and involves many producers that
> send large messages (600K-1.5M) to relatively few multi-threaded consumers
> (services) which run "forever". The producers are transient and can be
> killed by our custom job scheduler at any time via kill -9 to make room for
> other producers. We run the broker with a 10G heap.
>
> The consumer is coded to group and cache Sessions with a Connection which
> has an inactivity timer associated with it. Every time a message is sent,
> the timer is restarted. If the timer pops (default 30 minutes), the
> Sessions, MessageProducers and the Connection are closed due to inactivity.
>
> This worked perfectly fine until about 4 weeks ago, when we started
> experiencing broker OOM problems. While the broker was running we could see
> a steady (fast) rise in heap usage in JConsole. After a couple of days
> the broker's JVM would OOM.
>
> The problem started happening when we introduced pingers for the Consumers.
> Every minute a pinger sends a message to a Consumer to make sure it's alive.
> The Consumer replies to the pinger request and restarts the inactivity
> timer. It took me a while to see the bug in our application, but eventually
> I determined that our timer behaves incorrectly because it is associated
> with a Connection, not with individual Sessions. The Sessions go stale when
> a producer gets killed, and any messages in the broker referenced by a
> ProducerExchange object are retained indefinitely, causing a leak in the
> broker. As you explained it to me, the broker takes a lazy approach to
> cleanup, meaning it cleans up on a new message from the Producer. In our
> case, the Producer never sends anything and thus no cleanup is ever done.
>
> The fix for this is to record a timestamp for every Session of when it was
> last used to send a message to the broker. At fixed intervals a Session
> Reaper thread wakes up and checks the timestamp of every Session to
> determine whether it has been inactive for longer than the maximum allowed
> time and, if so, closes it.
>
> So the problem was caused by an application bug and the fact that the
> broker takes a lazy approach to cleanup. As a side note, under the
> described scenario, I've noticed that the broker memory usage (shown in
> JConsole) indicated 0 even though there were a ton of messages in the heap
> with valid references (held by ProducerExchange).
>
> Thanks Christian for your help
>
> -Jerry C
>
> --
> View this message in context:
> http://activemq.2283324.n4.nabble.com/Broker-Leak-tp4660437p4660618.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>
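
For anyone finding this thread later, the Session Reaper fix described in
the quoted message could be sketched roughly as below. This is a minimal
sketch under assumptions, not Jerry's actual code: SessionReaper, touch(),
start() and reap() are hypothetical names, and AutoCloseable stands in for
a JMS Session so the example compiles without the JMS API on the classpath.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: per-session last-used timestamps plus a periodic reaper. */
class SessionReaper implements AutoCloseable {
    // session -> time (ms) it was last used to send a message
    private final Map<AutoCloseable, Long> lastUsed = new ConcurrentHashMap<>();
    private final long maxIdleMillis;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "session-reaper");
                t.setDaemon(true); // do not keep the JVM alive for the reaper
                return t;
            });

    SessionReaper(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Call on every send through the given session. */
    void touch(AutoCloseable session) {
        touch(session, System.currentTimeMillis());
    }

    /** Same as above with an explicit clock value (handy for tests). */
    void touch(AutoCloseable session, long nowMillis) {
        lastUsed.put(session, nowMillis);
    }

    /** Wake up at a fixed interval and close idle sessions. */
    void start(long periodMillis) {
        scheduler.scheduleAtFixedRate(() -> reap(System.currentTimeMillis()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Closes every session idle longer than maxIdleMillis; returns how many. */
    int reap(long nowMillis) {
        int closed = 0;
        for (Map.Entry<AutoCloseable, Long> e : lastUsed.entrySet()) {
            Long stamp = e.getValue();
            // remove(key, value) only succeeds if nobody touched the session
            // since we read its timestamp, so an active session is not closed.
            if (nowMillis - stamp > maxIdleMillis && lastUsed.remove(e.getKey(), stamp)) {
                try {
                    e.getKey().close();
                } catch (Exception ignored) {
                    // best effort: one failing close must not stop the reaper
                }
                closed++;
            }
        }
        return closed;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In a real consumer the map would be keyed on the JMS Session itself and the
reaper would call Session.close(); the important part is that idleness is
tracked per Session rather than per Connection, so pings on one Session no
longer keep stale ones alive.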



-- 
*Christian Posta*
http://www.christianposta.com/blog
twitter: @christianposta
