Jerry,

Thanks again for demonstrating this.
It's been fixed in the trunk.

https://issues.apache.org/jira/browse/AMQ-4222

Thanks,
Christian


On Thu, Dec 13, 2012 at 11:17 AM, Christian Posta <christian.po...@gmail.com> wrote:

> Awesome, thanks for coming back and explaining what happened there. I was
> super curious about this one.
> I still think there's room for a unit test to document this and show that
> the reference in the ProducerBrokerExchange still holds onto the "region"
> destinations. This reference could be cleared at the end of a send since
> it's never used again and would allow the broker to reclaim the destination
> completely.
>
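A toy model of the idea above, assuming nothing about the real broker code: `ToyDestination`, `ToyExchange` and `ToyBroker` are hypothetical stand-ins, not ActiveMQ classes. It only shows the shape of the change, which is to drop the cached destination reference once the send completes so that an idle producer cannot pin the destination in memory.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy destination that only counts the messages it has received. */
class ToyDestination {
    int messageCount;
}

/** Toy per-producer state; a stand-in for the idea of an exchange object. */
class ToyExchange {
    ToyDestination regionDestination; // only needed while a send is in progress
}

/** Toy broker that looks up the destination, delivers, then clears the reference. */
class ToyBroker {
    private final Map<String, ToyDestination> destinations = new HashMap<>();

    void send(ToyExchange exchange, String destinationName) {
        exchange.regionDestination =
                destinations.computeIfAbsent(destinationName, name -> new ToyDestination());
        try {
            exchange.regionDestination.messageCount++;
        } finally {
            // Clear the reference so the exchange no longer keeps the destination reachable.
            exchange.regionDestination = null;
        }
    }

    int count(String destinationName) {
        ToyDestination d = destinations.get(destinationName);
        return d == null ? 0 : d.messageCount;
    }
}
```

A unit test for the real change would presumably assert the same thing: after a send returns, the exchange holds no destination.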
> I'll write the test for that and change that reference!!
> Tracked here:
>
> https://issues.apache.org/jira/browse/AMQ-4222
>
>
>
> On Thu, Dec 13, 2012 at 10:17 AM, Jerry Cwiklik <cwik...@us.ibm.com> wrote:
>
>> Christian, after much pain and suffering I finally figured out what is
>> going on. Our system is quite complicated and involves many producers
>> that send large messages (600K-1.5M) to relatively few multi-threaded
>> consumers (services) which run "forever". The producers are transient and
>> can be killed by our custom job scheduler at any time via kill -9 to make
>> room for other producers. We run the broker with a 10G heap.
>>
>> The consumer is coded to group and cache Sessions with a Connection which
>> has an inactivity timer associated with it. Every time a message is sent,
>> the timer is restarted. If the timer pops (default 30 minutes), the
>> Sessions, MessageProducers and the Connection are closed due to
>> inactivity.
>>
>> This worked perfectly fine until about 4 weeks ago, when we started
>> experiencing broker OOM problems. While the broker was running we could
>> see a steady (fast) rise in heap usage in JConsole. After a couple of
>> days the broker's JVM would OOM.
>>
>> The problem started happening when we introduced pingers for the
>> Consumers. Every minute a pinger sends a message to a Consumer to make
>> sure it's alive. The Consumer replies to the pinger request and restarts
>> the inactivity timer. It took me a while to see the bug in our
>> application, but eventually I determined that our timer behaves
>> incorrectly because it is associated with a Connection, not with
>> individual Sessions. The Sessions go stale when a producer gets killed,
>> and any messages in the broker referenced by the ProducerExchange object
>> are retained indefinitely, causing a leak in the broker. As you explained
>> it to me, the broker uses a lazy approach to cleanup, meaning it cleans
>> up on a new message from the Producer. In our case, the Producer never
>> sends anything and thus no cleanup is ever done.
>>
>> The fix for this is to keep a timestamp for every Session recording when
>> it was last used to send a message to the broker. At fixed intervals a
>> Session Reaper thread wakes up and checks the timestamp of every Session
>> to determine whether it has been inactive for the maximum allowed time
>> and, if so, closes it.
>>
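A minimal sketch of the reaper described above, not the code from the application in question. `SessionReaper`, `touch`, `reap` and `start` are hypothetical names. The session type is left generic with a caller-supplied close action so the sketch compiles without a JMS library on the classpath; with JMS, the close action would wrap `Session.close()`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper: remembers when each session was last used and closes idle ones. */
class SessionReaper<S> {
    private final Map<S, Long> lastUsedMillis = new ConcurrentHashMap<>();
    private final long maxIdleMillis;
    private final Consumer<S> closer; // assumption: caller supplies how to close a session

    SessionReaper(long maxIdleMillis, Consumer<S> closer) {
        this.maxIdleMillis = maxIdleMillis;
        this.closer = closer;
    }

    /** Record that the session was just used to send a message. */
    void touch(S session, long nowMillis) {
        lastUsedMillis.put(session, nowMillis);
    }

    /** Close every session idle for at least maxIdleMillis; returns how many were closed. */
    int reap(long nowMillis) {
        int closed = 0;
        for (Map.Entry<S, Long> entry : lastUsedMillis.entrySet()) {
            boolean idle = nowMillis - entry.getValue() >= maxIdleMillis;
            // The two-argument remove fails if the session was touched again meanwhile.
            if (idle && lastUsedMillis.remove(entry.getKey(), entry.getValue())) {
                closer.accept(entry.getKey());
                closed++;
            }
        }
        return closed;
    }

    /** Run reap() at a fixed interval on a daemon thread. */
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "session-reaper");
            t.setDaemon(true);
            return t;
        });
        timer.scheduleAtFixedRate(() -> reap(System.currentTimeMillis()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

Tracking idleness per Session rather than per Connection is the point: a ping on one Session no longer keeps the stale ones alive. Passing the clock value into `touch` and `reap` keeps the idle check easy to test.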
>> So the problem was caused by an application bug and the fact that the
>> broker takes a lazy approach to cleanup. As a side note, under the
>> described scenario, I've noticed that the broker memory usage (shown in
>> JConsole) indicated 0 even though there were a ton of messages in the
>> heap with valid references (held by ProducerExchange).
>>
>> Thanks Christian for your help
>>
>> -Jerry C
>>
>>
>>
>
>
>
> --
> *Christian Posta*
> http://www.christianposta.com/blog
> twitter: @christianposta
>
>

