There were 29 httpd processes running when the websites crashed.

One thing that was clear from the thread dumps was that there were lots of
entries for org.jasig.cas.ticket.registry.support.DefaultTicketRegistryCleaner:
Starting cleaning of expired tickets from ticket registry at ...

Does that indicate that there is something wrong with my CAS configuration?

Here are the thread counts from the first thread dump (a rough sketch of how
such a summary could be reproduced follows the list):
DefaultQuartzScheduler_QuartzSchedulerThread
4 sleeping + 2 waiting on condition = 6 Total

DefaultQuartzScheduler_Worker
60 waiting = 60 Total

Store ticketCache Expiry Thread (ehcache)
6 waiting on condition = 6 Total

Store ticketCache Spool Thread (ehcache)
6 waiting on condition = 6 Total

TP-ProcessorXY daemon runnable
20 runnable (mod_jk?) + 3 waiting = 23 Total

Java2D Disposer daemon
1 waiting = 1 Total

http-8443-Monitor
1 waiting = 1 Total

http-8443-Processor
3 waiting + 1 runnable = 4 Total

ContainerBackgroundProcessor
1 waiting on condition = 1 Total

Low Memory Detector
1 runnable = 1 Total

CompilerThread
2 waiting on condition = 2 Total

AdapterThread
1 waiting on condition = 1 Total

Signal Dispatcher
1 runnable = 1 Total

Finalizer
1 waiting = 1 Total

Reference Handler
1 waiting = 1 Total

org.apache.catalina.startup.Bootstrap.main
1 runnable = 1 Total

VM Thread
1 runnable = 1 Total

GC task thread
1 runnable = 1 Total

VM Periodic Task Thread
1 waiting on condition = 1 Total
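
For what it's worth, here is how a summary like the above could be produced
programmatically. This is only an illustrative sketch (the class name and the
name-prefix grouping are my own, not from any tool we actually run), and the
Thread.State values it prints (RUNNABLE, WAITING, TIMED_WAITING) don't map
one-to-one onto the "runnable" / "waiting on condition" wording in the dump
file, but the idea is the same:

    import java.util.Map;
    import java.util.TreeMap;

    // Illustrative only: group the live threads of the current JVM by a crude
    // name prefix and by Thread.State, similar to the hand-made counts above.
    public class ThreadSummary {
        public static void main(String[] args) {
            Map<String, Map<Thread.State, Integer>> summary =
                    new TreeMap<String, Map<Thread.State, Integer>>();
            for (Thread t : Thread.getAllStackTraces().keySet()) {
                // Strip trailing digits so e.g. http-8443-Processor3 and
                // http-8443-Processor25 end up in the same group.
                String group = t.getName().replaceAll("[-_ ]?\\d+$", "");
                Map<Thread.State, Integer> states = summary.get(group);
                if (states == null) {
                    states = new TreeMap<Thread.State, Integer>();
                    summary.put(group, states);
                }
                Integer count = states.get(t.getState());
                states.put(t.getState(), count == null ? 1 : count + 1);
            }
            for (Map.Entry<String, Map<Thread.State, Integer>> e : summary.entrySet()) {
                System.out.println(e.getKey() + " " + e.getValue());
            }
        }
    }

Run inside the Tomcat JVM (for example behind a scratch servlet) it would give
the per-group counts for that JVM; run standalone it only summarizes its own
threads, of course.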


Thanks,
Joe

On Tue, Oct 6, 2009 at 11:44 AM, Joe Hansen <joe.hansen...@gmail.com> wrote:
> Rainer,
>
> I spoke too soon! As I suspected, the problem isn't fixed yet and the
> websites crashed again. This time I took three thread dumps while Tomcat
> was down. Here they are: http://pastebin.com/m2a7e1198
>
> I will learn from your previous analysis of the thread dumps and try to
> understand what's happening.
>
> Thanks,
> Joe
>
> On Tue, Oct 6, 2009 at 10:23 AM, Joe Hansen <joe.hansen...@gmail.com> wrote:
>> Rainer,
>>
>> Thanks for looking at those long thread dumps for me!!
>>
>> I am sorry. I did NOT take these dumps at the right time (i.e. when
>> Tomcat was inundated with requests and couldn't cope with the load).
>> After I increased the heap size to 512MB (from 64MB default), I am not
>> getting the OutOfMemoryError(s) anymore. After I set KeepAlive On
>> (Thanks Andre!), the number of httpd processes isn't increasing either.
>> The number of httpd processes increased from 8 to 21 and has stayed there
>> for more than 16 hours now. If the number of httpd processes gets out
>> of control again, I will definitely take thread dumps once again.
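>>
>> To double-check that the new -Xmx setting actually reached the Tomcat JVM,
>> I figure something as small as the following would do (just a sketch; the
>> class name is made up and it still needs a web.xml mapping):
>>
>>     import java.io.IOException;
>>     import javax.servlet.ServletException;
>>     import javax.servlet.http.HttpServlet;
>>     import javax.servlet.http.HttpServletRequest;
>>     import javax.servlet.http.HttpServletResponse;
>>
>>     // Prints the heap limits of whatever JVM it runs in, so deployed in
>>     // Tomcat it shows whether -Xmx512m took effect there.
>>     public class HeapCheckServlet extends HttpServlet {
>>         protected void doGet(HttpServletRequest req, HttpServletResponse resp)
>>                 throws ServletException, IOException {
>>             long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
>>             long totalMb = Runtime.getRuntime().totalMemory() / (1024 * 1024);
>>             resp.setContentType("text/plain");
>>             resp.getWriter().println("Max heap: " + maxMb + " MB");
>>             resp.getWriter().println("Currently allocated: " + totalMb + " MB");
>>         }
>>     }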
>>
>> However, I doubt the issue is fixed, because I should have seen this
>> issue a long time ago (I haven't changed any code in a long while, nor
>> has the traffic to our websites increased).
>>
>> Should I just wait and see or are there any tests that I can do?
>>
>> Your contribution to this forum is amazing, Rainer. I am grateful to
>> you and Andre for your efforts. Thank you!
>>
>> Regards,
>> Joe
>>
>>
>>
>> On Tue, Oct 6, 2009 at 7:25 AM, Rainer Jung <rainer.j...@kippdata.de> wrote:
>>> On 05.10.2009 18:58, Joe Hansen wrote:
>>>> Thank you so much for your tips, Rainer!
>>>>
>>>> The websites went down yet again. Increasing the Java heap size took
>>>> care of the OutOfMemoryError, but the number of httpd processes keeps
>>>> increasing until the websites crash. I haven't added any new code in
>>>> the past few months, so I am surprised that the requests are getting
>>>> stuck. Here's a link to the Tomcat thread dumps:
>>>> http://pastebin.com/m17eea139
>>>>
>>>> Please let me know if you cannot view it and I will email the relevant
>>>> portion of the catalina.out file to you. Is there an easy way to find
>>>> out what code is causing the requests to get stuck?
>>>
>>> The dump file contains three thread dumps.
>>>
>>> The things all dumps have in common:
>>>
>>> - 60 threads for the quartz scheduler, all idle
>>> - 13 threads in the AJP connection pool, connected to Apache, but idle
>>> waiting for the next request to be sent (the same threads in all three
>>> dumps)
>>> - 6 store plus 6 expiry threads of the EHCache, seems idle
>>> - 1 AJP + 1 HTTP(S) thread (port 8443) waiting to accept the next new
>>> connection to come in
>>> - 2 AJP + 3 HTTP(S) threads (port 8443) sitting idle in the pool, waiting
>>> for work
>>> - a couple of other normal threads not directly related to request handling
>>>
>>> So at the time you took the three dumps, this Tomcat was completely idle
>>> and did not have a single request to handle.
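>>>
>>> If it happens again and you want a quicker check than a full thread dump,
>>> something along these lines would show whether the connector pools are
>>> actually busy. Treat it only as a rough sketch: it has to run inside that
>>> Tomcat (e.g. behind a test servlet) or over a remote JMX connection, and
>>> the MBean and attribute names can differ between Tomcat versions, so I
>>> have not run this against your setup:
>>>
>>>     import java.lang.management.ManagementFactory;
>>>     import java.util.Set;
>>>     import javax.management.MBeanServer;
>>>     import javax.management.ObjectName;
>>>
>>>     // Rough sketch: list the Catalina thread pools and how many of their
>>>     // threads are currently busy, as seen through JMX.
>>>     public class ConnectorCheck {
>>>         public static void main(String[] args) throws Exception {
>>>             MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
>>>             Set<ObjectName> pools =
>>>                     mbs.queryNames(new ObjectName("Catalina:type=ThreadPool,*"), null);
>>>             for (ObjectName pool : pools) {
>>>                 System.out.println(pool.getKeyProperty("name")
>>>                         + " busy=" + mbs.getAttribute(pool, "currentThreadsBusy")
>>>                         + " current=" + mbs.getAttribute(pool, "currentThreadCount")
>>>                         + " max=" + mbs.getAttribute(pool, "maxThreads"));
>>>             }
>>>         }
>>>     }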
>>>
>>> If you are completely sure that you took the dumps while there was a
>>> storm of requests and your system couldn't cope with the load, then
>>> something has prevented the requests from ever reaching Tomcat.
>>>
>>> I don't have your Tomcat version at hand at the moment, but for some
>>> time very special OutOfMemory errors (could not create native thread)
>>> led to a situation where Tomcat simply wouldn't accept any new
>>> connections. Although you report OutOfMemory errors, I'm not directly
>>> suggesting that that is your problem here. There might still be a
>>> relation though.
>>>
>>> Are you sure that you took the dumps for the right Tomcat at the right
>>> time?
>>>
>>> Regards,
>>>
>>> Rainer
>>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
