On 06.10.2009 19:44, Joe Hansen wrote:

I will only comment on the threads for the AJP pool. Everything else
does not seem relevant:
Dump 1:
  20 threads connected to Apache, waiting for the next request
   3 threads idle in the pool
   1 thread waiting for the next connection
   0 threads working on requests

Dump 2:
  12 threads connected to Apache, waiting for the next request
   5 threads idle in the pool
   1 thread waiting for the next connection
  14 threads working on requests

Dump 3:
  11 threads connected to Apache, waiting for the next request
   5 threads idle in the pool
   1 thread waiting for the next connection
  15 threads working on requests

Dump 4:
  11 threads connected to Apache, waiting for the next request
   5 threads idle in the pool
   1 thread waiting for the next connection
  15 threads working on requests

The busy threads in dumps 2, 3 and 4 are the same except for one, which
started only in dump 3. Most of them are busy working on data. They are
*not* waiting for some other system, like a database or similar.

Out of the 14+15+15 interesting thread stacks, 42 work on some DAOs.
These are the DAO methods they work on:

  18 tw.beans.TWFotoSetDAO.getFotosetFotos
  11 tw.beans.TWEmailUpdateListDAO.getEmailUpdateList
   3 tw.beans.TWFotoSetListDAO.getAllFotosets
   3 tw.beans.TWFotoSetDAO.getFotosetFotosQuery2
   3 tw.beans.TWFotoSetDAO.getFotosetFotosQuery
   3 tw.beans.TWEmailUpdateDAO.getEmailUpdate
   1 tw.beans.TWEmailUpdateListDAO.getEmailUpdateListQuery

It seems your application is CPU heavy. Either the data objects handled
are too heavyweight (maybe some user has a huge Fotoset or Email list),
or the request rate is simply too large. Is the CPU saturated during the
problems?

I would activate a good access log and try to find out from that and
your webapp logs what may be special about these web requests or users
(see the two sketches at the end of this message).

Regards,

Rainer

> I will learn from your previous analysis of the thread dumps and I will
> try to understand what's happening.
>
> Thanks,
> Joe
>
> On Tue, Oct 6, 2009 at 10:23 AM, Joe Hansen <joe.hansen...@gmail.com> wrote:
>> Rainer,
>>
>> Thanks for looking at those long thread dumps for me!!
>>
>> I am sorry. I did NOT take these dumps at the right time (i.e. when
>> Tomcat was inundated with requests and couldn't cope with the load).
>> After I increased the heap size to 512MB (from the 64MB default), I am
>> not getting the OutOfMemoryError(s) anymore. After I set KeepAlive On
>> (thanks, Andre!), the number of httpd processes isn't increasing either.
>> The number of httpd processes rose from 8 to 21 and has stayed there
>> for more than 16 hours now. If the number of httpd processes gets out
>> of control again, I will definitely take thread dumps once again.
>>
>> However, I doubt the issue is fixed, because I should have seen it a
>> long time ago (I haven't changed any code, nor has the traffic to our
>> websites increased in a long while).
>>
>> Should I just wait and see, or are there any tests that I can do?
>>
>> Your contribution to this forum is amazing, Rainer. I am grateful to
>> you and Andre for your efforts. Thank you!
>>
>> Regards,
>> Joe
>>
>>
>> On Tue, Oct 6, 2009 at 7:25 AM, Rainer Jung <rainer.j...@kippdata.de> wrote:
>>> On 05.10.2009 18:58, Joe Hansen wrote:
>>>> Thank you so much for your tips, Rainer!
>>>>
>>>> The websites went down yet again. Increasing the Java heap size took
>>>> care of the OutOfMemoryError, but the number of httpd processes keeps
>>>> increasing until the websites crash. I haven't added any new code in
>>>> the past few months, hence I am surprised why the requests are getting
>>>> stuck.
>>>> Here's a link to the Tomcat thread dumps:
>>>> http://pastebin.com/m17eea139
>>>>
>>>> Please let me know if you cannot view it and I will email the relevant
>>>> portion of the catalina.out file to you. Is there an easy way to find
>>>> out what code is causing the requests to get stuck?
>>>
>>> The dump file contains three thread dumps.
>>>
>>> The things all dumps have in common:
>>>
>>> - 60 threads for the Quartz scheduler, all idle
>>> - 13 threads in the AJP connection pool, connected to Apache, but idle,
>>> waiting for the next request to be sent (the same threads in all three
>>> dumps)
>>> - 6 store plus 6 expiry threads of the EHCache, which seem idle
>>> - 1 AJP + 1 HTTP(S) thread (port 8443) waiting to accept the next new
>>> connection to come in
>>> - 2 AJP + 3 HTTP(S) threads (port 8443) sitting idle in the pool,
>>> waiting for work
>>> - a couple of other normal threads not directly related to request handling
>>>
>>> So at the time you took the three dumps, this Tomcat was completely idle
>>> and did not have a single request to handle.
>>>
>>> If you are completely sure you took the dumps while there was a storm
>>> of requests and your system couldn't cope with the load, then something
>>> prevented the requests from ever reaching Tomcat.
>>>
>>> I don't have your Tomcat version at hand at the moment, but for some
>>> time very special OutOfMemory errors ("could not create native thread")
>>> led to a situation where Tomcat simply wouldn't accept any new
>>> connections. Although you report OutOfMemory errors, I'm not directly
>>> suggesting that that is your problem here. There might still be a
>>> relation, though.
>>>
>>> Are you sure that you took the dumps for the right Tomcat at the right
>>> time?
>>>
>>> Regards,
>>>
>>> Rainer
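A minimal sketch of the access log suggested above, assuming the standard
Tomcat AccessLogValve inside the <Host> element of conf/server.xml (the
directory, prefix and suffix values are only illustrative). The %D token
records the time taken per request in milliseconds and %S the session id,
which makes it easy to spot the slow Fotoset/Email requests and the users
behind them:

  <!-- Sketch only: common log format plus request time (%D, ms) and session id (%S) -->
  <Valve className="org.apache.catalina.valves.AccessLogValve"
         directory="logs"
         prefix="localhost_access_log." suffix=".txt"
         pattern="%h %l %u %t &quot;%r&quot; %s %b %D %S" />

Sorting that log by the %D column should show quickly whether the slow
requests cluster on a few URLs or a few users.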
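If the heavyweight-data theory turns out to be right, one possible fix is
to bound what the DAOs pull back per request. The real
tw.beans.TWFotoSetDAO code is not shown in this thread, so the following
Java sketch uses assumed table and column names and an assumed page limit,
purely to illustrate the idea:

// Hypothetical sketch only -- the table/column names and the cap are
// assumptions, not taken from the actual application. The point is to cap
// and stream the result set so one huge Fotoset cannot keep a request
// thread busy on the CPU for seconds at a time.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class BoundedFotosetQuery {

    private static final int MAX_FOTOS_PER_REQUEST = 200; // illustrative cap

    public List<String> getFotosetFotos(Connection con, int fotosetId) throws SQLException {
        List<String> fotos = new ArrayList<String>();
        PreparedStatement ps = con.prepareStatement(
                "SELECT filename FROM fotoset_foto WHERE fotoset_id = ? ORDER BY position");
        try {
            ps.setMaxRows(MAX_FOTOS_PER_REQUEST); // never fetch more rows than the cap
            ps.setFetchSize(50);                  // stream rows instead of buffering them all
            ps.setInt(1, fotosetId);
            ResultSet rs = ps.executeQuery();
            try {
                while (rs.next()) {
                    fotos.add(rs.getString(1));
                }
            } finally {
                rs.close();
            }
        } finally {
            ps.close();
        }
        return fotos;
    }
}

Whether a cap, paging, or caching is the right answer depends on how the
pages use the data; the access log sketched above should show which theory
(heavyweight data vs. sheer request rate) actually applies.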