The only time I have seen something like this happen was when we had a memory leak, and an interesting one. We were running on Solaris, and CPU usage on one or sometimes both of our CPUs would go up to 100% and stay there, sometimes for as much as 10 minutes, before going back to normal again.
We ran thread analyzer (a neat util on Solaris) to measure which thread was using a lot of CPU, and found out it was a VM thread.

In our scenario, we had a thread that was an inner class of an object. The thread (inner class) referenced the parent and the parent referenced the thread. Even though nothing referenced either of the two objects, they would not be garbage collected. I suspect it had to do with the fact that the inner class was a thread object. Another guess we had was that the VM thread was freaking out because it tried to resolve garbage collection references and kept looking at our weird dependency.

Once we resolved the memory leak (removed any references from the parent object to the thread and vice versa), our CPU usage never behaved weirdly again.

Hope this helps,
Filip

-----Original Message-----
From: Martin Schulz [mailto:[EMAIL PROTECTED]
Sent: Friday, June 25, 2004 8:24 PM
To: Tomcat Developers List
Subject: Re: Any synchronization issues with SMP?

It appears that the JVM slows everything down to a crawl, including the code path which should lead to another accept being called, for up to 8 minutes!!!

Furthermore, mpstat has the nice property that CPU usage adds up to exactly 100%, i.e. a single CPU is used... no more, no less. This corresponds to the 12% or 13% CPU utilization shown in prstat based on 8 CPUs.

My interpretation is that the JVM is effectively preventing parallel execution (which otherwise appears to work fine). Nearly all threads either wait, read from a Socket, or zip/unzip data.

I'm not sure what all that means, but Tomcat appears to be a victim of it. I'll experiment some more. The main difference from the systems Rainer mentioned is the JVM (1.4.2_04) and the CPU (Sparc III 1.2GHz).

If any of this rings a bell, drop me a note. I'll be happy to share data as appropriate.
I'll repost to the list only if I learn anything which impacts Tomcat directly (other than that the code path to hand off the socket accept responsibility is not suitable for _very_ high hit rates, which does not worry me too much at this point).

Cheers!
Martin

Martin Schulz wrote:
> Rainer,
>
> Thanks for the tips. I am about to take timing stats
> internally in the ThreadPool and the Tcp workers.
> Also, the described symptoms do not disappear, but seem to be of much
> shorter duration when only 4 CPUs are used for the application.
> I'll summarize what I find.
>
> Martin
>
> Rainer Jung wrote:
>
>> Hi,
>>
>> we know one application running on 9 systems with 4 US II CPUs each
>> under Solaris 9. Peak request rates at 20 requests/second per system.
>> Tomcat is 4.1.29, Java is 1.3.1_09. No symptoms like yours!
>>
>> You should send a signal "QUIT" to the jvm process during the
>> unresponsiveness time. This is a general JVM mechanism (at least for Sun
>> JVMs). The signal writes a stack trace for each thread on STDOUT (so you
>> should also start tomcat with redirection of STDOUT to some
>> file). Beware: older JVMs in rare cases stopped working after getting
>> this signal (not expected with 1.3.1_09).
>>
>> In this stack dump you should be able to figure out in which methods
>> most of your threads stay and what the status is.
>>
>> Is there native code included (via JNI)? Any synchronization done in the
>> application itself? Are you using Tomcat clustering? Which JVM?
>>
>> Sincerely
>>
>> Rainer Jung
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
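For readers of the archive, the object shape Filip describes at the top of this message (a thread written as an inner class, with parent and thread referencing each other) can be sketched in Java roughly as below. This is only an illustration under assumptions: every class, field, and method name here is hypothetical and nothing is taken from the original application. It shows the two reference directions and a decoupled alternative, not the cause of the CPU spikes. Note that a reference cycle on its own is collectable by the JVM; a thread that has been started and has not finished is itself a GC root, so anything it references stays alive while it runs.

```java
import java.lang.reflect.Field;
import java.util.concurrent.CountDownLatch;

// Hypothetical names throughout; a sketch of the shape described, not the original code.
class InnerThreadCycle {

    /** Leak-prone shape: parent -> worker, and worker -> parent via the implicit outer reference. */
    static class Parent {
        final byte[] payload = new byte[1024];  // stands in for the parent's state
        final Worker worker = new Worker();     // parent -> thread

        class Worker extends Thread {           // non-static inner class: thread -> parent
            final CountDownLatch stop = new CountDownLatch(1);

            Worker() {
                setDaemon(true);
            }

            @Override
            public void run() {
                try {
                    stop.await();               // while this blocks, Parent.this stays reachable
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }

            Parent owner() {
                return Parent.this;             // makes the hidden reference visible
            }
        }
    }

    /** Decoupled shape: a static nested class with no link to Parent in either direction. */
    static class DetachedWorker extends Thread {
        private final int payloadSize;          // copy only what the thread needs

        DetachedWorker(int payloadSize) {
            this.payloadSize = payloadSize;
            setDaemon(true);
        }

        @Override
        public void run() {
            // would work with payloadSize only, never with a Parent instance
        }
    }

    /** True if the inner-class worker holds a reference back to the parent that created it. */
    static boolean workerPinsParent() {
        Parent p = new Parent();
        return p.worker.owner() == p;
    }

    /** True if DetachedWorker declares no field of type Parent. */
    static boolean detachedWorkerHasNoParentField() {
        for (Field f : DetachedWorker.class.getDeclaredFields()) {
            if (f.getType() == Parent.class) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("inner worker references parent: " + workerPinsParent());
        System.out.println("detached worker is parent-free: " + detachedWorkerHasNoParentField());
    }
}
```

The decoupled variant mirrors the fix Filip mentions (no references from the parent to the thread and vice versa): the thread receives a copy of the data it needs instead of reaching back into the object that created it.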