The only time I have seen something like this happen was when we had a
memory leak, and an interesting one at that.
We were running on Solaris. One CPU, and sometimes both, would go up to
100% and stay there, sometimes for as long as 10 minutes, and then go
back to normal again.

We ran the thread analyzer (a neat utility on Solaris) to find out which
thread was using so much CPU, and it turned out to be a VM thread.

In our scenario, a thread was implemented as an inner class of another
object. The thread (inner class) referenced the parent and the parent
referenced the thread. Even though nothing else referenced either of the
two objects, they were never garbage collected. I suspect it had to do
with the fact that the inner class was a thread object.
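
For illustration, here is a minimal Java sketch of the kind of arrangement
described above. The names (InnerThreadLeak, Parent, Worker, payload) are
hypothetical and this is not the original code. It relies on two things
that do hold for the JVM: a started thread that has not terminated is
always reachable, and a non-static inner class instance keeps a hidden
reference to its enclosing object. Under those assumptions, the parent
stays reachable for as long as the thread runs, which may be what we were
seeing.

```java
import java.lang.ref.WeakReference;

class InnerThreadLeak {
    // Hypothetical stand-in for the "parent" object.
    static class Parent {
        final byte[] payload = new byte[1024];
        final Worker worker = new Worker();          // parent -> thread

        // Non-static inner class: every Worker carries a hidden Parent.this.
        class Worker extends Thread {
            volatile boolean running = true;

            @Override
            public void run() {
                while (running) {
                    touch();
                    try {
                        Thread.sleep(10);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }

            int touch() {
                return Parent.this.payload.length;   // thread -> parent
            }
        }
    }

    // Returns true if the parent is still reachable after we drop our own
    // reference to it while its worker thread is alive.
    static boolean parentStaysReachable() {
        Parent parent = new Parent();
        WeakReference<Parent> ref = new WeakReference<Parent>(parent);
        parent.worker.setDaemon(true);
        parent.worker.start();
        parent = null;                 // drop our only direct reference

        System.gc();                   // only a hint to the collector
        Parent survivor = ref.get();   // reached through the live thread
        boolean reachable = survivor != null;

        if (survivor != null) {        // stop the worker so the demo can end
            survivor.worker.running = false;
            survivor.worker.interrupt();
        }
        return reachable;
    }

    public static void main(String[] args) {
        System.out.println("parent still reachable: " + parentStaysReachable());
    }
}
```

The weak reference is only there to observe reachability without holding
the parent ourselves. Once the worker has terminated, the pair becomes
eligible for collection like any other cycle.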

Another guess we had was that the VM thread was freaking out because it
tried to resolve references during garbage collection and kept looking at
our weird dependency.

Once we resolved the memory leak (by removing all references from the
parent object to the thread and vice versa), our CPU usage never behaved
weirdly again.
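
One possible shape for such a fix, as a hedged sketch rather than the
change we actually made: the worker becomes a static nested class, so it
has no hidden reference back to the parent, and the parent lets go of the
thread once it has been stopped. All names here (DecoupledWorker, Parent,
Worker, shutdown) are made up for the example.

```java
class DecoupledWorker {
    // Static nested class: no implicit reference to any Parent instance.
    static class Worker implements Runnable {
        private volatile boolean running = true;

        public void run() {
            while (running) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    return;                      // interrupted: exit promptly
                }
            }
        }

        void stop() {
            running = false;
        }
    }

    // Hypothetical parent that owns the worker only between start/shutdown.
    static class Parent {
        private Worker worker = new Worker();
        private Thread thread = new Thread(worker, "worker");

        void start() {
            thread.setDaemon(true);
            thread.start();
        }

        // Stops the worker, waits for it, then drops both references.
        boolean shutdown() {
            Thread t = thread;
            worker.stop();
            t.interrupt();
            boolean stopped;
            try {
                t.join(5000);
                stopped = !t.isAlive();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                stopped = false;
            }
            worker = null;                       // parent -> worker removed
            thread = null;                       // parent -> thread removed
            return stopped;
        }

        boolean holdsThread() {
            return thread != null;
        }
    }

    static boolean demo() {
        Parent p = new Parent();
        p.start();
        boolean stopped = p.shutdown();
        return stopped && !p.holdsThread();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped and released: " + demo());
    }
}
```

The point of the design is that neither object depends on the other for
its lifetime: the thread can end without the parent, and the parent can be
collected without waiting for the thread.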

Hope this helps,
Filip


-----Original Message-----
From: Martin Schulz [mailto:[EMAIL PROTECTED]
Sent: Friday, June 25, 2004 8:24 PM
To: Tomcat Developers List
Subject: Re: Any synchronization issues with SMP?


It appears that the JVM slows everything down to a crawl,
including the code path which should lead to another accept being
called, for up to 8 minutes!!!

Furthermore, mpstat has the nice property that CPU usage adds
up to exactly 100%, i.e. a single CPU is used... no more, no less.
This corresponds to 12% or 13% CPU utilization shown in prstat
based on 8 CPUs.  My interpretation is that the JVM is effectively
preventing parallel execution (which otherwise appears to work fine).

Nearly all threads either wait, read from a Socket, or zip/unzip data.

I'm not sure what all that means, but Tomcat appears to be a victim
of it.  I'll experiment some more.  Main difference with the systems
Rainer mentioned is the JVM (1.4.2_04) and the CPU (Sparc III 1.2GHz).

If any of this rings a bell, drop me a note.  I'll be happy to share data
as appropriate.

I'll repost to the list only if I learn anything which impacts Tomcat
directly (other than that the code path to hand off the socket accept
responsibility is not suitable for _very_ high hit rates, which does not
worry me too much at this point).

Cheers!
    Martin

Martin Schulz wrote:

> Rainer,
>
> Thanks for the tips.  I am about to take timing stats
> internally in the ThreadPool and the Tcp workers.
> Also, the described symptoms do not disappear, but seem to be of much
> shorter duration when only 4 CPUs are used for the application.
> I'll summarize what I find.
>
> Martin
>
> Rainer Jung wrote:
>
>> Hi,
>>
>> we know one application running on 9 systems with 4 US II CPUs each
>> under Solaris 9. Peak request rates at 20 requests/second per system.
>> Tomcat is 4.1.29, Java is 1.3.1_09. No symptoms like yours!
>>
>> You should send a "QUIT" signal to the JVM process while it is
>> unresponsive. This is a general JVM mechanism (at least for Sun
>> JVMs). The signal writes a stack trace for each thread to STDOUT (so you
>> should also start Tomcat with STDOUT redirected to some
>> file). Beware: older JVMs in rare cases stopped working after getting
>> this signal (not expected with 1.3.1_09).
>>
>> In this stack dump you should be able to figure out which methods
>> most of your threads are in and what their status is.
>>
>> Is there native code included (via JNI)? Any synchronization done in the
>> application itself? Are you using Tomcat clustering? Which JVM?
>>
>> Sincerely
>>
>> Rainer Jung
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

