----- Original Message -----

> From: "Simon, Leonard" <leonard.si...@hsn.net>
> To: Tomcat Users List <users@tomcat.apache.org>
> Cc: 
> Sent: Tuesday, July 10, 2012 9:54 AM
> Subject: Re: [EXTERNAL] Re: Re: General Architecture Question for multiple 
> websites on a single RedHat server
> 
> Chris,
> 
> Thanks for looking at this.
> 
> Tomcat version is 6.0.32.
> mod_jk is at 1.2.31
> 
> 
> Someone else did the thread dump so I'm assuming they did it on the right
> process.
> 
> On Tue, Jul 10, 2012 at 12:19 PM, Christopher Schultz <
> ch...@christopherschultz.net> wrote:
> 
>>  -----BEGIN PGP SIGNED MESSAGE-----
>>  Hash: SHA1
>> 
>>  Simon,
>> 
>>  On 7/9/12 4:24 PM, Simon, Leonard wrote:
>>  > Well our Tomcat went out to lunch again and we had to recycle the
>>  > webserver to get things stabilized. By this I mean we get reports
>>  > from the users that screens become unresponsive and looking at a
>>  > top we see tomcat process taking 100% CPU.
>> 
>>  Are you sure this is the right process?
>> 
>>  > Was able to do a thread dump captured with a kill -3 PID and here
>>  > it is if anyone is so inclined to comment on it.
>> 
>>  This thread dump shows a mostly-idle server with the exception of
>>  those threads in socketAccept() (not sure why these count as RUNNABLE
>>  when they are really blocking) and those executing reads from the
>>  client connection(s).
>> 
>>  What exact version of Tomcat are you using, and what version of mod_jk
>>  (or, if you are using mod_proxy_ajp, what httpd version)? IIRC, there
>>  have been some stability improvements in recent Tomcat versions around
>>  the worker threads being returned to their associated connectors.
>> 
>>  - -chris


I didn't see much that rang immediate alarm bells. It looks like you're 
processing about 18 client connections, and everything else is pretty quiet. 
These client connections are coming in through the AJP connector (as you've 
noted in your reply above).

A few things though:

As someone in this thread has already mentioned, permgen is pretty full. You 
might try increasing that with -XX:MaxPermSize=128m.
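
On a stock Tomcat 6 install the usual place for that is CATALINA_OPTS in 
$CATALINA_HOME/bin/setenv.sh (create the file if it isn't there; catalina.sh 
picks it up at startup -- an RPM-packaged Tomcat may use a different config 
file). A minimal sketch, with 128m only as a starting point to tune against 
what you actually see permgen doing:

   # setenv.sh -- sourced by catalina.sh at startup
   # 128m is a guess; watch permgen usage and adjust
   CATALINA_OPTS="$CATALINA_OPTS -XX:MaxPermSize=128m"
   export CATALINA_OPTS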

There are a lot of garbage collection threads. That's normal on a multi-core 
system. From digging around, it appears that the number of parallel garbage 
collection threads follows this formula:

8 + (5/8)X = GCT

You get one GCT (garbage collection thread) per core for the first 8 cores, and 
then 5/8 of a thread for every core after that. So in your case:

8 + (5/8)X = 18
X = 16

This means that your system has 24 cores. Are you running on a 24-core system, 
or have you tuned garbage collection with JVM arguments?
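
If you want to confirm what the JVM actually chose, and how many cores the OS 
reports, something like this should do it (assumes a Linux box and a JDK recent 
enough to support -XX:+PrintFlagsFinal):

   # GC thread count the JVM settled on
   java -XX:+PrintFlagsFinal -version 2>/dev/null | grep ParallelGCThreads
   # core count as Linux sees it
   grep -c '^processor' /proc/cpuinfo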

In general, if you're not running into GC issues, tuning GC parameters is 
counter-productive. If you do have to tune GC parameters, lots of testing is in 
order.

I noticed that you also have an MQ Trace monitor running. Are you using MQ? 
Directly accessing an MQ service without going through a pool configured for 
graceful restarts / retries can cause a system to become unresponsive. However, 
I don't see any evidence of that in this thread dump.

As I've said offline, it's really difficult to tell what's consuming CPU from a 
single thread dump. Here's how to start figuring out what is going on with your 
system.

1. Keep access logs

If you don't, then start. You'll want the access logs so you can replay the 
traffic in a test environment and see if you can recreate the problem. JMeter 
is a good tool for replaying requests from access logs.
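
In Tomcat 6 that means an AccessLogValve in conf/server.xml, typically inside 
the <Host> element. A minimal sketch (the pattern is just a suggestion; %D adds 
the request time in milliseconds, which is handy for spotting slow requests):

   <Valve className="org.apache.catalina.valves.AccessLogValve"
          directory="logs" prefix="localhost_access_log." suffix=".txt"
          pattern="%h %l %u %t &quot;%r&quot; %s %b %D" />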

2. When the problem occurs

a. Take multiple thread dumps, about 5 seconds apart. Use a tool like jstack so 
it's scriptable:

   jstack -l [process-id]
   where [process-id] is the process id of the distressed Tomcat

The -l generates a long listing and may not be necessary. You'll need the right 
permissions: run it either as root or as the user that owns the JVM you're 
targeting.

b. At the same time use something like the following to see which thread is 
consuming CPU:

   ps -L -o pcpu,lwp -p [process-id]
   where [process-id] is the process id of the distressed Tomcat

This will show every thread in the process, the percentage of CPU each thread 
is using, and each thread's ID (the lwp column). You can then correlate that 
thread ID with the thread dump to see exactly what is consuming the CPU.
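
One wrinkle when correlating: ps prints the thread ID in decimal, while on 
Linux the jstack output shows each thread's native ID as nid=0x... in hex, so 
you have to convert. A quick sketch (the 12345 and the file name are just 
placeholders):

   LWP=12345      # the busy thread id from the ps output
   grep "nid=0x$(printf '%x' $LWP)" jstack-output.txt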

This will generate tons of output, so it's best to put both in a script and 
direct the output to files.
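
A rough sketch of such a script (run it as root or the Tomcat user while the 
problem is happening; the loop count and file names are just placeholders):

   #!/bin/sh
   # capture.sh -- paired thread dumps and per-thread CPU samples
   # usage: ./capture.sh <tomcat-pid>
   PID=$1
   for i in 1 2 3 4 5 6 7 8 9 10; do
      STAMP=$(date +%Y%m%d-%H%M%S)
      jstack -l "$PID"            > "jstack-$STAMP.txt"
      ps -L -o pcpu,lwp -p "$PID" > "threads-$STAMP.txt"
      sleep 5
   done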

Now you'll end up with the following:

1. What requests were being made of your server when the problem occurred
2. Multiple thread dumps while the problem is occurring
3. The identity of the thread (or threads) that is consuming the CPU

Once you get this information, you'll be in a much better position to determine 
what is causing your problems.

. . . . just my two cents.
/mde/

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
