Re: [staf-users] high CPU utilization by STAFProc and STAX JVM

Nathan Parrish Tue, 03 Nov 2009 18:11:22 -0800

right, I understand this.  my question is how 'thread growth' works.  
presumably the delta controls some behavior regarding how many new threads we 
spin off when the system detects that we are running low/out of threads.  
also, are threads ever reaped?  or if my system was incredibly busy at some 
time in the past, such that 270 threads were really all going at once, will I 
have all those threads sitting around even if they are no longer needed?  is 
it possible that the overhead of this many idle threads could be causing my CPU 
churn?
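
One way to check whether those ~270 threads are actually idle is to read the per-thread entries under /proc, which is also what pstree -p is showing. The following is only a rough sketch, assuming a Linux /proc filesystem; the function names are made up for illustration and are not part of STAF. Threads in state S (sleeping) are not being scheduled, so a large count of them would not by itself explain the CPU churn.

```python
from collections import Counter
from pathlib import Path


def count_threads(pid, proc_root="/proc"):
    """Return the number of threads (kernel tasks) belonging to a process."""
    task_dir = Path(proc_root) / str(pid) / "task"
    return sum(1 for entry in task_dir.iterdir() if entry.name.isdigit())


def thread_states(pid, proc_root="/proc"):
    """Map each thread ID to its one-letter scheduler state (R, S, D, ...)."""
    states = {}
    task_dir = Path(proc_root) / str(pid) / "task"
    for entry in task_dir.iterdir():
        if not entry.name.isdigit():
            continue
        stat = (entry / "stat").read_text()
        # The stat line looks like "tid (comm) state ...". The comm field
        # may itself contain parentheses, so split on the last ")".
        states[int(entry.name)] = stat.rsplit(")", 1)[1].split()[0]
    return states


def summarize_states(pid, proc_root="/proc"):
    """Count how many threads are in each state, e.g. {'S': 268, 'R': 2}."""
    return Counter(thread_states(pid, proc_root).values())
```

Calling summarize_states() with the STAFProc PID a few times in a row would show whether most threads stay in S while only a handful are in R.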


thanks,
nathan

----- "Sharon Lucas" <luc...@us.ibm.com> wrote:

> No, the STAF MISC service simply shows the initial threads and the
> thread growth delta. It doesn't show the number of threads that STAF
> has created.
> 
> --------------------------------------------------------------
> Sharon Lucas
> IBM Austin, luc...@us.ibm.com
> (512) 286-7313 or Tieline 363-7313
> 
> 
> 
>       Nathan Parrish <nparr...@clustrix.com>
> 
> 11/03/2009 02:43 PM
> To    Sharon Lucas/Austin/i...@ibmus
> 
> cc    staf-users@lists.sourceforge.net
> 
> Subject       Re: [staf-users] high CPU utilization by STAFProc and STAX JVM
> 
> 
> 
> 
> Our longevity test has finally finished, so I'm able to poke around a
> bit deeper, and can look at doing things like restarting the STAX
> service, or STAFProc itself later this afternoon.
> 
> I enabled tracing, and I see traffic on the order of maybe 5-20
> queries a second, mostly the VAR service (90+% gets), and some
> semaphore requests as well. I'll get breaks in traffic as long as 3
> seconds...
> 
> short of doing gdb, I tried strace -fp on the process, and noticed
> that it was quite busy, and was also dealing with a very large number
> of child processes. I believe linux does some weird stuff mixing what
> constitutes a process vs. a thread; I don't see all these processes with
> ps, but pstree -p does show them, something like ~270 (not counting
> the java procs/threads which are also underneath the main STAFProc
> PID). looking on another machine (which is not running STAX, or
> servicing tons of variable requests, etc.), it has 82. my desktop,
> which does run STAX for testing purposes, looks to have 90.
> 
> from misc settings:
> Initial Threads : 10
> Thread Growth Delta : 1
> 
> does this suggest that some threads have gotten spun off into
> never-never land and other threads have been created as a result?
> 
> 
> 
> ----- "Sharon Lucas" <luc...@us.ibm.com> wrote:
> > Unfortunately, I didn't really get many clues as to why STAFProc is
> > using up so much CPU from the information you provided. Does the CPU
> > usage for STAFProc constantly stay very high (e.g. above 100%)?
> >
> > What are your STAX jobs doing? Can you give a description of some of
> > the STAF service requests that they are submitting?
> >
> > Since STAFProc was started 21 days ago on your STAX service machine,
> > 1,832,518,328 STAF service requests have been submitted to this
> > machine. That's a lot of STAF service requests. Do you have any
> > "rogue" STAX jobs that are constantly submitting STAF service
> > requests in a loop (without a good reason)? What STAF service
> > requests are being submitted the most?
> >
> > Yes, you can enable STAF tracing to see if that gives any clues
> > about what's driving the CPU load (though note that enabling STAF
> > tracing may slow things down a little). To see what STAF service
> > requests are being submitted and when each STAF service request
> > completes, you could enable trace points ServiceRequest,
> > ServiceComplete, and RemoteRequests. Note that this will generate
> > tons of trace output since lots of STAF service requests are being
> > submitted to this machine, so you'll also want to redirect
> > STAFProc's trace output to a file in a location where there is lots
> > of available disk space and monitor the size of this file. You may
> > also want to enable the Warning tracepoint so that any warning
> > messages are also logged. For example:
> >
> > STAF staxMachine TRACE SET DESTINATION TO FILE /usr/local/staf/STAFProc.trc
> > STAF staxMachine TRACE ENABLE TRACEPOINTS "ServiceRequest ServiceComplete RemoteRequests Warning"
> >
> > See section "8.18 Trace Service" in the STAF User's Guide for more
> > information.
> >
> > I don't know if the trace output will help us in determining why
> > STAFProc is using up so much CPU. Maybe there is some thread in
> > STAFProc that is in a bad state and constantly looping for some
> > unknown reason (like the STAFProcessMonitor thread that monitors for
> > processes to complete). The only way I know of to check this would
> > be if STAFProc was started via a debugger like gdb. Then, once it
> > got in this "bad" state of high CPU usage, you could break in via
> > gdb and list threads (info threads) and change to each thread
> > (thread n) and check the backtraces (bt) for each thread to see what
> > they are doing. For example:
> >
> > Use gdb to debug STAF locally as follows:
> > 1. gdb STAFProc
> > 2. run
> > 3. Recreate the problem.
> >
> > Various commands that you might need while using gdb are:
> >
> > • help
> > • help tracepoints
> > • help stack
> > • info threads
> > • thread n
> > • bt
> >
> > Of course, this will require that STAF be shut down and then
> > restarted using gdb so you may not be able to do that now while your
> > long runs are still running.
> >
> > There's no reason that I know of not to use a 1.6.0 JVM with the
> > STAX service so you can try that instead of upgrading to a more
> > recent 1.5.0 JVM.
> >
> > You may want to increase the STAX service's MaxFileCacheSize from 20
> > to something like 50. This won't "fix" the problem of STAFProc using
> > 100%+ CPU, but whenever a STAX job is executed, it first needs to be
> > XML parsed and this is a very CPU-intensive process (so if the CPU
> > usage is already high, it will take longer for STAX to parse a STAX
> > job before execution of the STAX job begins). So, if you are running
> > the same STAX job file more than once, the first time STAX needs to
> > parse it, but then it will cache it so that if the exact same STAX
> > job is submitted to be executed again, it doesn't have to be
> > re-parsed (if it's still in the STAX file cache). So, that's why I
> > recommended increasing the STAX service's MaxFileCacheSize. It can
> > be increased dynamically as follows:
> >
> > STAF staxMachine STAX SET MAXFILECACHESIZE 50
> >
> > Note that this setting only applies to this instance of the STAX
> > service. If you shut down and restarted STAFProc, it would no longer
> > apply. You would want to add this setting when registering the STAX
> > service in your STAF.cfg file to make it "permanent".
> >
> > You may also want to increase the maximum heap size for the STAX JVM
> > from 384m to something like 512m by specifying OPTION J2=-Xmx512m
> > (instead of OPTION J2=-Xmx384m).
> >
> > SERVICE STAX LIBRARY JSTAF EXECUTE /usr/local/staf/services/stax/STAX.jar \
> > OPTION JVM=/usr/java/jdk1.6.0_16/bin/java OPTION JVMName=STAX \
> > OPTION J2=-Xmx384m PARMS "MAXFILECACHESIZE 50"
> >
> > --------------------------------------------------------------
> > Sharon Lucas
> > IBM Austin, luc...@us.ibm.com
> > (512) 286-7313 or Tieline 363-7313
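
The gdb steps quoted above assume STAFProc is started under the debugger. gdb can also attach to an already running process by PID, which might avoid the restart. The sketch below only wraps that invocation; the helper names are hypothetical, it assumes gdb is installed and that the caller has permission to ptrace the target, and the process is paused while gdb collects the backtraces.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to pid, lists its threads,
    prints a backtrace for every thread, then detaches and exits."""
    return [
        "gdb",
        "-p", str(pid),                 # attach to the running process
        "-batch",                       # run the commands below, then exit
        "-ex", "info threads",          # same as typing "info threads"
        "-ex", "thread apply all bt",   # "bt" for each thread in turn
    ]


def dump_thread_backtraces(pid):
    """Run gdb against pid and return its combined text output."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero if the attach is refused
    )
    return result.stdout + result.stderr
```

Running dump_thread_backtraces() with the STAFProc PID while the CPU is pegged, and saving the output to a file, would give the same per-thread information as the interactive thread n / bt sequence.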

