Sorry about that, I forgot version numbers. STAX 3.5.0 and STAF 3.4.6. JVM log... actually yes, there was a null pointer exception. I also get random data in there that seems to come from a log message that gets generated when a testcase has an error - a couple tail commands of logs on the target system, etc. Here's the null pointer:
20110812-10:34:43 ERROR: Exception on STAX service request: list jobs
java.lang.NullPointerException
    at com.ibm.staf.service.stax.STAX.handleList(STAX.java:3172)
    at com.ibm.staf.service.stax.STAX.acceptRequest(STAX.java:1843)
    at com.ibm.staf.service.STAFServiceHelper.callService(STAFServiceHelper.java:349)

That was back on the 12th though, a week ago... The message error is:

20110802-09:55:51 Error: STAX Job ID 18. STAXJob$STAFQueueMonitor.run(): Exception unmarshalling queued messages. Marshalled string: @SDT/*:15769:@SDT/{:761::13:map-class-map@SDT/{:733::24:STAF/Service/Queue/Entry@SDT/{:694::4:keys@SDT/[8:633:@SDT/{:91::12:display-name@SDT/$S:8:Priority:18:display-short-name@SDT/$S:1:P:3:key@SDT/$S:8:priority@SDT/{:60::12:display-name@SDT/$S:9:Date-Time:3:key@SDT/$S:9:timestamp@SDT/{:56::12:display-name@SDT/$S:7:Machine:3:key@SDT/$S:7:machine@SDT/{:101::12:display-name@SDT/$S:11:Handle Name:18:display-short-name@SDT/$S:4:Name:3:key@SDT/$S:10:handleName@SDT/{:88::12:display-name@SDT/$S:6:Handle:18:display-short-name@SDT/$S:2:H#:3:key@SDT/$S:6:handle@SDT/{:50::12:display-name@SDT/$S:4:User:3:key@SDT/$S:4:user@SDT/{:50::12:display-name@SDT/$S:4:Type:3:key@SDT/$S:4:type@SDT/{:56::12:display-name@SDT/$S:7:Message:3:key@SDT/$S:7:message:4:name@SDT/$S:24:STAF/Service/Queue/Entry@SDT/[1:14983:@SDT/%:14970::24:STAF/Service/Queue/Entry@SDT/$S:1:5@SDT/$S:17:20110802-09:55:51@SDT/$S:36:tcp://sjx64galb.sanjose.ibm.com@6500@SDT/$S:12:STAF_Process@SDT/$S:1:1@SDT/$S:16:none://anonymous@SDT/$S:16:STAF/Process/End@SDT/$S:14754:@SDT/{:14741::12:endTimestamp@SDT/$S:17:20110802-04:47:33:8:fileList@SDT/[1:14614:@SDT/{:14601::4:data@SDT/$S:14564:OUTPUT OF MOUNT [.... more stuff...]

I've seen those before. That happened weeks ago, so it doesn't seem related. Next time I get the error, I'll run the QUERY JOB THREAD command; I have not done that before.
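
For anyone trying to read a marshalled string like the one in the error above: the pieces visible there are length-prefixed (`@SDT/$S:<len>:<text>` for a string, `@SDT/{:<len>:` followed by `:<keylen>:<key><value>` pairs for a map, `@SDT/[<count>:<len>:` for a list). The following is only a rough Python sketch inferred from that visible layout, not the unmarshaller that ships with STAF. The function name `unmarshal` is made up for illustration, lengths are assumed to count characters, and the `@SDT/*` context and `@SDT/%` map-class forms are not handled.

```python
PREFIX = "@SDT/"


def unmarshal(data, pos=0):
    """Decode one object starting at data[pos]; return (value, next_pos).

    Hypothetical helper: handles only string scalars, maps and lists,
    following the layout seen in the log excerpt.
    """
    if not data.startswith(PREFIX, pos):
        raise ValueError("no @SDT/ marker at offset %d" % pos)
    kind = data[pos + 5]

    if kind == "$" and data[pos + 6] == "S":
        # @SDT/$S:<len>:<text>
        colon = data.index(":", pos + 8)
        length = int(data[pos + 8:colon])
        start = colon + 1
        return data[start:start + length], start + length

    if kind == "{":
        # @SDT/{:<len>: followed by repeated :<keylen>:<key><value>
        colon = data.index(":", pos + 7)
        length = int(data[pos + 7:colon])
        cur = colon + 1
        end = cur + length
        result = {}
        while cur < end:
            key_colon = data.index(":", cur + 1)
            key_len = int(data[cur + 1:key_colon])
            key_start = key_colon + 1
            key = data[key_start:key_start + key_len]
            result[key], cur = unmarshal(data, key_start + key_len)
        return result, end

    if kind == "[":
        # @SDT/[<count>:<len>:<item><item>...
        first = data.index(":", pos + 6)
        count = int(data[pos + 6:first])
        second = data.index(":", first + 1)
        length = int(data[first + 1:second])
        cur = second + 1
        end = cur + length
        items = []
        for _ in range(count):
            item, cur = unmarshal(data, cur)
            items.append(item)
        return items, end

    raise ValueError("unsupported marker at offset %d" % pos)
```

Fed the "User" key entry from the log, `unmarshal("@SDT/{:50::12:display-name@SDT/$S:4:User:3:key@SDT/$S:4:user")` gives back `{'display-name': 'User', 'key': 'user'}` together with the offset just past the entry. A truncated string such as the one pasted above (cut off at "[.... more stuff...]") cannot be decoded to the end, since the declared lengths run past the available text.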
Here's the STAX JVM log when it started:

******************************************************************************
*** 20110801-10:30:37 - Start of Log for JVMName: STAX
*** JVM Executable: /usr/bin/java
*** JVM Options : -Xmx1024m -cp /usr/local/staf/tools/zxJDBC/lib/zxJDBC.jar:/usr/local/staf/lib/JSTAF.jar:/usr/local/staf/samples/demo/STAFDemo.jar:/usr/local/staf/lib/JSTAF.jar:/usr/local/staf/samples/demo/STAFDemo.jar:/usr/local/staf/lib/JSTAF.jar:/usr/local/staf/samples/demo/STAFDemo.jar -XX:MaxPermSize=512m -XX:PermSize=512m
*** JVM Version : java version "1.6.0"
Java(TM) SE Runtime Environment (build pxi3260sr8fp1-20100624_01(SR8 FP1))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux x86-32 jvmxi3260sr8ifx-20100609_59383 (JIT enabled, AOT enabled)
J9VM - 20100609_059383
JIT - r9_20100401_15339ifx2
GC - 20100308_AA)
JCL - 20100624_01
*** JVM PID : 11917
******************************************************************************

I recently added the perm sizes to see if that would help at all, but it didn't appear to.

From: Sharon Lucas/Austin/IBM
To: Paul Ellsworth/San Jose/IBM@IBMUS
Cc: staf-users@lists.sourceforge.net
Date: 08/19/2011 01:15 PM
Subject: Re: [staf-users] Job hangs, cannot terminate

What version of STAX are you running on this machine (STAF local STAX VERSION) and what version of STAF are you running on this machine (STAF local MISC VERSION)?

If you are not running STAX V3.3.8 or later, you could be running into Bug #2832883 that was a race problem where a STAX job could hang if a finally element's task completes very quickly (see https://sourceforge.net/tracker/?func=detail&aid=2832883&group_id=33142&atid=407381 for more info on this bug).

Were there any errors in the STAX JVM Log?

To debug, determine what the last element is on the call stack for the "hung" STAX job by using the STAX LIST JOB <JobID> THREADS and STAX QUERY JOB <JobID> THREAD <ThreadID> commands.
For more information, see section "Debugging" in the STAX User's Guide at http://staf.sourceforge.net/current/STAX/staxug.html#Header_Debugging.

--------------------------------------------------------------
Sharon Lucas
IBM Austin, luc...@us.ibm.com
(512) 286-7313 or Tieline 363-7313

From: Paul Ellsworth/San Jose/IBM@IBMUS
To: staf-users@lists.sourceforge.net,
Date: 08/19/2011 02:36 PM
Subject: [staf-users] Job hangs, cannot terminate

Hello,

Occasionally, our STAX server gets into a position where even if I tell it to terminate a job, the job ends up still running:

|--+----+-----------------+---------------------------------|
|40|Info|20110819-12:01:11|Terminating block: main          |
|--+----+-----------------+---------------------------------|
|39|Info|20110819-12:01:11|Received TERMINATE BLOCK main    |
|  |    |                 |request                          |
|--+----+-----------------+---------------------------------|

Oddly enough, I can print something (running a python command via a thread). There is no process or stafcmd going on, and I'm unable to terminate any of the blocks. The only way I've found to get rid of the job is to restart STAF on the server. This is somewhat inconvenient at times. Usually, when it gets into this state, new jobs also will get hung.

I've checked for JVM memory errors in the STAF JVM logs but I've never seen them, even though the machine gets quite low on free physical memory at times (under 100mb, the machine has 4gb).

Any ideas or ways to debug jobs that are in this state?

Thanks!
Paul

------------------------------------------------------------------------------
Get a FREE DOWNLOAD! and learn more about uberSVN rich system, user administration capabilities and model configuration. Take the hassle out of deploying and managing Subversion and the tools developers use with it.
http://p.sf.net/sfu/wandisco-d2d-2
_______________________________________________
staf-users mailing list
staf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/staf-users
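
The two debugging requests mentioned in this thread (STAX LIST JOB <JobID> THREADS, then STAX QUERY JOB <JobID> THREAD <ThreadID>) can be scripted so they are ready to run the next time a job hangs. This is a minimal Python sketch, assuming the `STAF` command-line client is on the PATH and is invoked as `STAF <machine> <service> <request>`, the same shape as the `STAF local STAX VERSION` example quoted above. The helper names `stax_debug_commands` and `run_commands` are made up for illustration, and the thread ID has to be read from the LIST output first.

```python
import subprocess


def stax_debug_commands(job_id, thread_id=None, machine="local"):
    """Build the argv lists for the two STAX debugging requests.

    Hypothetical helper: job_id and thread_id are whatever the hung job
    reports; nothing here talks to STAF yet.
    """
    commands = [
        ["STAF", machine, "STAX", "LIST", "JOB", str(job_id), "THREADS"],
    ]
    if thread_id is not None:
        commands.append(
            ["STAF", machine, "STAX", "QUERY", "JOB", str(job_id),
             "THREAD", str(thread_id)]
        )
    return commands


def run_commands(commands):
    """Run each command and print whatever the STAF client returns."""
    for argv in commands:
        result = subprocess.run(argv, capture_output=True, text=True)
        print(" ".join(argv))
        print(result.stdout or result.stderr)
```

A typical session would first call `run_commands(stax_debug_commands(job_id))` to list the threads of the hung job, pick a thread ID from that output, and then call it again with `thread_id` set to see the call stack for that thread.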