Hi Paul,

As you read in the STAX UG, when a timer expires, it attempts to stop any 
processes contained within the timer element that are still running (this 
is documented in the <timer> section of the STAX User's Guide at 
http://staf.sourceforge.net/current/STAX/staxug.html#Header_Timer).  To 
stop a process, STAX submits a STOP request to the PROCESS service.  Note 
that if the PROCESS STOP request is not able to stop the process using the 
specified stop method (default is SIGKILLALL), then the underlying process 
won't be terminated.  However, even if STAF cannot actually stop the 
underlying process on the machine, the <process> element itself will not 
"hang" just because the PROCESS STOP request didn't work.  It should 
continue on.

A STAX <process> element usually "hangs" because the STAX job did not 
receive the STAF/Process/End message on its handle's queue.  The STAF/STAX 
FAQ at http://staf.sourceforge.net/current/STAFFAQ.htm#d0e2072 discusses 
this problem in sections "4.1.2. Why is STAX still showing a process as 
running, even though it has completed?" and "3.1.4 Why can't my STAF 
machines communicate?".  Follow the instructions in section 3.1.4 to see 
if the machine where the process is running can successfully submit STAF 
requests to the STAX service machine using the host name of the STAX 
service machine.

To send a process completion message to the STAX service machine, the 
process machine submits a QUEUE request to the STAF QUEUE service, which 
sends a message of type STAF/Process/End to the STAX job handle's queue.  
This message is sent both when a process completes normally and when a 
process is stopped by a PROCESS STOP request.  If this QUEUE request fails 
(e.g. with RC 16), then the STAX service never receives the message that 
the process is no longer running.  So, if the process machine cannot 
communicate via STAF with the STAX service machine, then you need to fix 
that problem.
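
As a rough illustration of that connectivity check (a sketch, not taken 
from the FAQ), the Python below pings the STAX service machine from the 
process machine by host name.  It assumes the STAF command-line executable 
is on the PATH as "STAF" and that its exit status is the STAF return code.  
The host name "staxhost" and the helper names are made up for the example.

```python
import subprocess

RC_OK = 0  # STAF return code for a successful request


def build_ping_command(stax_host):
    """Argument list for a PING request to the STAX service machine."""
    return ["STAF", stax_host, "PING", "PING"]


def can_reach(stax_host, run=subprocess.run):
    """Return True if the ping succeeds.

    Assumption: the exit status of the STAF executable equals the STAF RC,
    so a failure such as RC 16 shows up as a nonzero exit status.
    """
    result = run(build_ping_command(stax_host), capture_output=True, text=True)
    return result.returncode == RC_OK


if __name__ == "__main__":
    # "staxhost" is a placeholder; use the STAX service machine's host name.
    print(can_reach("staxhost"))
```

Run it on the machine where the process executes, since that is the 
machine that has to deliver the STAF/Process/End message.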

To debug this further, you can turn on STAF tracing on the machine where 
the process is running, for tracepoints "RemoteRequests ServiceRequest 
ServiceResult" and for the QUEUE service.  The trace then shows the 
submission of the STAF/Process/End message to the STAX job handle's queue 
and whether this QUEUE request succeeded or failed (e.g. with RC 16 if the 
process machine cannot communicate with the STAX service machine using the 
STAX service machine's host name).
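
A small Python sketch that prints the command lines for turning on that 
tracing is below.  The ENABLE TRACEPOINTS and ENABLE SERVICES request 
syntax is written from memory of the TRACE service, so treat it as an 
assumption and check it against the STAF User's Guide.  The helper names 
are hypothetical.

```python
# Tracepoints and service named in the message above.
TRACEPOINTS = ("RemoteRequests", "ServiceRequest", "ServiceResult")
SERVICES = ("QUEUE",)


def trace_enable_requests(tracepoints=TRACEPOINTS, services=SERVICES):
    """Build TRACE service requests (assumed syntax, verify before use)."""
    return [
        'TRACE ENABLE TRACEPOINTS "%s"' % " ".join(tracepoints),
        'TRACE ENABLE SERVICES "%s"' % " ".join(services),
    ]


def as_command_lines(machine="local"):
    """Prefix each request with the STAF executable and target machine."""
    return ["STAF %s %s" % (machine, req) for req in trace_enable_requests()]


if __name__ == "__main__":
    # Print the commands to run on the machine where the process executes.
    for line in as_command_lines():
        print(line)
```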

--------------------------------------------------------------
Sharon Lucas
IBM Austin,   luc...@us.ibm.com
(512) 286-7313 or Tieline 363-7313




From:   Paul Ellsworth/San Jose/IBM@IBMUS
To:     staf-users@lists.sourceforge.net
Date:   02/28/2011 01:30 PM
Subject:        Re: [staf-users] hung processes and timers



I thought I had already read all that the STAX UG said about this, but 
apparently not. It looks like when a timer pops it's supposed to kill any 
processes it envelops ...

So perhaps the problem I am having is the nature of the "error." The 
process actually does not exist on the machine anymore, but STAF/STAX is 
still waiting for it. 



From:   Paul Ellsworth/San Jose/IBM@IBMUS
To:     staf-users@lists.sourceforge.net
Date:   02/28/2011 11:24 AM
Subject:        [staf-users] hung processes and timers



We use STAF/STAX for testing, so this particular problem is in a STAX test 
"server" -> STAF "client" situation.

Occasionally, I get a hung process... what seems to happen is that the 
command returns, but for whatever reason, the STAX machine doesn't get the 
return. Normally, this wouldn't be too difficult; just put a timer on the 
process (for example, we install our product on AIX using installp and it 
shouldn't take more than a minute or two at most).

However, it seems that timers do not ... interrupt, I guess, a process? 
Quick code example (on-the-fly, not copy/paste from actual XML):

<timer duration="'1m'">
  <sequence>
    <process name="'cat /etc/filesystems'">
      <location>machineIP</location>
      <command mode="'shell'">"cat /etc/filesystems"</command>
      <stderr mode="'stdout'"/>
      <returnstdout/>
    </process>
    <log level="'trace'">"Got /etc/filesystems:\n%s" % STAXResult[0][1]</log>
  </sequence>
</timer>
<if expr="RC != 0">
  <log level="'error'">"ERROR: trying to cat /etc/filesystems took longer than 1m!"</log>
</if>

I guess my question is: is it true that a timer cannot interrupt a "hung" 
process and force STAX to move on, or am I doing something wrong :)

And if it is true, is that even possible, or should I not open a feature 
request?

Thanks!
Paul E.
------------------------------------------------------------------------------
Free Software Download: Index, Search & Analyze Logs and other IT data in 
Real-Time with Splunk. Collect, index and harness all the fast moving IT 
data 
generated by your applications, servers and devices whether physical, 
virtual
or in the cloud. Deliver compliance at lower cost and gain new business 
insights. http://p.sf.net/sfu/splunk-dev2dev 
_______________________________________________
staf-users mailing list
staf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/staf-users


