Note that when you used STAF, you didn't verify that you received a 
notification that the process completed. It's not the STAF PROCESS START 
request (with no WAIT) that is hanging. It's the notification that the 
process completed that isn't working, so STAX never gets notified that 
the process completed (and the STAX job hangs).  Also, you submitted the 
request using the STAX machine's IP address, whereas STAF submits the QUEUE 
request using the STAX machine's host name (not its IP address). The 
problem is that when the process completes, the machine where the process 
is running tries to send a STAF/Process/End message to the STAX 
machine, and to do this it uses the host name of the STAX machine.  It's 
this QUEUE request, submitted by the machine where the process is 
running to the STAX machine, that is failing.

When using a <process> element, STAX starts the process asynchronously 
(using the NOTIFY ONEND option), so when the process completes on the 
remote machine, it submits a QUEUE request to the QUEUE service on the 
STAX machine to send it the STAF/Process/End process completion message. 
Note that it sends this to the hostname of the STAX service machine that was 
provided via the initial process start request. So, it would appear that 
the remote process machine cannot submit STAF requests to the STAX service 
machine via its hostname (and note that on Windows, if it can't find its 
TCP/IP hostname via DNS, it defaults to its NetBIOS name). This indicates 
a possible TCP/IP DNS configuration problem, which is why I pointed you to 
section "4.1.2 Why is STAX still showing a process as running, even though 
it has completed?" (and the section it points to) in the STAF FAQ.
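
To make the failure mode concrete, here is a toy model in Python. It is 
not STAF code and does not use any STAF API. The function names, host 
names, and addresses are all made up for illustration. It only shows that 
the completion message depends on the process machine being able to 
resolve the host name it was given for the STAX machine.

```python
from typing import Dict, Optional


def resolve(host_name: str, dns: Dict[str, str]) -> Optional[str]:
    """Hypothetical stand-in for a name lookup done on the process machine."""
    return dns.get(host_name)


def process_end_outcome(origin_host_name: str, dns: Dict[str, str]) -> str:
    """Model what happens to the completion message when the process ends."""
    address = resolve(origin_host_name, dns)
    if address is None:
        # The message cannot be delivered, so the originating machine is
        # never told that the process ended and keeps waiting.
        return "undelivered"
    return "queued to " + address


# Name table as seen from the process machine (made-up names and addresses).
process_machine_dns = {"stax1.example.com": "192.0.2.10"}

print(process_end_outcome("stax1.example.com", process_machine_dns))
print(process_end_outcome("stax1", process_machine_dns))
```

In this model the first call prints "queued to 192.0.2.10" and the second 
prints "undelivered", which corresponds to a STAX job that keeps showing 
the process as running.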

What is the result of submitting the following requests from the process 
machine to the STAX machine and from the STAX machine to the process 
machine?

  STAF <machine> MISC WHOAMI

Does the "Logical ID" in the output from the MISC WHOAMI requests contain 
a fully qualified domain name (e.g.  server1.company.com)?  If not, then 
this is probably why the QUEUE request is failing and why you need to fix 
the TCP/IP DNS configuration as described in the STAF FAQ.
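
If you want to check a name outside of STAF as well, a minimal Python 
sketch along these lines may help. It uses only the standard socket 
module. The helper names are my own, and the "looks fully qualified" test 
is just a simple heuristic (at least two dot-separated labels and not a 
dotted IP address), not the rule STAF itself applies.

```python
import socket


def looks_fully_qualified(name: str) -> bool:
    """Heuristic: at least two non-empty labels and not a dotted IP address."""
    labels = name.rstrip(".").split(".")
    if len(labels) < 2 or not all(labels):
        return False
    # A dotted quad such as 10.15.29.89 is an address, not a host name.
    if all(label.isdigit() for label in labels):
        return False
    return True


def can_resolve(name: str) -> bool:
    """Return True if this machine can resolve the name to an address."""
    try:
        socket.gethostbyname(name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    local_name = socket.getfqdn()
    print(local_name, looks_fully_qualified(local_name))
```

Run it on each machine, and call can_resolve() with the other machine's 
host name to see whether that name resolves from there.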

Also, did you increase the CONNECTTIMEOUT in the STAF.cfg file on all the 
machines involved (the process machine as well as the STAX machine)?  If 
not, be sure to do that.

You can enable STAF tracing on the machine where the process is being run 
(not on the STAX machine) to get more information on why the QUEUE request 
to send the STAF/Process/End message to the STAX machine is failing.  To 
do this, you could enable the ServiceRequest, ServiceError, and 
RemoteRequests tracepoints for the QUEUE service and then run the PROCESS 
START request and redirect the STAF trace output to a file.

STAF <machine> TRACE ENABLE TRACEPOINTS "ServiceRequest ServiceError RemoteRequests"
STAF <machine> TRACE DISABLE ALL SERVICES
STAF <machine> TRACE ENABLE SERVICE QUEUE
STAF <machine> TRACE SET DESTINATION TO FILE C:/temp/STAFProc.trc

See the STAF User's Guide for more information on using the TRACE service.

Recreate the problem and check the STAFProc.trc file to see why the QUEUE 
request to send the STAF/Process/End message to the STAX machine is 
failing.

--------------------------------------------------------------
Sharon Lucas
IBM Austin,   luc...@us.ibm.com
(512) 286-7313 or Tieline 363-7313




From:   "William.Bai" <william....@tekelec.com>
To:     Sharon Lucas/Austin/IBM@IBMUS, 
Cc:     "staf-users@lists.sourceforge.net" 
<staf-users@lists.sourceforge.net>, "Liu,       Catherine" 
<catherine....@tekelec.com>
Date:   09/22/2011 08:43 PM
Subject:        Re: [staf-users] STAX controlled process seems almost 
hangs up



Hi Sharon:

         Thank you very much for your help. But I am afraid I did not 
explain my situation clearly. In fact, when I use STAF, I can 
communicate both from China to the US and from the US to China. When I 
use STAX, I cannot communicate in either direction. As you 
can see, STAF works normally:
/home/u../wbai/wirelineloadtest :> staf 10.15.29.89 ping ping
Response
--------
PONG
/home/u../wbai/wirelineloadtest :> staf 10.15.29.89 process start shell command "cd /tmp/;unzip createenv.zip"
Response
--------
11

         But when I use STAX, the process just hangs. In fact, I have 
already set CONNECTTIMEOUT (if I do not set CONNECTTIMEOUT to 50000, even 
STAF does not work). My STAF.cfg is as follows. Do I need to set anything else?
# Turn on tracing of internal errors and deprecated options
trace enable tracepoints "error deprecated"

# Enable TCP/IP connections
interface tcp library STAFTCP option CONNECTTIMEOUT=50000


# Default Service Loader Service
serviceloader library STAFDSLS


# Set default local trust
trust machine local://local level 5
trust default level 5

# Add default service loader
serviceloader library STAFDSLS

# Add service
SERVICE Cron LIBRARY JSTAF EXECUTE \
    {STAF/Config/STAFRoot}/services/cron/STAFCron.jar

SERVICE STAX LIBRARY JSTAF EXECUTE \
    {STAF/Config/STAFRoot}/services/stax/STAX.jar

SERVICE Http LIBRARY JSTAF EXECUTE \
    {STAF/Config/STAFRoot}/services/http/STAFHTTP.jar

SERVICE EVENT LIBRARY JSTAF EXECUTE \
    {STAF/Config/STAFRoot}/services/stax/STAFEvent.jar
SET MAXQUEUESIZE 10000

BRs
William


 
On 09/22/2011 11:24 PM, Sharon Lucas wrote: 
My guess is that your machine in China cannot communicate via STAF to the 
US machine, possibly due to a firewall issue or something else.  STAF 
communication in the other direction (US machine to China machine) works. 
Did you try submitting a STAF <US Machine> PING PING request from the 
China machine?  Does that fail with an RC 16? 

You said using STAF on the China machine to submit a PROCESS START 
request worked.  I'm guessing you used the WAIT option on your STAF <US 
Machine> PROCESS START request submitted from the China machine.  A STAX 
<process> submits a STAF <US Machine> PROCESS START COMMAND <command> 
NOTIFY ONEND request.  It does not specify the WAIT option.  This means 
that when the process has completed on the US Machine, it sends a 
STAF/Process/End message to the China STAX machine (via its host 
name) by submitting a QUEUE request to the QUEUE service on the China 
machine.  I'm guessing it's that request that is failing (possibly due to a 
firewall issue, host name DNS issue, trust issue, etc., or you may need to 
increase the CONNECTTIMEOUT value). 

Read sections "3.1.3 Explain RC 16 when attempting to send a STAF request 
to a remote machine" and  "3.1.4 Why can't my STAF machines communicate?" 
in the STAF FAQ at 
http://staf.sourceforge.net/current/STAFFAQ.htm#STAF%20machines%20can%27t%20communicate%20due%20to%20DNS%20issues
 
for more information on how to possibly resolve the issue.

--------------------------------------------------------------
Sharon Lucas
IBM Austin,   luc...@us.ibm.com
(512) 286-7313 or Tieline 363-7313




From:        "William.Bai" <william....@tekelec.com> 
To:        "staf-users@lists.sourceforge.net" 
<staf-users@lists.sourceforge.net>, 
Date:        09/22/2011 04:13 AM 
Subject:        [staf-users] STAX controlled process seems almost hangs up 




Hi:

         When I use STAX (3.4.3) to have a remote host execute an unzip
program ("unzip createnv.zip -d /tmp"), I see the following problem:
the program seems to just hang. My control host (on which STAX is
installed) is in China, and the controlled host is in the US.
         When I put both the control host and the controlled host in
China, it works well. When I put both hosts in the US, it also works
well. Only when I use the machine in China to control the host in the
US does the process hang. What's more, if I do not use STAX and just use
STAF on the China machine to control the remote host in the US, it ends
normally in about 15 seconds.

         The bad network between China and the US might be one reason for
this problem. But why does STAF work while STAX does not? Does STAX
involve more communication between the control machine and the
controlled machine than STAF does? Do you have any ideas on how I could
avoid this? Thank you.

         The STAX function I use to execute the command is as follows:
   <function name="startcommand">

       <function-prolog>
           This function is to start jcmts
       </function-prolog>
       <function-map-args>
             <function-required-arg name="command">
               the command to be executed
           </function-required-arg>
           <function-required-arg name="parameters">
               the command parameters
           </function-required-arg>
           <function-optional-arg name="machine" default="'local'">
               the name of machine where the test process should run
           </function-optional-arg>
           <function-optional-arg name="processName" default="'A Process'">
               The name of the process.
           </function-optional-arg>
       </function-map-args>
       <sequence>
           <process name="processName">
               <location>machine</location>
               <command>command</command>
               <parms>parameters</parms>
               <env>'LD_LIBRARY_PATH=/usr/local/lib:/opt/seagull/build-1.8.1'</env>
               <stderr mode="'stdout'"/>
               <returnstdout/>
           </process>



BRs
William 

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2dcopy1
_______________________________________________
staf-users mailing list
staf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/staf-users

