Thanks, Lucas. The bug ID is 2978990.

In my example I am just using "ping localhost" to simulate a long-running script; ping localhost never ends unless the process is killed. It is hard to estimate how long our test script will run, so I cannot set a timeout on the command. That is why I am asking for help here :) Much appreciated.
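For a script whose run time cannot be estimated, one possible workaround is to leave the PROCESS START request without a timeout and run a separate watchdog that checks the target with the STAF PING service. The following is only a rough sketch under assumptions, not something taken from the STAF documentation: the helper names staf_ping_ok and run_with_watchdog are hypothetical, the STAF executable is assumed to be on the PATH, and the STAF command-line client is assumed to exit with a non-zero status when a request fails.

```python
import subprocess


def staf_ping_ok(machine, run=subprocess.run):
    """Return True if `STAF <machine> PING PING` succeeds.

    Assumption: the STAF client exits with status 0 only on success.
    """
    try:
        result = run(["STAF", machine, "PING", "PING"],
                     capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # STAF not installed locally, or the ping itself hung.
        return False
    return result.returncode == 0


def run_with_watchdog(request_cmd, machine, poll_seconds=10.0,
                      max_failures=2, ping=staf_ping_ok):
    """Run a long STAF request, abandoning it if the target stops answering.

    Returns (exit_code, stdout, stderr). exit_code is None when the
    request was killed because `max_failures` consecutive pings failed.
    """
    proc = subprocess.Popen(request_cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, text=True)
    failures = 0
    while True:
        try:
            # Wait up to poll_seconds for the request to finish.
            out, err = proc.communicate(timeout=poll_seconds)
            return proc.returncode, out, err
        except subprocess.TimeoutExpired:
            pass  # still running; check whether the target is alive
        failures = 0 if ping(machine) else failures + 1
        if failures >= max_failures:
            proc.kill()
            out, err = proc.communicate()
            return None, out, err


# Hypothetical usage (machine name and command are placeholders):
# rc, out, err = run_with_watchdog(
#     ["STAF", "targetmachine", "PROCESS", "START", "SHELL", "COMMAND",
#      "ping localhost", "WAIT", "RETURNSTDERR", "RETURNSTDOUT"],
#     "targetmachine")
```

Note that a watchdog like this can miss a reboot if the target goes down and comes back up between two polls, so poll_seconds would need to be shorter than the machine's reboot time.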
----- Original Message -----
From: "Sharon Lucas" <luc...@us.ibm.com>
To: "Ren Yang" <ry...@redhat.com>
Cc: staf-users@lists.sourceforge.net
Sent: Monday, March 29, 2010 11:20:42 PM GMT +08:00 Beijing / Chongqing / Hong Kong / Urumqi
Subject: Re: [staf-users] STAF did not return if reboot the other machine.

Open a bug and we'll get this problem fixed. It has to do with the target machine never sending a message that the process completed or terminated, because STAFProc was never shut down properly before the reboot occurred.

But there are other ways to work around this issue. For example, instead of using the PROCESS service to run a system ping command to ping the target machine, use the STAF PING service as follows to ping the target machine:

    STAF targetMachine PING PING

This STAF PING request should fail if the target machine is rebooted before the ping request completes.

Or, if you really need to use a PROCESS START request to ping the machine for some reason, specify a maximum wait time (e.g. WAIT 1m) so that the PROCESS START request will time out if the command does not complete within that period. The following request uses 45 seconds (or whatever time period you want to specify):

    STAF targetMachine PROCESS START SHELL COMMAND "ping localhost" WAIT 45s RETURNSTDERR RETURNSTDOUT

--------------------------------------------------------------
Sharon Lucas
IBM Austin, luc...@us.ibm.com
(512) 286-7313 or Tieline 363-7313

Ren Yang <ry...@redhat.com>
03/29/2010 04:59 AM
To: staf-users@lists.sourceforge.net
cc:
Subject: [staf-users] STAF did not return if reboot the other machine.

Hi all,

I have run into a problem when using STAF on Linux. I use STAF to run some long-running scripts, but if I reboot the test machine while a script is running, STAF hangs and never returns. The STAF version is 3.4.0.

For example:
1. #STAF targetmachine process start shell command ping localhost wait returnstderr returnstdout
2. Then I reboot targetmachine. The STAF command hangs and never returns.
I think it should end and return something like this:

    Error submitting request, RC: 22
    Additional info
    ---------------
    STAFConnectionReadUInt: Error reading from socket: other side closed socket: 22

This message is shown if I kill the STAF process on that machine. I think rebooting the machine should also kill the STAF process.

Thanks,
Yang

_______________________________________________
staf-users mailing list
staf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/staf-users