When you say that "staf appeared to just quit", are you saying that 
STAFProc was terminated on machine afw2008r2a?  If so, you should check 
STAFProc's output to see if there are any error messages about why it died. 
You can redirect STAFProc's output to a file when you start STAFProc on 
afw2008r2a to make sure STAFProc's output is saved.  For example: 

   STAFProc > C:\staf\stafproc.out

Let me know what STAFProc's output contains after it has been killed.
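
If it would help to know when each message was written, one option is a small 
wrapper that starts STAFProc and timestamps every line of its output.  The 
following is only a sketch, not part of STAF: the function name and the log 
path are made up for illustration, and it assumes Python 3.7 or later is 
installed on afw2008r2a and that STAFProc is on the PATH.  It also captures 
stderr, which the plain redirect above does not.

```python
import subprocess
from datetime import datetime


def run_with_timestamped_log(command, log_path):
    """Run `command`, copying its stdout and stderr to `log_path`.

    Every line is prefixed with a timestamp. Returns the exit code.
    (Hypothetical helper, not part of STAF.)
    """
    with open(log_path, "a", encoding="utf-8", errors="replace") as log:
        proc = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # error messages go to the same log
            text=True,
            errors="replace",
        )
        for line in proc.stdout:
            stamp = datetime.now().strftime("%Y%m%d-%H:%M:%S")
            log.write(f"{stamp} {line.rstrip()}\n")
            log.flush()  # keep the file current if the process dies abruptly
        exit_code = proc.wait()
        stamp = datetime.now().strftime("%Y%m%d-%H:%M:%S")
        log.write(f"{stamp} process exited with code {exit_code}\n")
    return exit_code


# Assumed usage on the Windows machine (path is an example only):
#   run_with_timestamped_log(["STAFProc"], r"C:\staf\stafproc.out")
```

The final "process exited" line gives the time STAFProc stopped, which can be 
compared with the timestamps in the test log on the controller.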

--------------------------------------------------------------
Sharon Lucas
IBM Austin,   luc...@us.ibm.com
(512) 286-7313 or Tieline 363-7313




From:   Richard Pitkin/Westford/i...@lotus
To:     staf-users@lists.sourceforge.net
Date:   12/09/2010 09:15 AM
Subject:        [staf-users] How to collect data on failure



Hi, 
We have a system that uses staf to run client/server testing.  We had a 
failure where during a multi-hour run, staf appeared to just quit.  In 
particular we received the following output: 

13393   Error   20101209-02:13:26   afclient34....runTestSuite 
getOtherOSType: Failure looking for OSName using the 'STAF VAR' service on 
target machine 'afw2008r2a...'! RC=16 
STAFResult=STAFConnectionProviderConnect: Error performing test read on 
connected endpoint: recv() RC=111: 22, Endpoint: tcp://afw2008r2a...   
13394   Fail    20101209-02:13:26   afclient34....runTestSuite 
getOtherOSType: Terminating function. Error: Failure looking for OSName 
using the 'STAF VAR' service on target machine 'afw2008r2a....'! RC=16 
STAFResult=STAFConnectionProviderConnect: Error performing test read on 
connected endpoint: recv() RC=111: 22, Endpoint: tcp://afw2008r2a....   
13395   Error   20101209-02:13:26   killDomino: BTN3: 
Server1(afw2008r2a...): Terminating function - Error: Failure looking for 
OSName using the 'STAF VAR' service on target machine 'afw2008r2a....'! 
RC=16 STAFResult=STAFConnectionProviderConnect: Error performing test read 
on connected endpoint: recv() RC=111: 22, Endpoint: tcp://afw2008r2a.... 

The job started at 20101208-11:58:06 so we know that this worked for quite 
a while. 

The failing system is W2008r2  and the controller machine of the client 
and servers is Linux. 
The W2008r2 has version: 
staf afw2008r2a misc version 
Response 
-------- 
3.3.4 

The controller is: 
 staf product-la misc version 
Response 
-------- 
3.3.4.1

I checked the staf directories for information to see if there were any 
dumps of the jvm but there did not seem to be anything. 

Suggestions on what logging to enable to try and find out what the root 
issue is that we need to correct would be helpful.  We often have jobs 
that last 24 hours, and this has happened before, so I am looking for a 
process that will be able to gather relevant information to resolve this. 
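
As one possible shape for such a process, here is a rough sketch of a 
watchdog that could run on the controller and record when the target machine 
stops answering.  It relies on the STAF PING service (`staf <machine> ping 
ping`) and assumes the `staf` command exits with a non-zero code when the 
request fails.  The function names, the log file and the polling interval are 
invented for illustration.

```python
import subprocess
import time
from datetime import datetime


def staf_ping(machine, timeout=60):
    """Return (ok, detail) for `staf <machine> ping ping`.

    Assumes the staf executable is on the PATH and exits non-zero on failure.
    """
    try:
        result = subprocess.run(
            ["staf", machine, "ping", "ping"],
            capture_output=True, text=True, timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, str(exc)
    return result.returncode == 0, result.stdout + result.stderr


def watch(machine, log_path, interval=60, max_checks=None,
          ping=staf_ping, sleep=time.sleep):
    """Ping `machine` repeatedly and log every UP/DOWN state change.

    Returns the number of failed checks. `max_checks=None` runs forever.
    """
    failures = 0
    checks = 0
    last_ok = None
    with open(log_path, "a", encoding="utf-8") as log:
        while max_checks is None or checks < max_checks:
            ok, detail = ping(machine)
            checks += 1
            if not ok:
                failures += 1
            if ok != last_ok:  # only record transitions, not every poll
                stamp = datetime.now().strftime("%Y%m%d-%H:%M:%S")
                state = "UP" if ok else "DOWN"
                summary = " ".join(detail.split())  # flatten multi-line output
                log.write(f"{stamp} {machine} {state} {summary}\n")
                log.flush()
                last_ok = ok
            if max_checks is None or checks < max_checks:
                sleep(interval)
    return failures


# Assumed usage on the controller (log file name is an example only):
#   watch("afw2008r2a", "staf_watchdog.log", interval=60)
```

The first DOWN entry narrows the time window in which STAFProc went away, 
which makes it easier to match against other logs on the failing machine.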

Thanks in advance, 


"For a successful technology, reality must take precedence over public 
relations, for nature cannot be fooled."  R. P. Feynman, "Report of the 
Presidential Commission on the Space Shuttle Challenger 
Accident"
_______________________________________________
staf-users mailing list
staf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/staf-users