Bodo,

Were there any errors in the STAX JVM log?
What version of STAX and what version of STAF are you running?

You're doing several things in your <finally> element that we don't 
recommend and that can prevent you from stopping your STAX job. 

1) First, in the STAX User's Guide (in the documentation for the <finally> 
element) it says:

"Note that if you want to have a guaranteed way to stop a finally task, 
you should have the first element contained in your finally task be a 
block or timer element. For example, if you submit a request to terminate 
the job, it will not terminate the job until the finally task(s) complete. 
But if you submit a request to terminate a block that is currently running 
which is contained within a finally task, then the block will be 
terminated (it will not wait until that finally task completes)."

So, you should change your <finally> element to contain a <block> as its 
first task (and possibly a <timer> element if you know that this process 
should be able to be freed within a specified duration).  For example:

  <finally>
     <block name="'FinallyCleanupBlock'"> 

       <if expr="MyProcessHandle != 0">
         ....
       </if>

    </block>
  </finally>

or, if you know this process should always complete within 30 minutes (or 
whatever value), you can also specify a timer element.

  <finally>
     <block name="'FinallyCleanupBlock'">
       <timer duration="'30m'"> 

         <if expr="MyProcessHandle != 0">
           ....
         </if>

       </timer>
    </block>
  </finally>


2) You should be using a <loop> element instead of a <script> element 
because STAX cannot stop an infinite loop in Python code, but it can stop 
a <loop> element.  Also, your loop is using up a lot of CPU as it only 
waits 0.1 seconds before it repeats itself.  You should have a longer wait 
interval such as 10 seconds or more depending on how long it takes your 
process to complete.  If it takes a long time, then your wait interval 
should be longer.  If you have a long wait interval, then you should use a 
<stafcmd> to submit a DELAY request to the local DELAY service instead of 
using the Python time.sleep(), because STAX cannot interrupt a Python 
time.sleep() if you want to terminate the finally block.
 
3) What does EMACHTools.STAFsubmit2() do?  Is it using a <stafcmd> to 
submit a STAF service request?  A STAX job should only use a <stafcmd> 
element to submit a STAF service request. 
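
To put the wait interval from point 2 in rough numbers, here is a small 
Python sketch (not part of your job; polls_per_hour is a made-up helper 
name, not a STAX or STAF API) that counts how many PROCESS FREE HANDLE 
requests a polling loop submits per hour for a given wait interval:

```python
# Illustrative only: polls_per_hour is a made-up helper, not a STAX or STAF API.
MS_PER_HOUR = 60 * 60 * 1000


def polls_per_hour(interval_ms):
    """Number of polls issued in one hour when waiting interval_ms between polls."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return MS_PER_HOUR // interval_ms


# Compare the 0.1 second wait from the original script with a 10 second wait.
for interval_ms in (100, 10000):
    print("%d ms interval: %d requests per hour"
          % (interval_ms, polls_per_hour(interval_ms)))
```

With a 0.1 second wait that is 36000 requests per hour, versus 360 with a 
10 second wait (ignoring the time each request itself takes).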

So, here's what I think your <finally> block should look like (based on 
what little I know of what you're trying to do).  You should also check 
whether your other <finally> blocks need to be updated for the reasons I 
described above.

   <finally>
     <block name="'FinallyCleanupBlock'">
       <if expr="MyProcessHandle != 0">
         <sequence>

           <log message="1">
             "        Interaction '%s': Signal '%s' sent due to User Abort" % \
             (Interaction['Name'], Interaction['AbortSignal'])
           </log>
           <log message="1">
             "        Waiting for signalled interaction to exit"
           </log>

           <script>done = 0</script>

           <loop while="not done">

             <sequence>

               <stafcmd name="'Free process handle %s' % (MyProcessHandle)">
                 <location>'local'</location>
                 <service>'PROCESS'</service>
                 <request>'FREE HANDLE %s' % (MyProcessHandle)</request>
               </stafcmd>

               <if expr="RC == STAFRC.Ok or RC == STAFRC.HandleDoesNotExist">
                 <script>done = 1</script>
                 <elseif expr="RC == STAFRC.ProcessNotComplete">
                   <stafcmd name="'Delay for 10 seconds while waiting for process to end'">
                     <location>'local'</location>
                     <service>'DELAY'</service> 
                     <request>'DELAY 10s'</request>
                   </stafcmd>
                 </elseif>
                 <else>
                   <script>
                     FatalMsg = 'PROCESS FREE HANDLE %s failed with RC=%s' % (MyProcessHandle, RC)
                     done = 1
                   </script>
                 </else>
               </if>

             </sequence>
           </loop>

           <call function="'EMACH_CheckError'"/>
 
           <log message="1">
             "        Signalled interaction is gone"
           </log>
 
         </sequence>
       </if>
    </block>
  </finally>
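
The decision made on each pass of the <loop> above can also be sketched in 
plain Python, which may help when checking the branches.  This is only an 
illustration: next_action and simulate are hypothetical names, the RC 
values 5 and 12 come from the comments in your script, 0 is STAF's Ok, and 
the list of RCs passed to simulate is invented rather than taken from a 
real run.

```python
# Hypothetical helpers for illustration; not STAX or STAF APIs.
RC_OK = 0                      # STAFRC.Ok
RC_HANDLE_DOES_NOT_EXIST = 5   # STAFRC.HandleDoesNotExist
RC_PROCESS_NOT_COMPLETE = 12   # STAFRC.ProcessNotComplete


def next_action(rc):
    """Map the RC of PROCESS FREE HANDLE to what the loop does next."""
    if rc in (RC_OK, RC_HANDLE_DOES_NOT_EXIST):
        return 'done'    # handle freed or already gone: leave the loop
    if rc == RC_PROCESS_NOT_COMPLETE:
        return 'delay'   # process still running: wait, then retry
    return 'fatal'       # anything else: record an error and leave the loop


def simulate(rcs):
    """Walk a made-up sequence of RCs and return (delays, final_action)."""
    delays = 0
    for rc in rcs:
        action = next_action(rc)
        if action != 'delay':
            return delays, action
        delays += 1
    return delays, 'still waiting'


print(simulate([12, 12, 0]))   # two delays, then the handle is freed
```

Inside a STAX job you would still use the <loop> and <stafcmd> elements 
shown above rather than a Python loop, since STAX cannot stop a loop 
running inside a <script> element.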

I don't know why the <finally> element is the last element shown in the 
call stack.  If the STAX job were stuck in the infinite loop in the 
<script> element, then I would have expected the <script> element to be 
the last element shown in the call stack.  Perhaps there's another problem 
in your STAX job.  But, you should first update your finally element(s) as 
I recommended and see if that resolves the problem.
 
--------------------------------------------------------------
Sharon Lucas
IBM Austin,   luc...@us.ibm.com
(512) 286-7313 or Tieline 363-7313




Strösser, Bodo <bodo.stroes...@ts.fujitsu.com> 
07/30/2009 09:57 AM

To
"staf-users@lists.sourceforge.net" <staf-users@lists.sourceforge.net>
cc

Subject
[staf-users] STAX Job hangs






Hi,
 
Today my STAX job hung. This is the result of a thread query:
 
# staf local stax query job 14 thread 1
Response
--------
{
  Thread ID      : 1
  Parent TID     : <None>
  Start Date-Time: 20090730-16:04:03
  Call Stack     : [
    function: EMACH_main (Line: 804, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 12/12 (Line: 865, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    block: main.Test execution (Line: 1076, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 1/1 (Line: 1077, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    finally (Line: 1154, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    try (Line: 1079, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 10/10 (Line: 1080, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    iterate: 1/1 {'Name': '2.1.1 CSTA_base_all', 'Tes... (Line: 1124, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 1/2 (Line: 1125, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    block: main.Test execution.2:1:1 CSTA_base_all (Line: 1127, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 3/4 (Line: 1128, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    function: EMACH_ProcessTestCases (Line: 1331, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 1/1 (Line: 1338, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    finally (Line: 1493, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    try (Line: 1340, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    iterate: 1/1 {'Name': 'TC_1', 'SUT': 'PINGUIN', '... (Line: 1342, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 1/2 (Line: 1343, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    block: main.Test execution.2:1:1 CSTA_base_all.TC_1 / PINGUIN (Line: 1346, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 1/2 (Line: 1347, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    finally (Line: 1379, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    try (Line: 1348, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 5/5 (Line: 1349, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    function: EMACH_ProcessInteractions (Line: 1575, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    finally (Line: 1620, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    try (Line: 1582, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    iterate: 1/2 {'Type': 'C', 'ExitPass': '0', 'Outp... (Line: 1583, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 1/3 (Line: 1584, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    if: Interaction['Type'] == 'F' (Line: 1586, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    function: EMACH_ProcessCaller (Line: 1781, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    sequence: 1/2 (Line: 1787, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
    finally (Line: 1907, File: /home/EMACH/EMACH-stax.xml, Machine: local://local)
  ]
  Condition Stack: []
}
 
 
The part of the job included in the <finally> starting on line 1907 is:
 
      <finally>
        <if expr="MyProcessHandle != 0">
          <sequence>
 
            <log message="True">
              "        Interaction '%s': Signal '%s' sent due to User Abort" % \
              (Interaction['Name'], Interaction['AbortSignal']) </log>
            <log message="True">
              "        Waiting for signalled interaction to exit"</log>
 
            <script>
              while True :
                msg = EMACHTools.STAFsubmit2('LOCAL', 'PROCESS', \
                        'FREE HANDLE %s' % MyProcessHandle)
                if msg == None :
                  break
                RC = msg.split("RC=")[1]
                RC = int(RC.split(",")[0])
                if RC == 5 : # RC 5 is 'Handle does not exist'
                  break
                if RC != 12 : # RC 12 is 'Process Not Complete'
                  FatalMsg = msg
                  break
                time.sleep(0.1)
            </script>
            <call function="'EMACH_CheckError'"/>
 
            <log message="True">
              "        Signalled interaction is gone"</log>
 
          </sequence>
        </if>
      </finally>
 
The code is there to wait for process termination after the user has 
terminated a block.
I looked for the process the loop waits for and found that it had already 
ended with an RC. So I released the handle via staf, but that didn't make 
the job continue.
 
 
# staf local process list
Response
--------
H# Command                       Start Date-Time   End Date-Time     RC
-- ----------------------------- ----------------- ----------------- ----------
38 /home/EMACH/EMACHRemoteHelper 20090727-21:02:12 20090728-13:57:26 129
 
# staf local process free handle 38
Response
--------
 
# staf local process list
Response
--------
 
#
 
 
Thinking about the call stack: why do I see <finally> as the innermost 
element? I would guess the job is looping in <script>, but shouldn't <if> 
be the innermost element then? When I look at the JVM (ps command), I see 
its CPU time increasing as fast as real time.
 
Any help to find the problem is welcome.
 
Best Regards
Bodo
 
 
 
 
 
 
 
_______________________________________________
staf-users mailing list
staf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/staf-users
