Bodo,
When a <try> element with a <finally> element is encountered, the
<finally> element is added to the call stack before the <try> element (as
part of the code that ensures that the finally element is always run),
unlike any other STAX element. So, just because the <finally> element is
on the top of the call stack doesn't mean that the hang necessarily
occurred within the finally. Something strange is happening though where
the finally element is not being removed from the call stack.
It would be helpful if you added some more <log> elements to debug this
problem (you don't have to send these to the STAX Monitor, just log them
in the STAX Job User Log). For example, to know if the <finally> element
started execution, add a <log> as the first task in the finally element so
that even if MyProcessHandle is 0, you'll know if the finally element
started execution. Also, to know if the <finally> element completed, add
a <log> as the last task in the finally element. For example:
  <finally>
    <sequence>
      <log>"Entering Finally block"</log>
      <if expr="MyProcessHandle != 0">
        <sequence>
          <log message="True">
            " Interaction '%s': Signal '%s' sent due to User Abort" % \
            (Interaction['Name'], Interaction['AbortSignal'])
          </log>
          ...
          <log message="True">
            " Signalled interaction is gone"
          </log>
        </sequence>
      </if>
      <log>"Exiting Finally block"</log>
    </sequence>
  </finally>
</try>
Even though the process is no longer running (as you verified via the ps
command), you need to verify if both STAF and STAX have been notified that
the process is no longer running. You can do this as follows:
1) To see if STAF knows that the process is no longer running:
STAF processMachine PROCESS LIST HANDLES LONG
Is the process handle still in the list? If it's not in the list, then
STAF knows the process has completed and its process completion
information has been freed. If the process handle is in the list and its
"End Date-Time" and "Return Code" fields contain a value other than
<None>, then the process has completed but its process completion
information has not yet been freed.
2) To see if STAX knows that the process is no longer running:
STAF staxMachine STAX LIST JOB 15 PROCESSES
Is the process handle in the list? If it's in the list, then STAX has not
been notified (or did not receive the notification) that the process has
completed.
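As a rough way to keep the outcomes of these two checks straight, here is
a small Python sketch. It does not call STAF; the function name and its
three boolean inputs are hypothetical, and you fill them in by reading the
output of the two commands above. The "still considered running" case is
my assumption for a handle that is listed with <None> in both fields.

```python
def diagnose(in_staf_list, staf_shows_end, in_stax_list):
    """Summarize the two checks (hypothetical helper, not a STAF API).

    in_staf_list:   handle appears in the PROCESS LIST HANDLES LONG output
    staf_shows_end: its End Date-Time and Return Code are not <None>
    in_stax_list:   handle appears in the STAX LIST JOB ... PROCESSES output
    """
    if not in_staf_list:
        staf = "STAF: completed, completion information freed"
    elif staf_shows_end:
        staf = "STAF: completed, completion information not yet freed"
    else:
        # Assumption: <None> in both fields means STAF still sees it running.
        staf = "STAF: still considered running"

    if in_stax_list:
        stax = "STAX: not notified that the process completed"
    else:
        stax = "STAX: no longer lists the process"
    return staf, stax


# Example: STAF shows an end time, but STAX still lists the handle.
print(diagnose(True, True, True))
```

The interesting combination for this hang would be the one in the example:
STAF has the completion information, but STAX still lists the process.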
Also, as a side note, why do you have a <try>/<finally> where the <try>
element contains only <nop/>, as shown below? It doesn't really make sense
to do this, as the purpose of the finally element is to ensure that the
finally element's task is executed no matter whether the try task
completes normally or abnormally. Since a <nop/> element does nothing
(i.e. no operation), it can't fail, so wrapping it in a <try> gains
nothing. You should change this as follows:
Change:
<try>
  <nop/>
  <finally>
    <sequence>
      ...
    </sequence>
  </finally>
</try>
to:
<sequence>
  ...
</sequence>
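The same point can be illustrated with Python's own try/finally. This is
only an analogy, not STAX code, and the function names are made up for the
example: when the try body cannot fail, the wrapped and unwrapped versions
do the same work.

```python
def cleanup(log):
    # Stands in for the tasks inside the <finally> element.
    log.append("cleanup ran")


def with_try(log):
    try:
        pass  # analogous to <nop/>: nothing here can fail
    finally:
        cleanup(log)
    return log


def without_try(log):
    # Analogous to the plain <sequence>: just run the tasks.
    cleanup(log)
    return log


print(with_try([]) == without_try([]))
```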
Let me know when you have an easier recreation scenario (e.g. one that I
could run on my STAX machine to recreate the problem and debug it). That's
going to be the most likely way that this problem will be resolved.
--------------------------------------------------------------
Sharon Lucas
IBM Austin, luc...@us.ibm.com
(512) 286-7313 or Tieline 363-7313
Strösser, Bodo <bodo.stroes...@ts.fujitsu.com>
08/04/2009 10:38 AM
To: Sharon Lucas/Austin/i...@ibmus
cc: "'staf-users@lists.sourceforge.net'" <staf-users@lists.sourceforge.net>
Subject: RE: [staf-users] STAX Job hangs
Sharon,
for Job 7 it's exactly what I've sent, the part starting on 20090803 up to
20090804 10:53. At 10:53 the hang was implicitly released by the STAF
shutdown. Those logs of Job 7 show the last started <process> being
completed when the job stopped.
Currently a job is hanging again (10 threads). The logs are appended.
This time, the logs say that the <process> is still running, but that
isn't true. The process is gone (ps command) and is also no longer
displayed by STAXMon (no gearwheel).
STAF local STAX QUERY JOB 15 THREAD 1 says that the job hangs inside of
the <finally> on line 1923, as it did when I mailed the first time. So,
the script simply didn't reach the line where the completion message for
the <process> is logged.
This time I didn't terminate any block, so the <process> in <try> came to
its normal end and the following <script> must have reset MyProcessHandle
to 0. Thus, the <if> on line 1924 must be false and all the content of the
<finally> must be skipped. How can it hang in an empty <finally>?
The only thing that is common to all hangs I've looked into is a <finally>
on top of the stack.
The STAX job still hangs and I can connect jdb to the JVM. If you have
more experience using jdb, maybe you could tell me how to get more info
from it.
Bodo
BTW: I'll try to strip down my script to have an easy way to recreate the
problem. But that might take a lot of time. If there is a chance to catch
the problem using the current script, it would be better for me.
From: Sharon Lucas [mailto:luc...@us.ibm.com]
Sent: Tuesday, August 04, 2009 4:43 PM
To: Strösser, Bodo
Cc: 'staf-users@lists.sourceforge.net'
Subject: Re: [staf-users] STAX Job hangs
Bodo,
What are the contents of the STAX Job Log and the STAX Job User Log when
this job hangs?
--------------------------------------------------------------
Sharon Lucas
IBM Austin, luc...@us.ibm.com
(512) 286-7313 or Tieline 363-7313
[attachment "Job_15_User.log" deleted by Sharon Lucas/Austin/IBM]
[attachment "Job_15.log" deleted by Sharon Lucas/Austin/IBM]