I believe the below "ACHAP07" member of the MXG Source Library
may be the information you requested, as it was presented at
SHARE in the 70s and 80s.
While it claims to be revised, it's still possibly out of date.
Barry
/* Copyright (C) 1984,1994 Merrill Consultants Dallas Texas USA */
Status: Has been completely revised.
Chapter Seven
Events in the Lifetime of an MVS Job/STC/TSO Session.
To perform CPE analysis, the analyst must understand the computer system
under investigation and know what functions the operating system performs
for a task. Although operating systems are different, they all manage the
shared resources in response to requests by tasks for those resources.
Since the operating system is the manager of the many queues for these
requests, a large part of CPE data analysis is to determine who is in what
queue, for how long, and why.
All operating systems provide some accounting data for each task (job,
session, and so forth) that contain time stamps when certain events occur
for that task. By comparing these time stamps to the logical structure of
the services provided by the operating system, the analyst can understand
how an operating system interaction is mapped to the CPE data. By
subtracting adjacent time stamps, the duration in certain states
(allocation, CPU execution, I/O execution, and so forth) can be determined
at the task level. Documentation of CPE data from the vendor often does not
tell enough about the meaning of these events. Sometimes you can read the
microfiche (when the vendor makes the source available) and perhaps decipher
the conditions under which a time stamp is valid, but even then, the
experimental confirmation of suspicion is required.
By executing tasks that you understand (that is, your own specific job) and
by examining the detail data in those detail records written on your tasks,
you can gain the necessary knowledge. For SMF data, the MXG program ANALALL
will print all variables from all records for any job.
The SAS System provides powerful features for manipulating numeric values
that contain durations, time stamps, dates, and datetimestamps.
All of these variables are stored as numeric values and manipulated as any
other numeric value. It is the association of a SAS "format" (by using the
FORMAT statement) that causes the numeric value to be printed as a date,
time of day or duration, or datetimestamps:
Dates are stored as the number of days (plus or minus) from Jan 1, 1960.
MXG always assigns the format DATE7 to date variables, so that the name of
the month is printed, and MXG avoids the ambiguity of the MMDDYY and DDMMYY
formats (03/05/94 means 05MAR94 in the US but 03MAY94 in Europe!)
Eg: DATA; DATE=1; PUT DATE DATE9.; prints as DATE=01JAN1960.
Eg: DATA; DATE=1; PUT DATE DATE7.; prints as DATE=01JAN60.
Times are stored as seconds (and fractions, if any), and are assigned the
TIME12.2 format (for typical SMF time stamps with resolution of tenths and
hundredths of a second).
Eg: DATA; TIME=43200.99; PUT TIME TIME12.2; prints as 12:00:00.99,
(there are 86,400 seconds in a day!). .
Datetimestamps are stored as seconds (plus or minus) from midnight on
January 1, 1960, and are assigned the DATETIME21.2 format (if they are
standard SMF datetimestamp values).
Eg.: DATA; DATETIME=1262217600.99; PUT DATETIME DATETIME21.2;
prints as 31DEC1999:00:00:00.99.
A note on the length of datetime variables. SAS numeric variables
are floating point numbers, and are normally stored in 8 bytes (one
for the exponent, seven for the mantissa).
Prior to Change 19.272 in January, 2002 this was true:
MXG overrides the stored length default and specifies LENGTH as
DEFAULT=&MXGLEN; only a few variables need the full 8 byte length,
and using 4 bytes per variable saves significant DASD space
in your MXG datasets. There is one class of variables, however,
that do require 8 bytes of storage: datetimestamp values, because
four bytes is not large enough to fully resolve the 10-digit value
of current datetimestamps. In fact, if you should store a
datetimestamp in only four bytes, the time values will be as much
as 255 seconds less than reality. Thus MXG assigns a length of 8
bytes for all datetimestamp values. To further save storage, MXG
may keep only the start datetime value and the duration, requiring
only 12 bytes, instead of both timestamps which would require 16
bytes, since the end datetime can always be reconstructed by
addition.
Change 19.272 changed the default stored length from 4 to 5 on EBCDIC
and from 4 to 6 on ASCII SAS systems. See that change text.
Originally, all MXG datetimestamp variables ended with ....TIME (for
example, READTIME), and all MXG duration/time variables ended with ...TM
(for example, CPUTCBTM), but that was back when I created all of names of
all of the variables. Now, I frequently use the vendor's field name as the
MXG variable name (since that eliminates any need for you to cross reference
the vendor's name with its MXG name), so you may not be able to tell what a
variable contains by its name. Since the FORMAT of a variable is displayed
by PROC CONTENTS, and is also contained in ADOC members and in DOCVER, it is
a simple matter to determine whether an MXG variable is a date, a time, or a
datetimestamp value.
Time values usually represent durations, although they are occasionally used
for the time of day when the date is not directly available. Since time
values and datetime values are both stored in units of seconds, they can be
added or subtracted. For example, to find the duration between two datetime
values, subtract them. Similarly, the end time of an event can be found by
adding the datetime value at the start of the event and its duration.
Figure 7.1 maps the time stamps that exist in the SAS MVS data sets from
most of the job-related SMF records. The main function of BUILDPDB, the MXG
module that builds the MVS Performance Data Base (PDB), is to collect these
time stamps from their many event records and place them into a single SAS
observation per job. The number in parentheses under each variable name
identifies the SMF record type in which this variable is found, and you can
read the appropriate ADOC for that record for more detail on the actual
meaning and content of the variables listed here.
Figure 7.1
Schematic
Job Level Datetime Stamps
READTIME JCNVTIME JSTRTIME JENDTIME JPRNTIME JFINTIME JPURTIME
|-----|----|--------|----|----------|---------|--------|--------|
RDRTM CONVERTM | |
JINITIME JTRMTIME
|----------|
/ \
/ \
/ \
/ \
/ \
/ \
/ \
INITTIME ALOCTIME LOADTIME TERMTIME
I--------------|----------|--------------|
| DSENQTM | ALOCTM | EXECTM |
|---------------SELAPSTM-----------------|
The following paragraphs describe the event represented by each variable and
identify the source clock used for this event. In multiple CPU
installations, various events on behalf of a job can occur on different
processors, and the time stamp of that event uses the clock on that
processor. The TYPE26 dataset from JES identifies the system on which these
events occurred (SYSREAD, SYSCVRT, SYSEXEC, SYSOUTP, SYSPUR), and the
following descriptions use the same names to identify the source clock.
READING IN THE TASK
The physical card images (JCL) for the job, TSO session, or started task are
read by JES from a reader (real or internal) and stored on the JES SPOOL
data set. The reader-interpreter can generate JCL errors, for the obvious
kinds of errors (bad JOB card, accounting fields invalid, no step records,
etc.) and can terminate the task with a JCL error at the end of read-in, but
most JCL syntax errors are not caught here.
READTIME Datetimestamp when the read-in began (actually, when the
JOB card was recognized on the host that read in the
job). READTIME is the common time stamp that is stored
in all subsequent SMF records created on behalf of this
task (a task can be read in only once.) READTIME uses the
clock on read-in system, variable SYSREAD, which is the
system to which the RJE or NJE line is connected, or the
system in which the internal reader was invoked. Note
that if this job was initially read in at a different NJE
node, the time zone of READTIME will be the time zone of
that NJE system. From dataset TYPE30_5.
RDRTM Duration it took to read in the task, calculated by
subtracting READTIME from RENDTIME, the reader end event.
RENDTIME is not kept by MXG because it can be constructed
with: RENDTIME=READTIME+RDRTM; RENDTIME is timestamped
at the receiving site. If a job is read in at an NJE
site in a different time zone, the value of RDRTM can be
negative (if the reading site is to your west), or can be
hours larger than true reader duration (if the reading
site is to your east). Although RDRTM is normally quite
short for jobs processed by internal readers, it is the
actual transmission time for the physical card images
during the read-in event. From dataset TYPE30_5.
JES CONVERSION
Now, the JCL card images on the JES SPOOL must be processed by the JES
converter into internal text and control blocks, the JES "ITEXT", that are
then rewritten to the SPOOL data set. JES determines the functions (SCAN,
EXECUTE, PRINT, PUNCH) that are required for this task in the converter.
Most JCL syntax errors are detected here, and the converter flushes the job
at JCNVTIME + CONVERTM. JES3 conversion executes on the global processor.
JES2 conversion executes on whichever system has the checkpoint when the job
finishes read-in.
JCNVTIME datetimestamp when the converter began processing this
task. Because this normally occurs immediately following
the completion read-in, MXG does not create a variable
for the duration from reader-end to converter start.
JCNVTIME uses the clock on the system which converted the
job (SYSCVRT). From dataset TYPE26.
CONVERTM the duration required to convert the job control language
statements. From dataset TYPE26.
JES3-only: main device scheduling (MDS)
For JES3, mountable devices can be allocated first. Then only after all
required devices are available is the task eligible for execution
scheduling. When the installation chooses this option, events in device
scheduling are reported only in the JES3 type 25 record. In my opinion you
should not use MDS. It is an archaic option from the days of MVT that is
guaranteed to require many more tape drives that are actually needed. This
is especially true if you have a 42 step job that requires a tape drive only
in the last step. With JES3 MDS, the tape drive is allocated, the job waits
for an initiator, then begins execution, and only after 41 steps does the
tape drive (which has been allocated to this job, and thus not available to
anyone else) get used! From TYPE25.
FETCTIME datetimestamp when JES3 MDS requested that the tape
volume be fetched from the library.
STRTTIME datetimestamp when operators issued the start set-up that
the needed volumes had been fetched.
MNTTIME datetimestamp when first mount was issued by JES3 MDS.
VERFTIME datetimestamp when first mountable volume was verified
(that is, the mount was completed and the correct volume
serial was verified).
JES execution processor
After the JES conversion (and JES3 MDS, if any), the task is now in the JES
execution queue (unless the task was read into the HOLD queue), and the job
will be considered for execution. The job waits in the execution queue
until it is the highest priority job in a job class for which there is an
idle initiator. Execution begins when JES issues a JOB STARTED message and
passes the job's control blocks from the SPOOL to the idle MVS initiator.
MVS then initiates the job, allocates devices needed by the job, and
executes the steps of the job. When the last step is complete, when a step
abends and terminates the job, or when the job is canceled by an MVS cancel
command, MVS terminates the job and returns the job to JES with a JOB ENDED
message. All time stamps use the clock on the execution CPU (SYSEXEC).
JSTRTIME job start datetimestamp, when JES passed the job to the MVS
initiator. This is printed on the job's SYSMSG log, "JOB
STARTED AT". For jobs which are cancelled and then restarted
(for example, due to Data Set Enque Conflicts), this will be
the time of the last start of execution; varible RESTARTS is
greater than 1 for restarted jobs. From TYPE26J2/TYPE26J3.
JINITIME job initiation datetimestamp on the MVS execution system.
JSTRTIME and JINITIME will normally be close in time, but for
restarted jobs, MXG chooses to keep the time when the first
initiation started. That way, you can measure when you (the
data center) were able to initiate the job. Any delay due to
restart can be blamed on the user instead of the data center,
and thus JINITIME should be used as the time of initiation
for job scheduling purposes. From the first TYPE30_1 for this
job.
JRSTTIME for restarted jobs, the datetimestamp when the operator first
cancelled and requeued the job. The duration between
JINITIME and JRSTTIME is how long it took for the operator to
decide to cancel and restart the job. From the JTRMTIME of
the first TYPE30_5 for this job.
JINLTIME for restarted jobs, job initiation datetimestamp of the last
TYPE30_1 for the job, if the job was restarted. The duration
between the JINITIME and JINLTIME is the delay to the job due
to the restart(s). From last TYPE30_1.
JTRMTIME job termination datetimestamp on the MVS execution processor,
(for the last execution if the job was restarted). From the
last TYPE30_5 for this job.
JENDTIME job end datetimestamp, when JES received the job back from
MVS. JTRMTIME and JENDTIME should be essentially the same.
If a non-restarted job was active when the MVS system
crashed, this time will be missing, since MVS did not
complete execution of the job. For a restarted job that was
active when the MVS system crashed, this will be non-missing,
and will have the time of the last restart, which will be
earlier than the JSTRTIME. MXG identifies obs that were
active when the system crashed, because the TYPE26 timestamps
are stored on JES spool and are not erased when MVS crashes,
and sets ABEND='CRSH';. From TYPE26J2/TYPE26J3.
JELAPSTM duration from JINITIME to JTRMTIME (or JINLTIME to JTRMTIME,
for a restarted job, where JINLTIME is the last initiate of a
restarted job, and JTRMTIME is the termination on z/OS of the
last execution). This is the elapsed time for a job, session
duration for a TSO session, or the "up time" for a started
task.
SELAPSTM sum of the step elapsed duration (INITTIME to TERMTIME) for
all steps, including restarts.
MVS step initiation process
Figure 7.2
Schematic
Step Level Datetime Stamps
Batch Job reading Three Tape Volumes
SELAPSTM
|-----------------------------------------------------------------|
| |
INITTIME ALOCTIME LOADTIME ENDTIME
|---------|---------|---------------------------------------------|
DSENQTM ALOCTM / EXECTM |
/ |
/ |
/ |
/ |
/ |
/ EXECTM |
/ 628.88 |
|--------------|---------------------------------------|
| | |
| ACTDLYTM | ACTIVETM |
| 4.23 | 624.65 |
|--------------|---------------------------------------|
| |
RESDLYTM=0.0 | RESIDTM |
(Because task | 624.65 |
did not swap) |---------------------------------------|
| |
| DSPDLYTM | CPUTM |
Note that CPITCBTM+CPISRBTM | 610.60 | 14.11 |
included in CPUTM but were |--------------------------|------------|
not used in calculation of Tape Mount Pending / |
DSPDLYTM because they occur duration was 248.13 /
outside EXECTM. (measured by ASMTAPES) /
/
|<--------------->|<-------------------------------/---------->|
| | |
| | |
|CPITCBTM CPISRBTM|CPUTCBTM CPUSRBTM CPUHPTTM CPUIIPTM CPURCTTM|
| 0.06 0.00 | 12.36 1.51 0.00 0.17 0.01 |
|--------|--------|--------|--------|--------|--------|--------|
Figure 7.3
Schematic
Step Level Datetime Stamps
A long (full day) TSO session
SELAPSTM
|-----------------------------------------------------------------|
| |
INITTIME ALOCTIME LOADTIME ENDTIME
|---------|---------|---------------------------------------------|
DSENQTM ALOCTM / EXECTM |
/ |
/ |
/ |
/ |
/ |
/ EXECTM |
/ 14:29:37.42 |
|------------------------------------------------------|
| |
| ACTDLYTM ACTIVETM |
| 14:16:57.90 759.52 |
|--------------|---------------------------------------|
| |
| RESDLYTM RESIDTM |
| 04.22 755.30 |
|------------|--------------------------|
Session TGETS=1275 | |
TSO/MON TRANS=1276 |DSPDLYTM CPUTM|
| 654.13 101.17|
|--------------|-----------|
(Additional schematics of TYPE30/STEPS data are in ACHAP30.) (Schematics
of TYPE70 CPU relationships are in ACHAP10.)
A job consists of one or more steps. Job initiation begins at JINITIME when
MVS receives the job from JES, and then MVS initiates the first step at
INITTIME. The first step of every job begins by issuing the data set
enqueue (DSENQ) request for every data set name in all steps of the job.
(JES3 has already issued its enqueue request earlier, before initiation, so
it will always find its data set names available here).
But in JES2, the DSENQ request is made here for the first time.
If the data set name is not available for this job (because this request is
for DISP= OLD, NEW, or MOD, and the data set is being used, or when this
request is for DISP=SHR but some other job/user has the dataset name with
DISP=OLD,NEW, or MOD), this job enters "Data Set ENQUE Delay"
and the operator is notified of the conflict with a console message:
"FOLLOWING DATA SETS RESERVED OR UNAVAILABLE."
If the operator takes no action, the job waits unproductively in execution,
tying up an initiator but using no other resources, until the job or TSO
session using the dataset terminates and frees the dataset.
If the operator cancels the job (actually, the operator must issue three JES
commands; $HJ to hold the job, $EJ to requeue the job, and then $CJ to
cancel the job), MVS initiation is terminated, and both a TYPE30_4 step
record and TYPE30_5 job record are written (both indicate SYSTEM ABEND 222),
and the job is returned to the JES2 execution queue in HOLD status until the
operator releases the job. (More commonly, an automated operator program
detects the delay, automates the cancel, and sets up an ECB (Event Control
Block) against the dataset, and when the ECB is posted, the automated
operator issues the $AJ command to release the job from hold for restart.)
The duration between INITTIME and ALOCTIME is named DSENQTM because that is
the primary contributor to that duration. By examining jobs with large
DSENQTM values, you can identify which jobs are not playing the DISP= game
correctly. You can further examine the TYPE77 dataset to see which jobs
held and which jobs wanted which data sets, and then educate the users to
either use DISP=SHR, or to put the two contending jobs together as separate
steps of one job to avoid the delay
Once the data set names are acquired by DSENQ, step allocation commences at
ALOCTIME. During allocation, all devices required by this step are acquired
one at a time, in the order of the DD statements for the step.
Allocation is the process that connects the DDNAME that your program will
use to the actual physical location of the dataset.
For an existing DASD data set, allocation finds the UCB address of the
volume, and reads the VTOC to find the cylinder, head, and record number of
the start of the data set. If the request is for a new DASD data set, the
allocator selects a volume serial (here is where SMS ACS rules are executed
to choose which VOLSER is selected) and finds free space and creates the
VTOC entry for the new data set. Thus the time to allocate a new DASD data
set is much longer than the time to allocate an existing DASD data set.
If a tape drive is needed and one is available, the allocator assigns a
specific tape drive address to your job. If all tape drives that are
currently online are already in use by other jobs, the job enters
"Allocation Recovery", and the operator is notified by console message:
UNABLE TO ALLOCATE 1 DRIVE. THESE DRIVES ARE OFFLINE: 181, 182
REPLY DEVICE NAME, WAIT, OR CANCEL
If one of the offline drives can be used, the operator can reply with the
address of that drive, and the allocator will vary the drive online and
assign it to the job, but that is not normally the case. Usually, the
drives listed as offline are either broken or do not yet exist, so the
operator can either reply CANCEL (causing the job to terminate with a SYSTEM
ABEND 222, pissing off the user!), or, instead the operator will reply WAIT,
which causes the job to wait until a tape drive becomes available. When the
operator replies WAIT, he is further prompted to reply HOLD or NOHOLD - do
you want the job to hold the allocations that it has already acquired, or
should all of the devices that the job has allocated been freed, putting the
job back at the bottom of the queue.
Most sites reply HOLD. The job then waits until a tape drive is free.
Allocation is complete when all needed devices have been given to the job.
Mounts for the first tape volume on each tape drive are issued at the end of
allocation, but allocation does not wait for tape mount completion.
The delay between ALOCTIME and LOADTIME is called ALOCTM, and it is used to
measure how long jobs, and which jobs, are delayed because they could not
get a tape drive when needed. See the analysis in ACHAPzz.
When allocation for this step is complete, the program to be executed is
loaded into memory by "Program FETCH"; the datetimestamp of program load is
stored in variable LOADTIME. Note that although the job has been in
initiation since JSTRTIME, and this step has been in initiation since
INITTIME, it is only now, after LOADTIME, that your program can execute its
instructions, use CPU time, and do I/O. When this program completes its
work (or ABENDs!), the termination datetimestamp is stored in variable
TERMTIME. If the job has additional steps, they INITTIME, ALOCTIME,
LOADTIME, and TERMTIME individually until all steps have run.
When the final step terminates, control is given back to MVS for job
termination, which is recorded in datetimestamp JTRMTIME. There can be some
duration between the last step's TERMTIME and the job's JTRMTIME for
cataloging data sets as well as for DD consolidation in building the job's
TYPE30_5 from the step records (see the discussion in ACHAP3 under
DDCONS(NO)), but most of the time there is little delay between TERMTIME and
JTRMTIME.
INITTIME datetimestamp of the initiation of a step. From TYPE30_4.
DSENQTM duration from INITTIME to ALOCTIME. Measures any delay for
data set enqueue in the first step, usually very small for
subsequent steps. From TYPE30_4.
ALOCTIME datetimestamp of the beginning of MVS device allocation time.
From TYPE30_4.
ALOCTM duration from ALOCTIME to LOADTIME. Measures length of time
the job step took to allocate, including delay for drives.
LOADTIME datetimestamp of the completion of allocation and beginning
of problem program execution for this step.
TERMTIME datetimestamp of MVS step termination.
EXECTM duration from LOADTIME to TERMTIME, the elapsed duration
during which the step could have executed its program's
instructions. It is during EXECTM that the step consumes
CPU and performs I/O operations and occupies memory.
SELAPSTM duration from INITTIME to TERMTIME, the elapsed duration of
the step. Step elapsed time is the sum of DSENQTM, ALOCTM,
and EXECTM, which is how long the step owned its initiator.
ACTIVETM duration when the SRM selected this task to be active.
ACTDLYTM duration when the task was executing but was not active,
calculated as EXECTM minus ACTIVETM. This duration includes
time the address space swapped out due to Detected Wait or
Long Wait Swaps (which includes think time for TSO users),
and any time spent waiting for mount pending.
RESIDTM duration, measured by the SRM, that the task was in real
memory.
RESDLYTM duration when the task was active but was not resident,
calculated as ACTIVETM minus RESIDTM. This delta is the time
SRM wanted the ASID to be in the Multi-Programming Set, but
was unable to get the ASID resident. This includes all
swapped out time (except DW/LW, which is caught in ACTDLYTM)
plus the time to actually do the swap in. Another name might
be MPL Delay time, because all of the swap out and
non-resident time while active is the time when the Target
MPL was not high enough to include this task. This delta
divided by SWAPS is an estimate of the time-to-swap-in a
task, but since SWAPS counts both RESDLYTM and ACTDLYTM swap
events, and since RESDLYTM includes the time to swap in and
the time waiting to swapin and the time swapped out, the
estimate is usually poor!
DSPDLYTM Duration when the task was resident but not dispatched,
calculated as RESIDTM-CPUTCB+SRB+HPT+RCT+IPT. This duration
is the time the task was resident in memory but was not
executing instructions. It includes the delay for CPU
dispatch (when you enter the MPL set you don't immediately
get CPU - a hot CICS transaction may still have the
processor, so you wait here until dispatched on a CPU), and
the delay for I/O (you execute a few CPU instructions and
then issue an I/O and you wait for IOS to get the data for
you), and the delay due to page faults (the next page is not
in memory and you wait for that page to be brought in). Note
that the two other step CPU times, CPITCBTM and CPISRBTM are
not included in the subtraction, because that times occur
before or after EXECTM.
CPUTM sum of CPUTCBTM,CPUSRBTM,CPUIIPTM,CPUHPTTM,CPURCTTM,
CPITCBTM, and CPISRBTM, the total CPU time consumed by this
task, but note that DSPDLYTM only subtracts five of the seven
CPU timings from RESIDTM.
JES Output Processor
After the job has executed (i.e., the JES execution phase has completed)
there may be JES output processing required, if the task created any SYSOUT
(printing or punching). The JES Output Processor does not print or punch
anything; it is a brief phase from JPRNTIME to JFINTIME, during which JES
builds the print control blocks, called JOEs (job output element). Each JOE
is the collection of all print files with the same destination and form, and
each JOE becomes a separate element in the JES print queue. For each JOE
that is actually printed, there will be a separate type 6 record created,
containing PRINTIME, the datetimestamp when print lines were sent to the
printer device, and PRENTIME, the end of transmission of print lines to the
printer device. The duration from JFINTIME to PRINTIME is PRQUETM, the
print queue time for that print file, while the duration from PRINTIME to
PRENTIME is PRINTTM, the duration when print lines were being sent to the
printer. (If the print device is a Print Workstation (like a Xerox 9700),
which receives and buffers print lines for later printing, PRINTTM will not
represent the actual time for printing). See ACHAPTzz for more details on
measuring printer delays, activity, and utilization.
JPRNTIME datetimestamp of the start of the JES Output Processor.
Uses the clock of whichever JES system executed the
Output Process (SYSOUTP). From TYPE26J2/TYPE26J3.
JFINTIME datetimestamp of the end of the JES Output Processor.
PRINTIME datetimestamp of the start of sending print lines to the
printer device. Uses the clock on the system to which
the print device is connected (for a local printer), or
the system to which the JES RJE line serving this printer
is connected. From TYPE6 dataset.
PRENTIME datetimestamp of the end of sending print lines to the
printer device. For 3800-class printers, PRENTIME is not
the end of printing, but instead it is the time when the
last page of the job has been physically moved to the
stacker page.
JES purge processor
When JES has completed all printing or punching for this task (or if the
task is purged with the $PURGE command), all traces of this job are removed
from the SPOOL and the JES job number is available for assignment to a new
job. JES purge processor uses the clock on the CPU that purges the job.
JPURTIME datetimestamp of the purge event.
The preceding variables provide an awesome opportunity for analysis.
Understanding the sequence of events and the points in that sequence that
are available in the operating system's task accounting records
significantly eases the CPE analyst's task. However, it is necessary to
understand the limitations and potential errors that can occur in the time
stamps and durations. There are numerous exposures in these
software-captured values from the CPU clock(s).
The multiple system environment probably causes the most confusion. A
multiple operating system environment is often referred to as loosely
coupled to indicate that processors are coupled by the JES SPOOL rather than
tightly coupled or multiple engine processors that are coupled by sharing
one operating system and one CSTOR/ESTORE memory. In such an environment,
it is possible for a job to be read in on one system, executed on another,
printed on a third, and so forth. Only with IBM's SYSPLEX timer can these
multiple systems' clocks be synchronized.
Otherwise, each system's clock is set by human operators comparing
wristwatches and pressing the SET CLOCK switch (or its equivalent) on the
hardware. If clocks for the two systems are not synchronized, jobs can
start before they were read in, execute in negative time, and so forth.
Where there is no hardware solution to synchronization, the operators must
understand the importance of accuracy when setting the system clock. Some
sites have even implemented these circumventions:
modify the JES START command (wrap it with a program) that compares
the checkpoint timestamp with the oncoming system's clock when a new
processor is brought into the complex. Do not allow a processor to
enter the complex if its clock is not within some predetermined
value, usually 15 to 30 seconds.
require operations supervisors to issue DT (Display Time) commands on
each system and to record the delta time between the clocks as a part
of the shift turnover procedures, and create competition between
crews to see how closely they can set the clock.
if an operator bonus program based on performance exists, negate the
bonus if a crew fails to synchronize the clocks adequately
Restarted jobs in a JES2 environment also cause confusion. For example,
when a job enters data set enqueue conflict, it is cancelled and placed on
hold until the job owning the data set finishes, at which time the job is
manually released by the operator and enters execution. This is all one job
to JES2, but as both step and job termination records were written when the
job was cancelled, there appear to have been two executions in SMF. There
is only one execution purge record, and the JES execution processor time
stamps JSTRTIME and JENDTIME contain the start and end time of the last
execution.
Although BUILDPDB resolves this possibility in its JOBS data set by counting
the number of job termination records (RESTARTS) and keeping additional
timestamps, many other accounting and performance systems falsely report the
count of job termination records as the number of jobs processed.
A final warning about these time stamps concerns how the operating system
builds them. The images for some SMF records, especially type 26, are
initialized with hexadecimal zeros, and time stamps are filled in as events
occur. If an event never occurs, then the time stamp remains hexadecimal
zeros in the written record. By design, the SAS
SMFSTAMP8 format recognizes all nulls and sets the variable's value to a
missing value. If you then subtract the missing value variable from a
non-missing value variable, the result is now missing.
READTIME is the only variable guaranteed to be nonmissing.
JCNVTIME can be missing for a job placed on hold that is canceled
before conversion.
JSTRTIME is missing for all JCL errors detected in the reader or in
the converter, an for any job canceled before execution. You can
identify that the job was cancelled by CANCELED='Y' in PDB.JOBS.
JENDTIME is missing for any job with a missing JSTRTIME and, more
importantly, for any job active when the operating system crashes.
Furthermore, JENDTIME can be nonmissing and less than JSTRTIME if a
job executed was restarted and the system crashed during the restart.
JPRNTIME and JFINTIME can be missing for JCL errors.
PRINTIME and PRENTIME are missing for any job that does not produce
printed output.
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN