Re: Life of a JOB

Barry Merrill Mon, 13 Feb 2012 07:52:21 -0800

I believe the below "ACHAP07" member of the MXG Source Library 
may be the information you requested, as it was presented at
SHARE in the 70s and 80s.


While it claims to be revised, it's still possibly out of date.

Barry



/* Copyright (C) 1984,1994 Merrill Consultants Dallas Texas USA */

Status: Has been completely revised.

Chapter Seven
Events in the Lifetime of an MVS Job/STC/TSO Session.

To perform CPE analysis, the analyst must understand the computer system
under investigation and know what functions the operating system performs
for a task.  Although operating systems are different, they all manage the
shared resources in response to requests by tasks for those resources.
Since the operating system is the manager of the many queues for these
requests, a large part of CPE data analysis is to determine who is in what
queue, for how long, and why.

All operating systems provide some accounting data for each task (job,
session, and so forth) that contain time stamps when certain events occur
for that task.  By comparing these time stamps to the logical structure of
the services provided by the operating system, the analyst can understand
how an operating system interaction is mapped to the CPE data.  By
subtracting adjacent time stamps, the duration in certain states
(allocation, CPU execution, I/O execution, and so forth) can be determined
at the task level.  Documentation of CPE data from the vendor often does not
tell enough about the meaning of these events.  Sometimes you can read the
microfiche (when the vendor makes the source available) and perhaps decipher
the conditions under which a time stamp is valid, but even then, the
experimental confirmation of suspicion is required.
By executing tasks that you understand (that is, your own specific job) and
by examining the detail data in those detail records written on your tasks,
you can gain the necessary knowledge.  For SMF data, the MXG program ANALALL
will print all variables from all records for any job.


The SAS System provides powerful features for manipulating numeric values
that contain durations, time stamps, dates, and datetimestamps.
All of these variables are stored as numeric values and manipulated as any
other numeric value.  It is the association of a SAS "format" (by using the
FORMAT statement) that causes the numeric value to be printed as a date,
time of day or duration, or datetimestamps:

Dates are stored as the number of days (plus or minus) from Jan 1, 1960.
MXG always assigns the format DATE7 to date variables, so that the name of
the month is printed, and MXG avoids the ambiguity of the MMDDYY and DDMMYY
formats (03/05/94 means 05MAR94 in the US but 03MAY94 in Europe!)

Eg:    DATA; DATE=1; PUT DATE DATE9.;  prints as  DATE=01JAN1960.
Eg:    DATA; DATE=1; PUT DATE DATE7.;  prints as  DATE=01JAN60.

Times are stored as seconds (and fractions, if any), and are assigned the
TIME12.2  format (for typical SMF time stamps with resolution of tenths and
hundredths of a second).

Eg:  DATA; TIME=43200.99; PUT TIME TIME12.2;   prints as  12:00:00.99,
     (there are 86,400 seconds in a day!).                           .

Datetimestamps are stored as seconds (plus or minus) from midnight on
January 1, 1960, and are assigned the DATETIME21.2 format (if they are
standard SMF datetimestamp values).

Eg.:   DATA; DATETIME=1262217600.99; PUT DATETIME DATETIME21.2;
       prints as   31DEC1999:00:00:00.99.

   A note on the length of datetime variables.  SAS numeric variables
   are floating point numbers, and are normally stored in 8 bytes (one
   for the exponent, seven for the mantissa).
   Prior to Change 19.272 in January, 2002 this was true:
      MXG overrides the stored length default and specifies LENGTH as
      DEFAULT=&MXGLEN; only a few variables need the full 8 byte length,
      and using 4 bytes per variable saves significant DASD space
      in your MXG datasets.  There is one class of variables, however,
      that do require 8 bytes of storage: datetimestamp values, because
      four bytes is not large enough to fully resolve the 10-digit value
      of current datetimestamps.  In fact, if you should store a
      datetimestamp in only four bytes, the time values will be as much
      as 255 seconds less than reality.  Thus MXG assigns a length of 8
      bytes for all datetimestamp values.  To further save storage, MXG
      may keep only the start datetime value and the duration, requiring
      only 12 bytes, instead of both timestamps which would require 16
      bytes, since the end datetime can always be reconstructed by
      addition.
   Change 19.272 changed the default stored length from 4 to 5 on EBCDIC
   and from 4 to 6 on ASCII SAS systems.  See that change text.

Originally, all MXG datetimestamp variables ended with ....TIME (for
example, READTIME), and all MXG duration/time variables ended with ...TM
(for example, CPUTCBTM), but that was back when I created all of names of
all of the variables.  Now, I frequently use the vendor's field name as the
MXG variable name (since that eliminates any need for you to cross reference
the vendor's name with its MXG name), so you may not be able to tell what a
variable contains by its name.  Since the FORMAT of a variable is displayed
by PROC CONTENTS, and is also contained in ADOC members and in DOCVER, it is
a simple matter to determine whether an MXG variable is a date, a time, or a
datetimestamp value.

Time values usually represent durations, although they are occasionally used
for the time of day when the date is not directly available.  Since time
values and datetime values are both stored in units of seconds, they can be
added or subtracted.  For example, to find the duration between two datetime
values, subtract them.  Similarly, the end time of an event can be found by
adding the datetime value at the start of the event and its duration.

Figure 7.1 maps the time stamps that exist in the SAS MVS data sets from
most of the job-related SMF records.  The main function of BUILDPDB, the MXG
module that builds the MVS Performance Data Base (PDB), is to collect these
time stamps from their many event records and place them into a single SAS
observation per job.  The number in parentheses under each variable name
identifies the SMF record type in which this variable is found, and you can
read the appropriate ADOC for that record for more detail on the actual
meaning and content of the variables listed here.

                             Figure 7.1
                              Schematic
                      Job Level Datetime Stamps


READTIME   JCNVTIME      JSTRTIME   JENDTIME  JPRNTIME JFINTIME JPURTIME
   |-----|----|--------|----|----------|---------|--------|--------|
    RDRTM      CONVERTM     |          |
                         JINITIME  JTRMTIME
                            |----------|
                           /            \
                         /                \
                       /                    \
                     /                        \
                   /                            \
                 /                                \
               /                                    \
            INITTIME     ALOCTIME   LOADTIME    TERMTIME
             I--------------|----------|--------------|
             |   DSENQTM    |  ALOCTM  |    EXECTM    |
             |---------------SELAPSTM-----------------|

The following paragraphs describe the event represented by each variable and
identify the source clock used for this event.  In multiple CPU
installations, various events on behalf of a job can occur on different
processors, and the time stamp of that event uses the clock on that
processor.  The TYPE26 dataset from JES identifies the system on which these
events occurred (SYSREAD, SYSCVRT, SYSEXEC, SYSOUTP, SYSPUR), and the
following descriptions use the same names to identify the source clock.

READING IN THE TASK

The physical card images (JCL) for the job, TSO session, or started task are
read by JES from a reader (real or internal) and stored on the JES SPOOL
data set.  The reader-interpreter can generate JCL errors, for the obvious
kinds of errors (bad JOB card, accounting fields invalid, no step records,
etc.) and can terminate the task with a JCL error at the end of read-in, but
most JCL syntax errors are not caught here.

READTIME       Datetimestamp when the read-in began (actually, when the
               JOB card was recognized on the host that read in the
               job).  READTIME is the common time stamp that is stored
               in all subsequent SMF records created on behalf of this
               task (a task can be read in only once.) READTIME uses the
               clock on read-in system, variable SYSREAD, which is the
               system to which the RJE or NJE line is connected, or the
               system in which the internal reader was invoked.  Note
               that if this job was initially read in at a different NJE
               node, the time zone of READTIME will be the time zone of
               that NJE system.  From dataset TYPE30_5.

RDRTM          Duration it took to read in the task, calculated by
               subtracting READTIME from RENDTIME, the reader end event.
               RENDTIME is not kept by MXG because it can be constructed
               with: RENDTIME=READTIME+RDRTM;   RENDTIME is timestamped
               at the receiving site.  If a job is read in at an NJE
               site in a different time zone, the value of RDRTM can be
               negative (if the reading site is to your west), or can be
               hours larger than true reader duration (if the reading
               site is to your east).  Although RDRTM is normally quite
               short for jobs processed by internal readers, it is the
               actual transmission time for the physical card images
               during the read-in event.  From dataset TYPE30_5.

JES CONVERSION

Now, the JCL card images on the JES SPOOL must be processed by the JES
converter into internal text and control blocks, the JES "ITEXT", that are
then rewritten to the SPOOL data set.  JES determines the functions (SCAN,
EXECUTE, PRINT, PUNCH) that are required for this task in the converter.
Most JCL syntax errors are detected here, and the converter flushes the job
at JCNVTIME + CONVERTM.  JES3 conversion executes on the global processor.
JES2 conversion executes on whichever system has the checkpoint when the job
finishes read-in.

JCNVTIME       datetimestamp when the converter began processing this
               task.  Because this normally occurs immediately following
               the completion read-in, MXG does not create a variable
               for the duration from reader-end to converter start.
               JCNVTIME uses the clock on the system which converted the
               job (SYSCVRT).  From dataset TYPE26.

CONVERTM       the duration required to convert the job control language
               statements.  From dataset TYPE26.

JES3-only: main device scheduling (MDS)

For JES3, mountable devices can be allocated first.  Then only after all
required devices are available is the task eligible for execution
scheduling.  When the installation chooses this option, events in device
scheduling are reported only in the JES3 type 25 record.  In my opinion you
should not use MDS.  It is an archaic option from the days of MVT that is
guaranteed to require many more tape drives that are actually needed.  This
is especially true if you have a 42 step job that requires a tape drive only
in the last step.  With JES3 MDS, the tape drive is allocated, the job waits
for an initiator, then begins execution, and only after 41 steps does the
tape drive (which has been allocated to this job, and thus not available to
anyone else) get used!  From TYPE25.

FETCTIME       datetimestamp when JES3 MDS requested that the tape
               volume be fetched from the library.

STRTTIME       datetimestamp when operators issued the start set-up that
               the needed volumes had been fetched.

MNTTIME        datetimestamp when first mount was issued by JES3 MDS.

VERFTIME       datetimestamp when first mountable volume was verified
               (that is, the mount was completed and the correct volume
               serial was verified).

JES execution processor

After the JES conversion (and JES3 MDS, if any), the task is now in the JES
execution queue (unless the task was read into the HOLD queue), and the job
will be considered for execution.  The job waits in the execution queue
until it is the highest priority job in a job class for which there is an
idle initiator.  Execution begins when JES issues a JOB STARTED message and
passes the job's control blocks from the SPOOL to the idle MVS initiator.
MVS then initiates the job, allocates devices needed by the job, and
executes the steps of the job.  When the last step is complete, when a step
abends and terminates the job, or when the job is canceled by an MVS cancel
command, MVS terminates the job and returns the job to JES with a JOB ENDED
message.  All time stamps use the clock on the execution CPU (SYSEXEC).

JSTRTIME   job start datetimestamp, when JES passed the job to the MVS
           initiator.  This is printed on the job's SYSMSG log, "JOB
           STARTED AT".  For jobs which are cancelled and then restarted
           (for example, due to Data Set Enque Conflicts), this will be
           the time of the last start of execution; varible RESTARTS is
           greater than 1 for restarted jobs.  From TYPE26J2/TYPE26J3.

JINITIME   job initiation datetimestamp on the MVS execution system.
           JSTRTIME and JINITIME will normally be close in time, but for
           restarted jobs, MXG chooses to keep the time when the first
           initiation started.  That way, you can measure when you (the
           data center) were able to initiate the job.  Any delay due to
           restart can be blamed on the user instead of the data center,
           and thus JINITIME should be used as the time of initiation
           for job scheduling purposes. From the first TYPE30_1 for this
           job.

JRSTTIME   for restarted jobs, the datetimestamp when the operator first
           cancelled and requeued the job.  The duration between
           JINITIME and JRSTTIME is how long it took for the operator to
           decide to cancel and restart the job.  From the JTRMTIME of
           the first TYPE30_5 for this job.

JINLTIME   for restarted jobs, job initiation datetimestamp of the last
           TYPE30_1 for the job, if the job was restarted.  The duration
           between the JINITIME and JINLTIME is the delay to the job due
           to the restart(s).  From last TYPE30_1.

JTRMTIME   job termination datetimestamp on the MVS execution processor,
           (for the last execution if the job was restarted). From the
           last TYPE30_5 for this job.

JENDTIME   job end datetimestamp, when JES received the job back from
           MVS. JTRMTIME and JENDTIME should be essentially the same.
           If a non-restarted job was active when the MVS system
           crashed, this time will be missing, since MVS did not
           complete execution of the job.  For a restarted job that was
           active when the MVS system crashed, this will be non-missing,
           and will have the time of the last restart, which will be
           earlier than the JSTRTIME.  MXG identifies obs that were
           active when the system crashed, because the TYPE26 timestamps
           are stored on JES spool and are not erased when MVS crashes,
           and sets ABEND='CRSH';.  From TYPE26J2/TYPE26J3.

JELAPSTM   duration from JINITIME to JTRMTIME (or JINLTIME to JTRMTIME,
           for a restarted job, where JINLTIME is the last initiate of a
           restarted job, and JTRMTIME is the termination on z/OS of the
           last execution).  This is the elapsed time for a job, session
           duration for a TSO session, or the "up time" for a started
           task.

SELAPSTM   sum of the step elapsed duration (INITTIME to TERMTIME) for
           all steps, including restarts.


MVS step initiation process

                             Figure 7.2
                              Schematic
                     Step Level Datetime Stamps
                  Batch Job reading Three Tape Volumes


                              SELAPSTM
    |-----------------------------------------------------------------|
    |                                                                 |
 INITTIME  ALOCTIME    LOADTIME                                 ENDTIME
    |---------|---------|---------------------------------------------|
     DSENQTM   ALOCTM  /                  EXECTM                      |
                      /                                               |
                     /                                                |
                    /                                                 |
                   /                                                  |
                  /                                                   |
                 /                    EXECTM                          |
                /                     628.88                          |
               |--------------|---------------------------------------|
               |              |                                       |
               |    ACTDLYTM  |                ACTIVETM               |
               |     4.23     |                  624.65               |
               |--------------|---------------------------------------|
                              |                                       |
                RESDLYTM=0.0  |                RESIDTM                |
               (Because task  |                  624.65               |
                did not swap) |---------------------------------------|
                              |                          |
                              |  DSPDLYTM                |    CPUTM   |
  Note that CPITCBTM+CPISRBTM |   610.60                 |    14.11   |
  included in CPUTM but were  |--------------------------|------------|
  not used in calculation of     Tape Mount Pending           /       |
  DSPDLYTM because they occur    duration was 248.13         /
  outside EXECTM.               (measured by ASMTAPES)      /
                                                           /
       |<--------------->|<-------------------------------/---------->|
       |                 |                                            |
       |                 |                                            |
       |CPITCBTM CPISRBTM|CPUTCBTM CPUSRBTM CPUHPTTM CPUIIPTM CPURCTTM|
       |   0.06    0.00  | 12.36     1.51     0.00     0.17     0.01  |
       |--------|--------|--------|--------|--------|--------|--------|


                             Figure 7.3
                              Schematic
                     Step Level Datetime Stamps
                   A long (full day) TSO session


                              SELAPSTM
    |-----------------------------------------------------------------|
    |                                                                 |
 INITTIME  ALOCTIME    LOADTIME                                 ENDTIME
    |---------|---------|---------------------------------------------|
     DSENQTM   ALOCTM  /                  EXECTM                      |
                      /                                               |
                     /                                                |
                    /                                                 |
                   /                                                  |
                  /                                                   |
                 /                    EXECTM                          |
                /                  14:29:37.42                        |
               |------------------------------------------------------|
               |                                                      |
               |    ACTDLYTM                   ACTIVETM               |
               | 14:16:57.90                     759.52               |
               |--------------|---------------------------------------|
                              |                                       |
                              | RESDLYTM                RESIDTM       |
                              |    04.22                 755.30       |
                              |------------|--------------------------|
       Session TGETS=1275                  |                          |
       TSO/MON TRANS=1276                  |DSPDLYTM             CPUTM|
                                           | 654.13             101.17|
                                           |--------------|-----------|

 (Additional schematics of TYPE30/STEPS data are in ACHAP30.)  (Schematics
of TYPE70 CPU relationships are in ACHAP10.)

A job consists of one or more steps.  Job initiation begins at JINITIME when
MVS receives the job from JES, and then MVS initiates the first step at
INITTIME.  The first step of every job begins by issuing the data set
enqueue (DSENQ) request for every data set name in all steps of the job.
(JES3 has already issued its enqueue request earlier, before initiation, so
it will always find its data set names available here).
But in JES2, the DSENQ request is made here for the first time.

If the data set name is not available for this job (because this request is
for DISP= OLD, NEW, or MOD, and the data set is being used, or when this
request is for DISP=SHR but some other job/user has the dataset name with
DISP=OLD,NEW, or MOD), this job enters "Data Set ENQUE Delay"
and the operator is notified of the conflict with a console message:
         "FOLLOWING DATA SETS RESERVED OR UNAVAILABLE."

If the operator takes no action, the job waits unproductively in execution,
tying up an initiator but using no other resources, until the job or TSO
session using the dataset terminates and frees the dataset.

If the operator cancels the job (actually, the operator must issue three JES
commands; $HJ to hold the job, $EJ to requeue the job, and then $CJ to
cancel the job), MVS initiation is terminated, and both a TYPE30_4 step
record and TYPE30_5 job record are written (both indicate SYSTEM ABEND 222),
and the job is returned to the JES2 execution queue in HOLD status until the
operator releases the job.  (More commonly, an automated operator program
detects the delay, automates the cancel, and sets up an ECB (Event Control
Block) against the dataset, and when the ECB is posted, the automated
operator issues the $AJ command to release the job from hold for restart.)

The duration between INITTIME and ALOCTIME is named DSENQTM because that is
the primary contributor to that duration.  By examining jobs with large
DSENQTM values, you can identify which jobs are not playing the DISP= game
correctly.  You can further examine the TYPE77 dataset to see which jobs
held and which jobs wanted which data sets, and then educate the users to
either use DISP=SHR, or to put the two contending jobs together as separate
steps of one job to avoid the delay

Once the data set names are acquired by DSENQ, step allocation commences at
ALOCTIME.  During allocation, all devices required by this step are acquired
one at a time, in the order of the DD statements for the step.
Allocation is the process that connects the DDNAME that your program will
use to the actual physical location of the dataset.

For an existing DASD data set, allocation finds the UCB address of the
volume, and reads the VTOC to find the cylinder, head, and record number of
the start of the data set.  If the request is for a new DASD data set, the
allocator selects a volume serial (here is where SMS ACS rules are executed
to choose which VOLSER is selected) and finds free space and creates the
VTOC entry for the new data set.  Thus the time to allocate a new DASD data
set is much longer than the time to allocate an existing DASD data set.

If a tape drive is needed and one is available, the allocator assigns a
specific tape drive address to your job.  If all tape drives that are
currently online are already in use by other jobs, the job enters
"Allocation Recovery", and the operator is notified by console message:
   UNABLE TO ALLOCATE 1 DRIVE.  THESE DRIVES ARE OFFLINE: 181, 182
   REPLY DEVICE NAME, WAIT, OR CANCEL
If one of the offline drives can be used, the operator can reply with the
address of that drive, and the allocator will vary the drive online and
assign it to the job, but that is not normally the case.  Usually, the
drives listed as offline are either broken or do not yet exist, so the
operator can either reply CANCEL (causing the job to terminate with a SYSTEM
ABEND 222, pissing off the user!), or, instead the operator will reply WAIT,
which causes the job to wait until a tape drive becomes available. When the
operator replies WAIT, he is further prompted to reply HOLD or NOHOLD - do
you want the job to hold the allocations that it has already acquired, or
should all of the devices that the job has allocated been freed, putting the
job back at the bottom of the queue.
Most sites reply HOLD.  The job then waits until a tape drive is free.

Allocation is complete when all needed devices have been given to the job.
Mounts for the first tape volume on each tape drive are issued at the end of
allocation, but allocation does not wait for tape mount completion.

The delay between ALOCTIME and LOADTIME is called ALOCTM, and it is used to
measure how long jobs, and which jobs, are delayed because they could not
get a tape drive when needed.  See the analysis in ACHAPzz.


When allocation for this step is complete, the program to be executed is
loaded into memory by "Program FETCH"; the datetimestamp of program load is
stored in variable LOADTIME.  Note that although the job has been in
initiation since JSTRTIME, and this step has been in initiation since
INITTIME, it is only now, after LOADTIME, that your program can execute its
instructions, use CPU time, and do I/O.  When this program completes its
work (or ABENDs!), the termination datetimestamp is stored in variable
TERMTIME.  If the job has additional steps, they INITTIME, ALOCTIME,
LOADTIME, and TERMTIME individually until all steps have run.
When the final step terminates, control is given back to MVS for job
termination, which is recorded in datetimestamp JTRMTIME.  There can be some
duration between the last step's TERMTIME and the job's JTRMTIME for
cataloging data sets as well as for DD consolidation in building the job's
TYPE30_5 from the step records (see the discussion in ACHAP3 under
DDCONS(NO)), but most of the time there is little delay between TERMTIME and
JTRMTIME.

INITTIME   datetimestamp of the initiation of a step.  From TYPE30_4.

DSENQTM    duration from INITTIME to ALOCTIME.  Measures any delay for
           data set enqueue in the first step, usually very small for
           subsequent steps.  From TYPE30_4.

ALOCTIME   datetimestamp of the beginning of MVS device allocation time.
           From TYPE30_4.

ALOCTM     duration from ALOCTIME to LOADTIME.  Measures length of time
           the job step took to allocate, including delay for drives.

LOADTIME   datetimestamp of the completion of allocation and beginning
           of problem program execution for this step.

TERMTIME   datetimestamp of MVS step termination.

EXECTM     duration from LOADTIME to TERMTIME, the elapsed duration
           during which the step could have executed its program's
           instructions.  It is during EXECTM that the step consumes
           CPU and performs I/O operations and occupies memory.

SELAPSTM   duration from INITTIME to TERMTIME, the elapsed duration of
           the step.  Step elapsed time is the sum of DSENQTM, ALOCTM,
           and EXECTM, which is how long the step owned its initiator.

ACTIVETM   duration when the SRM selected this task to be active.

ACTDLYTM   duration when the task was executing but was not active,
           calculated as  EXECTM minus ACTIVETM.  This duration includes
           time the address space swapped out due to Detected Wait or
           Long Wait Swaps (which includes think time for TSO users),
           and any time spent waiting for mount pending.

RESIDTM    duration, measured by the SRM, that the task was in real
           memory.

RESDLYTM   duration when the task was active but was not resident,
           calculated as ACTIVETM minus RESIDTM.  This delta is the time
           SRM wanted the ASID to be in the Multi-Programming Set, but
           was unable to get the ASID resident.  This includes all
           swapped out time (except DW/LW, which is caught in ACTDLYTM)
           plus the time to actually do the swap in.  Another name might
           be MPL Delay time, because all of the swap out and
           non-resident time while active is the time when the Target
           MPL was not high enough to include this task.  This delta
           divided by SWAPS is an estimate of the time-to-swap-in a
           task, but since SWAPS counts both RESDLYTM and ACTDLYTM swap
           events, and since RESDLYTM includes the time to swap in and
           the time waiting to swapin and the time swapped out, the
           estimate is usually poor!

DSPDLYTM   Duration when the task was resident but not dispatched,
           calculated as RESIDTM-CPUTCB+SRB+HPT+RCT+IPT.  This duration
           is the time the task was resident in memory but was not
           executing instructions.  It includes the delay for CPU
           dispatch (when you enter the MPL set you don't immediately
           get CPU - a hot CICS transaction may still have the
           processor, so you wait here until dispatched on a CPU), and
           the delay for I/O (you execute a few CPU instructions and
           then issue an I/O and you wait for IOS to get the data for
           you), and the delay due to page faults (the next page is not
           in memory and you wait for that page to be brought in).  Note
           that the two other step CPU times, CPITCBTM and CPISRBTM are
           not included in the subtraction, because that times occur
           before or after EXECTM.

CPUTM      sum of CPUTCBTM,CPUSRBTM,CPUIIPTM,CPUHPTTM,CPURCTTM,
           CPITCBTM, and CPISRBTM, the total CPU time consumed by this
           task, but note that DSPDLYTM only subtracts five of the seven
           CPU timings from RESIDTM.

JES Output Processor

After the job has executed (i.e., the JES execution phase has completed)
there may be JES output processing required, if the task created any SYSOUT
(printing or punching).  The JES Output Processor does not print or punch
anything; it is a brief phase from JPRNTIME to JFINTIME, during which JES
builds the print control blocks, called JOEs (job output element).  Each JOE
is the collection of all print files with the same destination and form, and
each JOE becomes a separate element in the JES print queue.  For each JOE
that is actually printed, there will be a separate type 6 record created,
containing PRINTIME, the datetimestamp when print lines were sent to the
printer device, and PRENTIME, the end of transmission of print lines to the
printer device.  The duration from JFINTIME to PRINTIME is PRQUETM, the
print queue time for that print file, while the duration from PRINTIME to
PRENTIME is PRINTTM, the duration when print lines were being sent to the
printer.  (If the print device is a Print Workstation (like a Xerox 9700),
which receives and buffers print lines for later printing, PRINTTM will not
represent the actual time for printing).  See ACHAPTzz for more details on
measuring printer delays, activity, and utilization.

JPRNTIME       datetimestamp of the start of the JES Output Processor.
               Uses the clock of whichever JES system executed the
               Output Process (SYSOUTP).  From TYPE26J2/TYPE26J3.

JFINTIME       datetimestamp of the end of the JES Output Processor.

PRINTIME       datetimestamp of the start of sending print lines to the
               printer device.  Uses the clock on the system to which
               the print device is connected (for a local printer), or
               the system to which the JES RJE line serving this printer
               is connected.  From TYPE6 dataset.

PRENTIME       datetimestamp of the end of sending print lines to the
               printer device.  For 3800-class printers, PRENTIME is not
               the end of printing, but instead it is the time when the
               last page of the job has been physically moved to the
               stacker page.

JES purge processor

When JES has completed all printing or punching for this task (or if the
task is purged with the $PURGE command), all traces of this job are removed
from the SPOOL and the JES job number is available for assignment to a new
job.  JES purge processor uses the clock on the CPU that purges the job.

JPURTIME       datetimestamp of the purge event.



The preceding variables provide an awesome opportunity for analysis.
Understanding the sequence of events and the points in that sequence that
are available in the operating system's task accounting records
significantly eases the CPE analyst's task.  However, it is necessary to
understand the limitations and potential errors that can occur in the time
stamps and durations.  There are numerous exposures in these
software-captured values from the CPU clock(s).

The multiple system environment probably causes the most confusion.  A
multiple operating system environment is often referred to as loosely
coupled to indicate that processors are coupled by the JES SPOOL rather than
tightly coupled or multiple engine processors that are coupled by sharing
one operating system and one CSTOR/ESTORE memory.  In such an environment,
it is possible for a job to be read in on one system, executed on another,
printed on a third, and so forth.  Only with IBM's SYSPLEX timer can these
multiple systems' clocks be synchronized.
Otherwise, each system's clock is set by human operators comparing
wristwatches and pressing the SET CLOCK switch (or its equivalent) on the
hardware.  If clocks for the two systems are not synchronized, jobs can
start before they were read in, execute in negative time, and so forth.
Where there is no hardware solution to synchronization, the operators must
understand the importance of accuracy when setting the system clock.  Some
sites have even implemented these circumventions:

   modify the JES START command (wrap it with a program) that compares
   the checkpoint timestamp with the oncoming system's clock when a new
   processor is brought into the complex.  Do not allow a processor to
   enter the complex if its clock is not within some predetermined
   value, usually 15 to 30 seconds.

   require operations supervisors to issue DT (Display Time) commands on
   each system and to record the delta time between the clocks as a part
   of the shift turnover procedures, and create competition between
   crews to see how closely they can set the clock.

   if an operator bonus program based on performance exists, negate the
   bonus if a crew fails to synchronize the clocks adequately


Restarted jobs in a JES2 environment also cause confusion.  For example,
when a job enters data set enqueue conflict, it is cancelled and placed on
hold until the job owning the data set finishes, at which time the job is
manually released by the operator and enters execution.  This is all one job
to JES2, but as both step and job termination records were written when the
job was cancelled, there appear to have been two executions in SMF.  There
is only one execution purge record, and the JES execution processor time
stamps JSTRTIME and JENDTIME contain the start and end time of the last
execution.

Although BUILDPDB resolves this possibility in its JOBS data set by counting
the number of job termination records (RESTARTS) and keeping additional
timestamps, many other accounting and performance systems falsely report the
count of job termination records as the number of jobs processed.

A final warning about these time stamps concerns how the operating system
builds them.  The images for some SMF records, especially type 26, are
initialized with hexadecimal zeros, and time stamps are filled in as events
occur.  If an event never occurs, then the time stamp remains hexadecimal
zeros in the written record.  By design, the  SAS
SMFSTAMP8 format recognizes all nulls and sets the variable's value to a
missing value.  If you then subtract the missing value variable from a
non-missing value variable, the result is now missing.

   READTIME is the only variable guaranteed to be nonmissing.

   JCNVTIME can be missing for a job placed on hold that is canceled
   before conversion.

   JSTRTIME is missing for all JCL errors detected in the reader or in
   the converter, an for any job canceled before execution.  You can
   identify that the job was cancelled by CANCELED='Y' in PDB.JOBS.

   JENDTIME is missing for any job with a missing JSTRTIME and, more
   importantly, for any job active when the operating system crashes.
   Furthermore, JENDTIME can be nonmissing and less than JSTRTIME if a
   job executed was restarted and the system crashed during the restart.

   JPRNTIME and JFINTIME can be missing for JCL errors.

   PRINTIME and PRENTIME are missing for any job that does not produce
   printed output.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN

Re: Life of a JOB

Reply via email to