Our typical steps:
Step 1 - Detect that there is a WSC and find it. This seems to be an 
operational challenge. It typically ends up with an early morning call stating 
"the IPL didn't work" or some reference to whatever the last message on the 
console was before the IPL was attempted. Now that we have remote access to the 
HMC it has gotten easier, as we're no longer dependent on the operator's 
interpretation of what's going on.
Step 2 - Correct the source of the problem
Step 3 - Re-IPL
(repeat as needed)

I can't remember the last time we actually took and used a SADUMP. Most WSC 
issues are configuration issues and do not result in a PMR.

Some sort of indication on the NIP console to alert the operators to the WSC 
would be nice, assuming it got far enough to be able to use a console.
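
As a rough illustration of the kind of indication meant here, the sketch below 
pulls the WSC out of a disabled wait PSW and pairs it with a site-maintained 
note. It assumes the PSW is available as a hex string (for example, read off 
the HMC) and that the wait state code is its low-order three hex digits; 
reason-code layouts vary by code and are not decoded. The names SITE_NOTES, 
wait_state_code and describe are hypothetical, and the note text is a 
placeholder, not IBM's description of any code.

```python
import re

# Hypothetical site-maintained notes keyed by wait state code.
# The entry below is placeholder text only, not the documented meaning.
SITE_NOTES = {
    "0B1": "placeholder: see the site runbook entry for this code",
}


def wait_state_code(psw_hex: str) -> str:
    """Return the low-order three hex digits of a PSW string, upper-cased.

    Assumption: the wait state code sits in the last three hex digits of
    the disabled wait PSW. Whitespace between words is ignored.
    """
    digits = "".join(psw_hex.split())
    if not re.fullmatch(r"[0-9A-Fa-f]{3,}", digits):
        raise ValueError("expected a hex string of at least 3 digits")
    return digits[-3:].upper()


def describe(psw_hex: str, notes=SITE_NOTES) -> str:
    """Format the WSC with whatever note the site has recorded for it."""
    code = wait_state_code(psw_hex)
    return f"WSC {code}: {notes.get(code, 'no site note recorded')}"


if __name__ == "__main__":
    print(describe("00020000 00000000 00000000 000000B1"))
```

Keeping the notes in a table the site owns means the text can point at the 
local runbook rather than trying to reproduce the system codes manual.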

Thanks,
Bart 

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf Of 
John McDowell
Sent: Monday, May 14, 2012 12:56 PM
To: [email protected]
Subject: Early IPL problems

Thanks for all of the feedback thus far, both on and off the list; it has been 
useful.

So far, there seems to be general agreement that the frequency of problems is 
low (at least in production LPARs), that some way of displaying what the Wait 
State Code (WSC) means would be useful, and that recovery from these sorts of 
problems is accomplished by using another running z/OS system.

To help focus the feedback somewhat I would be interested to know:
- What are the typical steps taken to identify/recover from a WSC? For 
example, 
   Step 1: Find the meaning of the WSC  
   Step 2: SADUMP
   Step 3: Correct the source of the problem
   Step 4: Re-IPL
   Step 5: Open a PMR
   etc., etc.
   Step 1 seems fairly certain; beyond that, the remaining steps (both content 
and sequence) seem to be much less so.
- Are the existing messages useful/sufficient for identifying the underlying 
cause and how to correct the WSC?
- Are there circumstances (e.g. console setup, etc.) that prevent the existing 
messages from being seen?
- Are additional/better/different messages needed to identify and/or correct 
the cause of the WSC?
- Is there any need to diagnose and/or correct problems without resorting to 
another running z/OS system?
- To what extent, if any, is it desirable to avoid the WSC by providing some 
means of error recovery?

Thanks.

John McDowell - IBM

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
