Interesting. And no, I didn’t know about it. Question: At what stage does the ESQA into ECSA check wave a red (or orange) flag?

Cheers, Martin

Sent from my iPad

> On 3 Oct 2019, at 22:34, Mark Zelden <m...@mzelden.com> wrote:
>
> On Thu, 3 Oct 2019 16:18:32 -0500, Mark Zelden <m...@mzelden.com> wrote:
>
>> If you are running z/OS 2.3 and increasing ESQA because of expansion into ECSA messages
>> or sudden unexplained growth, check out APAR OA58438.
>>
>> We had 3 system crashes after migrations to z/OS 2.3 in 2019 and one close call after
>> ECSA got to 99% when ESQA expanded into it (only a vendor monitor crashed in that
>> case after a failed ECSA getmain). Stand alone dumps didn't find the root cause
>> other than we knew it was RPB pool growth related to SVC dumps from CICS. In one
>> case a single SVC dump caused an 80M ESQA spike within one or two seconds that crashed
>> a system when it spilled into ECSA and also filled up ECSA (typically at about
>> 70% use, but "stable").
>>
>> We worked with IBM all summer on this. We had different SLIPs and GTF traces put in
>> place, but with the traces going the problem never happened. But SVC dump processing
>> did take over the CPU with the trace + GTF active! :-)
>>
>> Meanwhile, we increased ESQA on 30 LPARs via normal IPLs over the summer by about
>> 80M and ECSA a bit as a "work around". Settings that haven't been touched in god knows
>> how long (certainly not since 64-bit usage has increased and HVCOMMON). So we had
>> to lose about 100M of high private to do this. We also increased real storage on a
>> couple of LPARs that really didn't warrant it (based on zero or close to zero demand
>> paging during normal operations), but we knew real storage was also involved in
>> the problem (no flash memory for SVC dumps on my client's mainframes).
>>
>> The entire time IBM has said we are the only ones reporting the problem, but since we
>> had the problem in big sysplexes, small sysplexes, big LPARs, small LPARs, I know that
>> we can't be the only ones.
>> I think other shops are ignoring the ESQA expansion into
>> ECSA (since that in itself doesn't hurt) and / or they have more "white space". The
>> RPB control blocks are freed after about 10 minutes, so anyone looking at their
>> current ESQA (and ECSA) usage wouldn't notice the spikes or would just say "oh well,
>> looks good now".
>>
>> Anyway, IBM was getting close to figuring this out not too long ago and partially
>> re-created the problem in the lab some weeks ago and just got back to us today
>> with the root cause and the APAR that was opened. It is related to being real
>> storage constrained at the time of the SVC dumps (I think all of the crashes were
>> during CICS startup time in the wee morning hours).
>>
>> I really wanted to post something about this earlier but didn't since IBM said
>> they had no other reported problems. So if you have seen this problem since
>> migrating to z/OS 2.3, now you know you aren't the only ones.
>>
>
>
> One thing I didn’t mention in the post (well, I did the first time I started to compose it,
> but accidentally closed the window) is that one may not even notice the problem because
> the RPB pools are released after some period of time (10-15 minutes?). So if one looked
> at any given point in time ESQA usage would be “normal”. The only clue would be Health
> Checker messages if Health Checker was running or some other monitor tripping
> an ESQA threshold hit or expansion into ECSA.
>
>
> Regards,
>
> Mark
> --
> Mark Zelden - Zelden Consulting Services - z/OS, OS/390 and MVS
> ITIL v3 Foundation Certified
> mailto:m...@mzelden.com
> Mark's MVS Utilities: http://www.mzelden.com/mvsutil.html
>
> ----------------------------------------------------------------------
> For IBM-MAIN subscribe / signoff / archive access instructions,
> send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
>
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU