> On Jul 7, 2017, at 12:28 PM, Walt Farrell <[email protected]> wrote:
>
> On Fri, 7 Jul 2017 09:59:01 -0500, Donald Likens <[email protected]> wrote:
>
>> We run our product on somewhere around 15 client sites. We have had this
>> problem on one client site (multiple LPARs) since we started running there.
>> We have only seen this problem on one other system during a trial (they did
>> not keep the product). The client site that sees this problem sees it
>> randomly but multiple times during a week. The code for where I think the
>> problem exists is in the original listserv entry. What we are doing is using
>> the IEFU8X (3,4,5) exits to extract SMF records and place them in a FIFO
>> control block chain in CSA to be processed and removed by an STC. I created
>> a batch procedure that reads SMF records and passes them to the active
>> IEFU8x exit. We ran this batch process on our test system multiple times
>> duplicating the client’s environment using SMF data supplied from 1 hour
>> before and 1 hour after the problem showed up at the client’s site. It fed
>> 297,092 records into our product in 6 minutes and IT WORKED PERFECTLY every
>> time.
>>
>> Does anyone have any ideas on what could cause this situation?
>
> You say "duplicating the client's environment."
>
> Does that duplication include number of CPs and processor speed? I have seen
> test environments that were set up as single CP, which does little to help
> debug problems related to multi-tasking or -processing.
>
> How many simultaneous copies of your batch process were running, given that
> the client could have dozens or hundreds of address spaces writing SMF
> records? How much multi-tasking within your batch process did you have, in
> case the issue is related to multi-tasking rather than multi-processing?
>
> How did each of your batch processes pass the records to the exit? And which
> exit(s) did you use?
>
> --
> Walt
Walt,

Exactly. We had early serial numbers on a 168MP. We were using DFSORT at the time and were seeing issues with it that nobody else was seeing. We sent a couple of dumps off to IBM, and they scratched their heads for a week or two. I called them to talk the matter over and asked whether this could be an MP issue with serialization. I heard the guy say “interesting”. Apparently the IBM DFSORT people didn’t have access to an MP to test their code on. About 3 or 4 days later I got a call saying they had found the fix, and they sent it overnight. I put the fix on the next day and the issue disappeared.

Ed
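
For anyone following the thread, here is a toy Python model of the kind of MP-only failure being discussed. It is not a diagnosis of Donald's exit and it is not z/OS code. The names (Node, Chain, unsafe_push_steps, cs_push_steps, run_serial, run_interleaved) are made up for illustration, the chain is a simplified add-at-the-head chain rather than his FIFO, and compare_and_swap only models an atomic compare-and-swap on a chain header. It shows how an unserialized two-step add behaves when nothing interleaves (the single-CP test case) and how it drops a record when two CPUs interleave.

```python
# Toy model only: two "CPUs" adding elements to a shared chain header.
# All names here are hypothetical; nothing below is real exit code.

class Node:
    def __init__(self, record):
        self.record = record
        self.next = None


class Chain:
    def __init__(self):
        self.head = None

    def compare_and_swap(self, expected, new):
        """Model of an atomic compare-and-swap on the chain head."""
        if self.head is expected:
            self.head = new
            return True
        return False

    def records(self):
        """Walk the chain from the head and return the records found."""
        out, node = [], self.head
        while node is not None:
            out.append(node.record)
            node = node.next
        return out


def unsafe_push_steps(chain, record):
    """Unserialized add, split into its two steps: load head, then store."""
    node = Node(record)
    node.next = chain.head   # step 1: load the current head
    yield                    # another CPU may run between the steps
    chain.head = node        # step 2: store, overwriting whatever is there now


def cs_push_steps(chain, record):
    """Add that retries until the head is unchanged between load and swap."""
    node = Node(record)
    while True:
        old = chain.head
        node.next = old
        yield                # another CPU may run between load and swap
        if chain.compare_and_swap(old, node):
            return


def drain(gen):
    """Run a step generator to completion."""
    for _ in gen:
        pass


def run_serial(push_steps):
    """Single-CP style test: the first add finishes before the second starts."""
    chain = Chain()
    drain(push_steps(chain, "A"))
    drain(push_steps(chain, "B"))
    return chain.records()


def run_interleaved(push_steps):
    """MP style test: both CPUs load the head before either one stores."""
    chain = Chain()
    a = push_steps(chain, "A")
    b = push_steps(chain, "B")
    next(a)    # CPU A loads the head
    next(b)    # CPU B loads the same head
    drain(a)   # CPU A stores
    drain(b)   # CPU B stores (or retries, in the compare-and-swap version)
    return chain.records()


print(run_serial(unsafe_push_steps))       # both records present
print(run_interleaved(unsafe_push_steps))  # record "A" is lost
print(run_interleaved(cs_push_steps))      # both records present
```

The serial run is the point of Walt's questions: a test that never overlaps two adds can pass every time and still tell you nothing about what happens with several CPs and many address spaces cutting SMF records at once.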
