> On Jul 7, 2017, at 12:28 PM, Walt Farrell <[email protected]> wrote:
> 
> On Fri, 7 Jul 2017 09:59:01 -0500, Donald Likens <[email protected]> wrote:
> 
>> We run our product at somewhere around 15 client sites. We have had this
>> problem at one client site (multiple LPARs) since we started running there.
>> We have seen it on only one other system, during a trial (they did not keep
>> the product). The client site that has the problem sees it randomly, but
>> multiple times a week. The code where I think the problem lies is in the
>> original listserv entry. What we are doing is using the IEFU8x exits
>> (IEFU83, IEFU84, IEFU85) to extract SMF records and place them in a FIFO
>> control block chain in CSA, to be processed and removed by an STC. I created
>> a batch procedure that reads SMF records and passes them to the active
>> IEFU8x exit. We ran this batch process on our test system multiple times,
>> duplicating the client’s environment, using SMF data supplied from 1 hour
>> before to 1 hour after the problem showed up at the client’s site. It fed
>> 297,092 records into our product in 6 minutes and IT WORKED PERFECTLY every
>> time.
>> 
>> Does anyone have any ideas on what could cause this situation? 
> You say "duplicating the client's environment." 
> 
> Does that duplication include the number of CPs and processor speed? I have
> seen test environments that were set up as a single CP, which does little to
> help debug problems related to multi-tasking or -processing.
> 
> How many simultaneous copies of your batch process were running, given that 
> the client could have dozens or hundreds of address spaces writing SMF 
> records? How much multi-tasking within your batch process did you have, in 
> case the issue is related to multi-tasking rather than multi-processing?
> 
> How did each of your batch processes pass the records to the exit? And which 
> exit(s) did you use?
> 
> -- 
> Walt

Walt,
Exactly. We had early serial numbers on a 168MP and were using DFSORT at the
time. We were seeing issues with DFSORT, yet nobody else was seeing what we
were seeing.
We sent a couple of dumps off to IBM, and they scratched their heads for a week
or two. I called them up to talk the matter over and asked whether this could
be an MP serialization issue. I heard the guy say “interesting”. Apparently
IBM DFSORT didn’t have access to an MP to test their code on. About 3 or 4 days
later I got a call saying they had found the fix, and they sent it overnight. I
put the fix on the next day and the issue disappeared.
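
The kind of test Walt's questions point toward can be sketched roughly as
below: several producers feeding the FIFO at the same time while one consumer
drains it, followed by a check that nothing was lost or reordered. This is
Python purely for illustration, not the product's code or anything like the
real exit, which would be assembler under z/OS. The names SerializedFifo,
stress, and in_order are made up for the sketch, and the lock is only a
stand-in for whatever serialization the real chain uses.

```python
import threading
from collections import deque


class SerializedFifo:
    """Hypothetical stand-in for the FIFO chain; every update holds one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._chain = deque()

    def enqueue(self, record):
        with self._lock:
            self._chain.append(record)

    def dequeue(self):
        with self._lock:
            return self._chain.popleft() if self._chain else None


def stress(fifo, producers=8, per_producer=5000):
    """Run many concurrent producers against one consumer; return what was drained."""
    drained = []
    producers_done = threading.Event()

    def produce(pid):
        for seq in range(per_producer):
            fifo.enqueue((pid, seq))

    def consume():
        while True:
            # Sample the flag BEFORE dequeuing, so an empty result seen after
            # the flag was set really means nothing is left on the chain.
            finished = producers_done.is_set()
            record = fifo.dequeue()
            if record is not None:
                drained.append(record)
            elif finished:
                return

    consumer = threading.Thread(target=consume)
    workers = [threading.Thread(target=produce, args=(pid,))
               for pid in range(producers)]
    consumer.start()
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    producers_done.set()
    consumer.join()
    return drained


def in_order(drained, producers):
    """True if each producer's records came out in the order they went in."""
    expected = [0] * producers
    for pid, seq in drained:
        if seq != expected[pid]:
            return False
        expected[pid] += 1
    return True


if __name__ == "__main__":
    records = stress(SerializedFifo())
    print(len(records), in_order(records, 8))
```

Threads in the standard CPython interpreter do not execute bytecode in
parallel, so a harness like this only shows the shape of the test. As with the
single-CP test systems Walt mentions, exposing a genuine MP serialization
window would need the producers running on more than one real processor.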

Ed
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
