Agreed, we should generally look at the bigger picture, but sometimes spotting something in a hot spot that stalls the instruction pipeline can be worthwhile, and thinking down to the CPU level occasionally can help us spot such stalls faster.
I was just looking over a bit of a friend's code and saw 4 ICMs used to reverse a little-endian fullword. I guess he could use LRV, but it doesn't look like a hotspot to me :-)

Roops
---
"Mundus sine Caesaribus"

On Sat, 23 Aug 2025, 17:06 Charles Mills, <charl...@mcn.org> wrote:

> +1
>
> I love these discussions. As an engineer I love trying to outguess the
> hardware's cache algorithms. I love the idea of using two immediate
> instructions to initialize a fullword rather than using a storage reference.
>
> But from a practical point of view, I could not agree more with @Colin.
> Further, unless the code in question is executed millions of times per day
> it will take years to get back the CPU time you devoted to the re-assembly.
> And if you should introduce a bug, you are never going to get that
> troubleshooting time back.
>
> Charles
>
> -----Original Message-----
> From: IBM Mainframe Assembler List <ASSEMBLER-LIST@LISTSERV.UGA.EDU> On
> Behalf Of Colin Paice
> Sent: Saturday, August 23, 2025 8:34 AM
> To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
> Subject: Execute-Type Instructions
>
> Few people should be looking at the instruction level that is being
> discussed.
> You are more likely to get performance benefits from higher up the stack.
> For example, our code used an IMS bridge exit. Like all good programmers
> we allocated storage for our usage, did the work and then freed the
> storage.
> This showed up as a hot spot 1) because of enqueue on the storage request
> - and 2) the (small) cost of the storage requests.
> We fixed this by passing in a block of storage for the routine to use - so
> we allocated it once, and used it millions of times.
> We had a global block with a pointer to the next free trace slot. The
> code to update this involved compare and swap. With 8 CPUs there was a
> lot of contention. We gave each TCB their own trace area and this
> contention just disappeared.
> One customer did the same DB2 SQL query in every CICS program - just to
> check a system-wide flag. It was pointed out this request was done
> 10,000 times a second. They changed this to checking a bit in a global
> block, and deferred the need to upgrade their CPUs.
>
> So yes, look at the instruction level ... but do not forget what you are
> trying to do. Remember there used to be discussions on *How many angels
> can dance on the head of a pin?*
>
> Colin
>
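
The trace-slot change Colin describes (a shared next-slot pointer updated with compare and swap, replaced by one trace area per TCB) can be sketched roughly as below. This is C rather than assembler, purely for illustration: C11 atomics stand in for the CS instruction and thread-local storage stands in for a per-TCB area. Every name here (TRACE_SLOTS, shared_trace, local_trace, my_trace and the functions) is hypothetical and not taken from the code being discussed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TRACE_SLOTS 256u  /* hypothetical table size, must be a power of two */
static_assert((TRACE_SLOTS & (TRACE_SLOTS - 1u)) == 0u,
              "TRACE_SLOTS must be a power of two");

/* Shared design: one global cursor that every thread updates. */
typedef struct {
    _Atomic uint32_t next;            /* index of the next free slot */
    uint64_t slot[TRACE_SLOTS];
} shared_trace;

/* Claim a slot with a compare-and-swap loop. Under load, each failed
 * exchange means another thread won the race and we must retry. */
static uint32_t shared_trace_claim(shared_trace *t) {
    uint32_t old = atomic_load(&t->next);
    while (!atomic_compare_exchange_weak(&t->next, &old,
                                         (old + 1u) & (TRACE_SLOTS - 1u))) {
        /* 'old' now holds the value another thread stored; try again */
    }
    return old;
}

/* Per-thread design: each thread owns its cursor and slots, so the
 * update is an ordinary load and store with nothing to contend on. */
typedef struct {
    uint32_t next;
    uint64_t slot[TRACE_SLOTS];
} local_trace;

static _Thread_local local_trace my_trace;

static uint32_t local_trace_claim(void) {
    uint32_t idx = my_trace.next;
    my_trace.next = (idx + 1u) & (TRACE_SLOTS - 1u);
    return idx;
}

/* Write one trace entry into the calling thread's own area. */
static void local_trace_write(uint64_t value) {
    my_trace.slot[local_trace_claim()] = value;
}
```

The price of the per-thread version is that entries are no longer in one globally ordered table, so anything reading the trace has to merge the areas (for example by timestamp) if overall ordering matters.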