On Wed, 9 Dec 2020 at 10:53, Charles Mills <[email protected]> wrote:
>
> I have been thinking about this. It is a daunting project. I once set out to
> develop a simple list of opcodes with their required minimum hardware level.
> I wanted to be able to answer questions of the form "management wants this
> product to be able to run on a z9. Can I use AHI?" In fact I think I asked
> for help on this list, and promised to share any results. I abandoned the
> effort! I decided it was easier just to code AHI and assemble with ZS-5 or
> whatever and see if I got an error. That list is obviously just one
> component of your product.

There has been a list at http://www.tachyonsoft.com/inst390o.htm for many
years. I'm not at all sure it's maintained these days, but I use it
frequently as a quick reference.

On the more general matter of analysing a load module/PO/UNIX file and
classifying the opcodes found therein, I think this is pretty much an AI
project. Surely it engages the Halting Problem for the general case. Given
sufficient non-code (data) space within the module, it is very likely that
some of the data will be interpreted as instructions, and notably many of
the instructions having their first byte in the range of EBCDIC letters are
from the newer architecture levels.

There are some other problems. In a few cases the behaviour of existing
instructions has been expanded with a new architectural level. Notably the
Long-Displacement Facility added use of a previously unused byte in around
50 existing instructions to go from a 12-bit unsigned to a 20-bit signed
displacement. This isn't insurmountable; finding just one such instruction
with a non-zero DH field means that that facility is required. And this
facility is pretty old now (2003), so it is unlikely not to be required
anyway.

The Interlocked-Access Facility 1 (2010) and 2 (2012) each turned some
existing instructions into interlocked-update ones. This included old
standbys like NI, OI, and XI. There is no way to tell from the instruction
itself whether the program is relying on the new behaviour.

The ETF2-Enhancement Facility (2005) added meaning to a bit in the existing
TRxx instructions (TROO, TROT, TRTO, and TRTT).

And there are very CISC facilities like the Message-Security-Assists that
are more like library calls supporting many subfunctions than like even the
most complex earlier instructions. A program can issue a query to find out
if a function is supported, or just assume that it is, so analysis of such
code would require data flow analysis or finding a hard-coded function code
that was introduced at a particular level.

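To make the Long-Displacement check above a bit more concrete, here is a
rough sketch in Python (chosen only because it is easy to run anywhere). It
assumes the six bytes handed to it really are an instruction, which is
exactly the part that is hard to establish. All the function names are made
up for illustration, and the opcode set holds only two examples (LG and
STG), not the full list of roughly 50 affected instructions.

```python
# Illustrative subset only: (first opcode byte, last opcode byte) pairs for
# two 6-byte instructions that existed before long displacement: LG and STG.
LONG_DISP_CANDIDATES = {(0xE3, 0x04), (0xE3, 0x24)}


def instruction_length(first_byte):
    """Instruction length in bytes, from the top two bits of the first byte."""
    return (2, 4, 4, 6)[first_byte >> 6]


def displacement20(insn):
    """Signed 20-bit displacement: DH (byte 4) is the high part, DL the low 12 bits."""
    dl = ((insn[2] & 0x0F) << 8) | insn[3]
    value = (insn[4] << 12) | dl
    # Bit 19 is the sign bit of the 20-bit value.
    return value - (1 << 20) if value & 0x80000 else value


def needs_long_displacement(insn):
    """True if insn is one of the listed candidates and its DH field is non-zero."""
    if len(insn) != 6 or instruction_length(insn[0]) != 6:
        return False
    return (insn[0], insn[5]) in LONG_DISP_CANDIDATES and insn[4] != 0


if __name__ == "__main__":
    lg = bytes.fromhex("E31233451204")  # LG with DL=X'345', DH=X'12'
    print(needs_long_displacement(lg), hex(displacement20(lg)))
```

A single hit is enough to say the facility is required; no hits proves
nothing, since the same bytes could equally well be data.
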
And then there is the EXecute instruction (actually two of them now).
Typically this is used to put the length into an MVC or CLC or the like.
Occasionally it has been used to put the SVC number into an SVC or the mask
into a TM. But given that part of the opcode in some instruction formats is
in the second byte, it's quite possible to change the target instruction
itself. So e.g. if you EX a STCK (B205) with the low-order byte of the
execute register containing X'40', you will instead execute a SQER (Square
Root short HFP) instruction (B245). This would be fine material for an
obfuscated programming contest, and is of course unlikely to be found in
real-life code. But who knows - maybe code obfuscators (human or software)
have used it to "protect" their code.

Tony H.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
