> According to an old 486 book I have, it claims
> that complex addressing modes don't have cycle
> penalties for leaving out the scale or the offset.
> That seems hard to believe for the RISC-like
> P3s and Athlons.

A scale of x4 is just a bit shift, so it shouldn't be hard to believe that the Pentium+ micro-ops can handle this just as efficiently (it's just setup overhead).

> What about other processors? Is it common to
> have address modes like:
>
> base+offset*scale

I'm not sure, but I thought SPARC had a special 3-way add somewhere. In any case, with pipelines as deep as they are these days, I doubt that your quest to achieve 100 mops for a 3 instruction loop is going to have much impact on more practical code (of several hundred or thousand instructions per loop). Unless you're willing to write C-code compilers differently for different architectures, I doubt you're going to find a universal performance tweak. And I'd definitely cringe at the idea of complicating code for special purposes this early in the game.

> Most RISC instruction sets only provide base +
> constant offset don't they?

Alpha and, I think, Sun provide instructions that scale one register by 4, 8 and 16 bytes. Again, it's just a simple trick played out in the decode phase.

-Michael
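
To make the scale-as-shift point concrete, here is a rough C sketch, not anything taken from a real compiler. The helper name effective_address and the two sum functions are made up for illustration, and the sketch assumes a flat address space where casting a pointer to uintptr_t gives a plain byte address. It shows that base + index*scale is one shift and one add, and that the indexed and pointer-bump forms of a loop compute the same thing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: address = base + (index << shift) + disp.
   A shift of 0..3 corresponds to a scale of x1, x2, x4 or x8. */
static uintptr_t effective_address(uintptr_t base, uintptr_t index,
                                   unsigned shift, uintptr_t disp)
{
    assert(shift <= 3);
    return base + (index << shift) + disp;
}

/* Indexed form: each access is base + i*4, which a compiler may
   fold into a single scaled-index load where the target has one. */
static long sum_indexed(const int32_t *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer-bump form: only needs a plain base-register load,
   the kind of addressing most RISC sets provide. */
static long sum_pointer(const int32_t *a, size_t n)
{
    long total = 0;
    for (const int32_t *p = a, *end = a + n; p != end; p++)
        total += *p;
    return total;
}
```

Which form ends up faster depends on the target and on what the compiler does with it. The sketch only shows that the two are interchangeable at the source level, so there is no need to hand-tune the C for one addressing mode.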