On Wed, Jul 20, 2016 at 09:02:41PM +0200, Liam Proven wrote:
> On 19 July 2016 at 17:04, Peter Corlett <ab...@cabal.org.uk> wrote:
[...]
>> RISC implies a load-store architecture, so that claim is redundant.
> Could you expand on that, please? I think that IKWYM but I'm not sure.

A load-store architecture is one where the ALU only operates on registers. The name comes from having separate instructions to load registers from memory, and store them to memory.

The converse is register-memory, where ALU instructions can work directly on memory. However, this means that the instructions have to do quite a lot of work, because now data has to be brought in from memory to an anonymous register to be worked on and then stored back to the same location. This also results in a proliferation of instruction and addressing mode combinations. Sounds rather CISCy, doesn't it?

Meanwhile, a load-store architecture would have to decompose that into simpler independent load, operate, store instructions. Hey presto, RISC!

[...]
>> IMO, it's the predicated instructions that is ARM's special sauce and the
>> real innovation that gives it a performance boost. Without those, it'd be
>> just a 32 bit wide 6502 knockoff.
> Do tell...?

You've already answered the "6502 knockoff" elsethread, so I assume you're asking about the predicated instructions.

A predicated instruction is one that does or does not execute based on some condition. CISC machines generally use condition codes (aka flags), and only have predicated branch instructions. Branch-not-equal, that kind of thing.

In ARM, *all* instructions can be predicated. Because instructions are 32 bits wide, it has the luxury of allocating four bits to select one of 16 possible predicates based on the CPU flags. One predicate is "always", so one can also unconditionally execute instructions. An occasionally forgotten feature is that ALU operations also have an S-bit to indicate whether they should update the flags based on the result, or leave them alone.

Between these, a conditional branch over a handful of instructions can be replaced by making those instructions predicated, with the S-bit cleared so that they leave the flags alone.
Not only has the conditional branch been deleted completely from the instruction stream, which makes code noticeably more compact, but there's now no branch-induced pipeline stall. Special sauce.

Unsurprisingly, x86 eventually noticed this sort of thing is useful and pinched the idea, but did it in the usual half-arsed fashion that it is famous for.
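
For anyone who wants to poke at the predication and S-bit mechanics described above, here is a rough Python sketch. It is a toy model, not a real ARM emulator: the names `CONDITIONS`, `sub_with_flags`, `run` and `GCD` are made up for illustration, only five of the 16 predicates are modelled, and the four-instruction subtract-based GCD loop is simply a convenient example that needs one backward branch and no forward ones.

```python
# Toy model of ARM-style predicated execution (illustrative only).
MASK = 0xFFFFFFFF

# A subset of the 16 predicates, each a test on the N, Z, C, V flags.
CONDITIONS = {
    "AL": lambda f: True,                              # always
    "EQ": lambda f: f["Z"],                            # equal
    "NE": lambda f: not f["Z"],                        # not equal
    "GT": lambda f: not f["Z"] and f["N"] == f["V"],   # signed greater than
    "LT": lambda f: f["N"] != f["V"],                  # signed less than
}

def sub_with_flags(a, b):
    """Return (a - b) mod 2**32 plus the flags a flag-setting subtract gives."""
    result = (a - b) & MASK
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    flags = {
        "N": bool(sign_r),
        "Z": result == 0,
        "C": a >= b,  # carry set means "no borrow" on a subtract
        "V": sign_a != sign_b and sign_r != sign_a,  # signed overflow
    }
    return result, flags

# Each entry is (predicate, operation, operand, operand).
GCD = [
    ("AL", "CMP", "r0", "r1"),  # always: set flags from r0 - r1
    ("GT", "SUB", "r0", "r1"),  # only if r0 > r1; S-bit clear, flags kept
    ("LT", "SUB", "r1", "r0"),  # only if r0 < r1; still sees CMP's flags
    ("NE", "B", 0, None),       # loop back until the two are equal
]

def run(program, regs):
    flags = {"N": False, "Z": False, "C": False, "V": False}
    pc = 0
    while pc < len(program):
        cond, op, x, y = program[pc]
        pc += 1
        if not CONDITIONS[cond](flags):
            continue  # predicate false: the instruction does nothing
        if op == "CMP":
            _, flags = sub_with_flags(regs[x], regs[y])   # flags only
        elif op == "SUB":
            regs[x], _ = sub_with_flags(regs[x], regs[y])  # result only
        elif op == "B":
            pc = x
    return regs

print(run(GCD, {"r0": 48, "r1": 18}))  # {'r0': 6, 'r1': 6}
```

Note that the second SUB relies on the first one not touching the flags: both are predicated on the result of the single CMP. Without predication, the same loop would need a conditional branch around each subtract.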