On Jan 12, 10:42 am, quans...@quanstro.net (erik quanstrom) wrote:
> > [...] Many architectures get register
> > windows wrong, but the Itanium has a variable-length register fill/
> > spill engine that gets invoked automatically. Of course, you can
> > program the engine too.
>
> what's the advantage of this over the stanford style?

I'm not sure what exactly you mean by that.

> > I also REALLY like predicated instructions.
>
> like arm?

ARM is fine, but Itanium predicated instructions allow you to have a
great number of predicate registers. This isn't like cmov and friends
either.

> > That is, you perform an operation and then predicate the instructions
> > that should execute if it comes out the way you want. It really
> > simplifies assembly-level if/then and switch-style blocks.
>
> unless it's an 8- or 16-bit part, i don't see why anyone cares
> if the assembly is simplier. but since this is an epic part,
> the assembly is never simple.

I don't know why bit size matters. Anyway, making the assembly simpler
has a lot of benefits. A human has to write the stuff at some point.
When there are bugs, a human has to read it. It also simplifies code
generation by the compiler.

> how do you get around the fact that the parallelism
> is limited by the instruction set and the fact that one
> slow sub-instruction could stall the whole instruction?

Parallelism isn't any more limited by the instruction set on Itanium
than it is anywhere else. The processor has multiple issue units that
can crunch multiple instructions in parallel. Some units can execute
multiple instructions per cycle.

> > The hardware also has built-in support for closures. Every function
> > executed is implicitly paired with a given local memory region.
>
> what's the difference between this and stack?

There is a massive difference. As the other poster pointed out,
closures are cool in and of themselves.

On x86 processors, you get 4 stacks, one for each privilege level. You
can change a stack anytime you want, but it requires either an
instruction to do so, or instruction patching by the loader. Everything
gets stuck there and there are very few restrictions about what you do
with stuff on the stack.

On Itanium you have two kinds of stacks AND a global pointer for local
memory accesses.
One kind of stack is much like what you are used to. The other kind of
stack is ONLY for the register spill/fill engine and cannot be
programmatically accessed while it's in use, which means that you can't
smash the stack and have the function return to an arbitrary location.
The global pointer is for indirect memory accesses, and allows you to
do all sorts of interesting things, from DLLs to simplified
thread-local storage.

> > There is a *lot* to like about Itanium.
>
> there's a lot not to like about itanium. epic means that
> instructions need to be hand-crufted. in itanium land, you
> schedule instructions. in x86-64 land, instructions
> schedule you.
>
> what's to like about that?

Quite a bit. Having the processor scan the incoming instruction stream
to locate potential parallelizations is ludicrous. It works fine when
the processor guesses correctly, but it is horrendously expensive when
the processor guesses wrong. Requiring that the processor scan incoming
instructions to suss out potential parallelizations also means that
much less die space for doing real work. Finally, the processor has
almost NO context about the instructions. A compiler has immensely more
context and can do a much better job indicating which instructions can
execute in parallel.

IA64 got a bad rap because the first hardware implementations of IA64
were less than stellar, and the compilers were harder to write than
expected. The Itanium-2 and modern compilers are actually quite nice.

-={C}=-
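
P.S. A rough sketch of the predication idea from the thread, written in
C rather than IA-64 assembly. The function names (clamp_branchy,
clamp_predicated, check_clamp_agrees) are made up for illustration, and
the "(p) op" notes in the comments only approximate how a predicated
instruction reads; they are not real compiler output. The point is the
shape: compute the predicates once, then guard each operation with a
predicate instead of jumping around it.

```c
#include <assert.h>

/* Branchy form: each `if` usually becomes a compare plus a conditional jump. */
int clamp_branchy(int x, int lo, int hi)
{
    if (x < lo)
        return lo;
    if (x > hi)
        return hi;
    return x;
}

/*
 * If-converted form: the comparisons produce predicate values, and every
 * following operation is guarded by one of them. No control flow is needed;
 * an operation whose predicate is 0 simply leaves r unchanged.
 */
int clamp_predicated(int x, int lo, int hi)
{
    int p_lo = (x < lo);          /* predicate: x is below the range      */
    int p_hi = !p_lo & (x > hi);  /* predicate: x is above the range      */
    int r = x;                    /* default result when neither is set   */

    r = p_lo ? lo : r;            /* approximately: (p_lo) r = lo         */
    r = p_hi ? hi : r;            /* approximately: (p_hi) r = hi         */
    return r;
}

/* Hypothetical helper: asserts that both forms give the same answer. */
void check_clamp_agrees(int x, int lo, int hi)
{
    assert(clamp_branchy(x, lo, hi) == clamp_predicated(x, lo, hi));
}
```

Whether a given compiler actually emits predicated code for the second
form depends on the target and the optimizer; the C only shows the
transformation, not a guarantee about the generated instructions.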