> Parrot Forth
>
> Released: 14 October 2004
> Version: 0.1
> Download: http://matt.diephouse.com/software/parrot-forth-0.1.tar.gz
>
> This is the initial release of my re-implementation of Parrot Forth in
> PIR. Code reviews are both welcome and appreciated (PIR is kind of new,
> so I may not be doing everything correctly). The two main goals of this
> implementation are: to be a complete example of how to implement a
> Parrot compiler and to be as clean and readable as possible.
This is my first chance to take a look at it, but I'm sorry I've not been able to run it because I'm on a different machine. I did look at the code, though. It's no surprise that I too am writing a Forth-like language in PIR, but we have gone in some different directions. Again, I might be wrong about the details below, because I didn't get a good chance to analyze it.

There are some differences: I keep the stack in a register, you keep yours in a global; you store your core words in an "operations" global, while I use a Parrot lexical pad. I have no idea if storing something in a register is worse than a global. I have certainly had problems with IMCC stomping my well-known registers with $P? temp vars. Lexical pads vs. keeping your own hash are probably equivalent, but perhaps pads have some cross-language benefit.

This is a good opportunity, though, to discuss some of the things that I have been doing with Parakeet. I have completely re-written the Parakeet core to be only a simple eval/compile loop that understands only one word pattern:

    code <name> ... PIR ... next

This is, of course, standard Forth. With this micro-kernel I can then bootstrap the remaining Parakeet wordset from an outside file. A Standard Forth compliant wordset could be bootstrapped just as easily.

My problem now is that even this minimal Parakeet micro-kernel (parser, numeric converter) is still pretty Parakeet-specific: where the machine state goes (stack, IP if you have one, etc.) and where names are kept (Forth would use a "dictionary"; Parakeet uses nested lexical scope).

I propose you and I work together to make a Forth micro-kernel that is totally agnostic about the language built on top of it. The kernel can be very minimalistic: a stack, a machine-state hash, definitions for the words "code", "next", "word", and "'" (tick), all with standard Forth behavior, a simple dictionary, and a simple eval loop. The micro-kernel then just goes through some common stuff (parse args, load files, init blocks, etc.) and bootstraps the larger language from an optionally specified language file. At this point, a standard Forth bootstrap file would contain things like:

    code __init_forth
        # initialize more specific state for the language
        P30 = new .PerlArray    # memory buffer
        P29 = new .PerlArray    # data stack
        P28 = new .PerlArray    # return stack
        # ... or whatever
    next __init_forth

    code @
        # PIR definition for fetch
    next

    code !
        # PIR definition for store
    next

    ... etc.

In the init word, you bootstrap the machine/language-specific stuff that you intend to use in later words. I use these fixed P registers as an example here; your idea of using globals is better. Parakeet can go off in its own direction, defining a completely different language, but based on some shared code that we can both maintain.
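To make the proposal a little more concrete, here is a rough, untested sketch of what that shared eval loop might look like in PIR. The globals "dictionary" and "stack" and the helpers _next_word and _next_line are just placeholder names I made up for this sketch, and the compreg/"PIR" compile step would need to be checked against current Parrot, but the shape of it is simple:

    # Rough, untested sketch of the shared eval loop.  "dictionary",
    # "stack", _next_word and _next_line are placeholder names, not
    # anything that exists in Parrot Forth or Parakeet today.
    .sub _interpret
        .local pmc dict, stack, compiler, compiled
        .local string word, name, body, line
        .local int found
        dict  = find_global "dictionary"   # PerlHash: name -> compiled word
        stack = find_global "stack"        # PerlArray: the data stack
    LOOP:
        word = _next_word()                # next whitespace-delimited token
        if word == "" goto DONE            # out of input
        if word == "code" goto COMPILE     # the one special word
        found = exists dict[word]
        if found goto EXECUTE
        # neither "code" nor a defined word: treat it as a number
        $I0 = word                         # integers only, for brevity
        push stack, $I0
        goto LOOP
    EXECUTE:
        $P0 = dict[word]
        $P0()                              # run the word's compiled body
        goto LOOP
    COMPILE:
        name = _next_word()                # the word being defined
        body = ""
    BODY:
        line = _next_line()                # one raw line of PIR source
        if line == "next" goto FINISH      # "next" ends the definition
        body .= line                       # (whitespace handling omitted)
        body .= "\n"
        goto BODY
    FINISH:
        # wrap the collected PIR in a sub, compile it, and remember it
        $S0 = ".sub _tmp\n"
        $S0 .= body
        $S0 .= ".end\n"
        compiler = compreg "PIR"
        compiled = compiler($S0)           # assuming this yields something invokable
        dict[name] = compiled
        goto LOOP
    DONE:
    .end

Everything language-specific would then live in the bootstrap file, which is the whole point.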
Some Parakeet ideas might also be used in your code. For example, it looks to me like your code does direct threading:

    NOT_DOT_QUOTE:
        # check to see if it's an operation
        $P0 = find_global "operations"
        $S0 = downcase word
        $I0 = exists $P0[$S0]
        if $I0 != 1 goto NOT_AN_OP
        pir_code .= "$P0 = ops['"
        pir_code .= $S0
        pir_code .= "']\n"
        pir_code .= "$P0()\n"
        goto RETURN

Direct threading is a common Forth implementation technique, but it was most often used because it could be implemented portably in C with only a small bit of asm. For smaller ops like @ and !, the math ops, and many others, it is more efficient to use direct code generation to "inline" the PIR code itself instead of inlining an invoke of the PIR code compiled as a sub.

So the word

    : square dup * ;

in the direct-threaded case would become (in pseudo-PIR):

    .sub blah
        $P0 = ops["dup"]
        $P0()
        $P0 = ops["*"]
        $P0()
        .NEXT
    .end

but using native code generation would inline the definitions for dup and mul:

    .sub ncg
        .POP
        .NOS = .TOS
        .TOS = .TOS * .NOS
        .PUSH
        .NEXT
    .end

resulting in a lot less overhead for core words. NCG was usually either a commercial feature or rarely seen in Forth, because it was non-portable, being written in asm, and expensive to maintain on multiple platforms. We can kick that problem out the door.

Do you think this is a good idea? I can certainly help along with implementing words that Parakeet and Forth share. I myself have never implemented a complete Forth, but I've given a couple of them a stab, both direct-threaded models.

-Michel