> Parrot Forth
>
> Released: 14 October 2004
> Version: 0.1
> Download: http://matt.diephouse.com/software/parrot-forth-0.1.tar.gz
>
> This is the initial release of my re-implementation of Parrot Forth in
> PIR. Code reviews are both welcome and appreciated (PIR is kind of new,
> so I may not be doing everything correctly). The two main goals of this
> implementation are: to be a complete example of how to implement a
> Parrot compiler and to be as clean and readable as possible.
This is my first chance to take a look at it, but I'm sorry I've not been able to run it because I'm on a different machine. I did look at the code, though. It's no surprise that I too am writing a Forth-like language in PIR, but we have gone in some different directions. Again, I might be wrong about the details below, because I didn't get a good chance to analyze it.

There are some differences: I keep the stack in a register, you keep yours in a global; you store your core words in an "operations" global, while I use a Parrot lexical pad. I have no idea if storing something in a register is worse than a global. I have certainly had problems with IMCC stomping my well-known registers with $P? temp vars. Lexical pads vs. keeping your own hash are probably equivalent, but perhaps pads have some cross-language benefit.

This is a good opportunity, though, to discuss some of the things that I have been doing with Parakeet. I have completely re-written the Parakeet core to be only a simple eval/compile loop that understands only one word pattern:

    code <name> ... PIR ... next

This is, of course, standard Forth. With this micro-kernel I can then bootstrap the remaining Parakeet wordset from an outside file. A Standard Forth compliant wordset could be bootstrapped just as easily.

My problem now is that even this minimal Parakeet micro-kernel (parser, numeric converter) is still pretty Parakeet-specific: where the machine state goes (stack, IP if you have one, etc.) and where names are kept (Forth would use a "dictionary"; Parakeet uses nested lexical scope).

I propose you and I work together to make a Forth micro-kernel that is totally agnostic about the language built on top of it. The kernel can be very minimalistic: a stack, a machine-state hash, definitions for the words "code", "next", "word", and "'" (tick), all with standard Forth behavior, a simple dictionary, and a simple eval loop. The micro-kernel then just goes through some common stuff (parse args, load files, init blocks, etc.) and bootstraps the larger language from an optionally specified language file. At this point, a standard Forth bootstrap file would contain things like:

    code __init_forth
        # initialize more specific state for the language
        P30 = new .PerlArray    # memory buffer
        P29 = new .PerlArray    # data stack
        P28 = new .PerlArray    # return stack
        # ... or whatever
    next __init_forth

    code @
        # PIR definition for fetch
    next

    code !
        # PIR definition for store
    next

    ... etc.

In the init word, you bootstrap the machine/language-specific stuff that you intend to use in later words. I use these fixed P registers as an example here; your idea of using globals is better. Parakeet can go off in its own direction, defining a completely different language, but based on some shared code that we can both maintain.
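To make the proposal a little more concrete, here is a rough, untested sketch of what that shared eval loop might look like in PIR. The globals "dictionary" and "stack" and the helpers _next_word and _next_line are just placeholder names I made up for this sketch, and the compreg/"PIR" compile step would need to be checked against current Parrot, but the shape of it is simple:

    # Rough, untested sketch of the shared eval loop.  "dictionary",
    # "stack", _next_word and _next_line are placeholder names, not
    # anything that exists in Parrot Forth or Parakeet today.
    .sub _interpret
        .local pmc dict, stack, compiler, compiled
        .local string word, name, body, line
        .local int found
        dict  = find_global "dictionary"   # PerlHash: name -> compiled word
        stack = find_global "stack"        # PerlArray: the data stack
    LOOP:
        word = _next_word()                # next whitespace-delimited token
        if word == "" goto DONE            # out of input
        if word == "code" goto COMPILE     # the one special word
        found = exists dict[word]
        if found goto EXECUTE
        # neither "code" nor a defined word: treat it as a number
        $I0 = word                         # integers only, for brevity
        push stack, $I0
        goto LOOP
    EXECUTE:
        $P0 = dict[word]
        $P0()                              # run the word's compiled body
        goto LOOP
    COMPILE:
        name = _next_word()                # the word being defined
        body = ""
    BODY:
        line = _next_line()                # one raw line of PIR source
        if line == "next" goto FINISH      # "next" ends the definition
        body .= line                       # (whitespace handling omitted)
        body .= "\n"
        goto BODY
    FINISH:
        # wrap the collected PIR in a sub, compile it, and remember it
        $S0 = ".sub _tmp\n"
        $S0 .= body
        $S0 .= ".end\n"
        compiler = compreg "PIR"
        compiled = compiler($S0)           # assuming this yields something invokable
        dict[name] = compiled
        goto LOOP
    DONE:
    .end

Everything language-specific would then live in the bootstrap file, which is the whole point.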
Some Parakeet ideas might also be used in your code. For example, it looks to me like your code does direct threading:

    NOT_DOT_QUOTE:
        # check to see if it's an operation
        $P0 = find_global "operations"
        $S0 = downcase word
        $I0 = exists $P0[$S0]
        if $I0 != 1 goto NOT_AN_OP
        pir_code .= "$P0 = ops['"
        pir_code .= $S0
        pir_code .= "']\n"
        pir_code .= "$P0()\n"
        goto RETURN

Direct threading is a common Forth implementation technique, but it was most often used because it could be implemented portably in C with only a small bit of asm. For smaller ops like @ and !, the math ops, and many others, it is more efficient to use direct code generation to "inline" the PIR code itself instead of inlining an invoke of the PIR code compiled as a sub.

So the word

    : square dup * ;

in the direct-threaded case would become (in pseudo-PIR):

    .sub blah
        $P0 = ops["dup"]
        $P0()
        $P0 = ops["*"]
        $P0()
        .NEXT
    .end

but using native code generation would inline the definitions for dup and mul:

    .sub ncg
        .POP
        .NOS = .TOS
        .TOS = .TOS * .NOS
        .PUSH
        .NEXT
    .end

resulting in a lot less overhead for core words. NCG was usually either a commercial feature or rarely seen in Forth, because it was non-portable, being written in asm, and expensive to maintain on multiple platforms. We can kick that problem out the door.

Do you think this is a good idea? I can certainly help along with implementing words that Parakeet and Forth share. I myself have never implemented a complete Forth, but I've given a couple of them a stab, both direct-threaded models.

-Michel