Scheme, Chicken, and Parrot

John Lenz Thu, 07 Jul 2005 23:39:45 -0700

Introduction
--------------------------------------------
I have a very alpha patch to add a parrot backend to the CHICKEN compiler.
 CHICKEN is a scheme to C compiler, and parrot is a continuation (rather
than stack) based virtual machine.  There are several design issues
mapping chicken constructs to parrot constructs (do the words "calling
conventions" come to mind? :) that need to get worked out.  Rather than
attach the patches to this email, I have stuck them on my web site at
http://www.cs.wisc.edu/~lenz/parrot


http://www.call-with-current-continuation.org
http://www.parrotcode.org

Background
--------------------------------------------
Chicken compiles scheme code into continuation-passing-style.  It is
heavily influenced by the paper
http://home.pipeline.com/~hbaker1/CheneyMTA.html  This is a good match to
Parrot's continuation passing call/return convention, but there are a few
problems.  Most of the problems stem from the fact that IMCC and PIR
assume that languages are exporting functions that act like traditional
functions... they call other functions, and then return.  Chicken exported
functions are interesting, in that they never return, and every function
call never returns; every function knows where it should go next and we
never come back to a function after a function call (unless the function
is referenced inside the continuation).

The advantages are that since both chicken and parrot use a continuation
passing style, they are a very good fit for each other.  Chicken generated
code never has to deal with a stack (the C backend needs to use longjmp()
since no function ever returns...).  Chicken generated code will take
advantage of the just in time compilation that parrot provides, as well as
all the libraries.  It also will allow scheme code to be called (and to
call) perl6, python, tcl, and the other languages that are being ported to
run on parrot.

Parrot Issues  (scroll down a ways for Chicken issues)
-------------------------------------------
1) Calling Conventions.  Chicken generates tons of little functions.  The
compiler (which is written in scheme and compiles itself) generates over
10,000 functions, each one on average 10-15 parrot instructions.  Because
of that, we can very efficiently manage the passing of arguments.  That
is, arguments for a call can be generated directly into registers.
For example, here is a sample function generated by chicken.  The function
accepts 3 arguments (P1, P2, P3) , and calls with two arguments (P1, P2).

# bar in k24 in k21 in k18
f_28:
set P1, P2
new P2, .FixedPMCArray # Closure
set P2, 2
set_addr I0, f_30
set P2[0], I0
new P4, .Ref
setref P4, P3
set P2[1], P4
jump P1

I have yet to add argument length checking, but you get the idea.  As you
can see, I am currently just using set_addr and jump so as to avoid the
massive overhead of passing arguments, when they are already in the right
registers.  Since scheme is dynamicly typed, everything is a PMC and we
don't really use the other registers.

It is fairly trivial to export a real sub that would use PPD3 that just
wraps up this call to marshal between the "internal" calling convention to
the PPD3.  While this would work fine, I would like to support these
calling conventions directly in parrot, so that the interpreter knows
which sub it is currently running, tracing, etc.

As such, I propose a fasttailcall opcode.  This opcode acts exactly like
the tailcall opcode, except it does not pass any arguments and leaves the
registers alone.
http://www.cs.wisc.edu/~lenz/parrot/fasttailcall.patch

One way to think about and justify the fasttailcall opcode is to think
about the difference between functions and chicken exported functions. 
For example, the scheme function
(define (myfunc a b)
  (let ((c (* a 6)))
    (print (+ a b c))
    (- a b)))
might generate 10 function-pieces.  Thus, what you would really like is
for the entire "myfunc" to be a function that is presented to the rest of
parrot... that is the only function that gets registered, it uses the new
PPD3 calling conventions, etc.  But internally, we split up the execution
of that function into 10 function-pieces.  Then we can fasttailcall
between function pieces, which really just means continue working on the
current function.  Thus, since you are still working on the same function,
leaving the registers and not invoking the entire PPD3 calling conventions
makes sense when switching between pieces.

2) PIR doesn't fit so well.  First off, we have already done all the
register allocation.  We never need to save registers across calls (since
nothing returns), and the only if-dependent code looks like
if I0, truecode
...setup args for false
fasttailcall
truecode:
..setup args for true
fasttailcall
There is no register allocation that needs to take place, except for
overflow.  Chicken generates code that assumes infinite registers.  In
practice, most functions are using less than 10 registers.  From what I
can see, the current register allocator in IMCC would die on this type of
input with all that set_addr, jump, etc.  This issue will probably go away
if parrot itself gets variable register frames... then IMCC register
allocation is basicly a no-op, espically when it sees the fasttailcall
opcode...

I'm not so sure about this point, it would need to be investigated more. 
The output currently can work with PASM, but I would like access to many
of the optimizations in IMCC.

Lastly, if we do use PIR, we would have the function-fragments outside of
actual .Sub decelerations... thus we would have code that looked like

f_50:
whatever, using register argument passing
fasttailcall

f_57:
aslkf
fasttailcall

...

.Sub myfunc
get_args ...
call f_50, passing P1 as the continuation.
.end

3) Closures
Chicken generates a lot of closures, you can see a closure generated in
the above example code. Chicken never actually uses what parrot calls a
continuation, since the generated code is completely continuation aware,
it just creates and passes around closures.  The first issue is that since
we are currently using set_addr and jump, we use a FixedPMCArray to
represent a closure.  Secondly, chicken is a lot smarter than "normal"
code, since it is continuation aware.  Thus, the closure PMC in parrot
does not fit correctly for us.  The closure PMC saves the entire local
variable stack, but chicken knows which locals will actually be referenced
in the function being closed over.  Thus, we represent a closure as an
array, where when creating the closure we set the elements in the array to
all the local variables that are needed, and then inside the function
access that array (since every function gets its own closure as the first
argument).

Thus, I propose a ClosureArray PMC type that ignores the pad_stack.  It
will inherit from both Sub and FixedPMCArray.  It stores the array in the
data section just like a FixedPMCArray, but you can call invoke on it. 
Since chicken generated code does not use the local variable support
provided by parrot, we would need to save locals for the express purpose
of closing over them...

The rational for including the ClosureArray PMC is that for smarter
compilers which manage locals themselves, using an array is much faster
than the storing every possible local that might be used on the stack.  I
started trying to work on this PMC, but don't understand the internals
well enough yet.  Otherwise, I will represent closures as a FixedPMCArray,
with a Sub as the first element in the array.

4) Standard PMCs.  Almost all scheme types are already covered by PMCs,
except for a few missing ones.  The first is the pair type.  We need a
standard pair type rather than the DictionaryEntry type that the current
pair is (See my other message on the PMCs: Should We Use Them).

The second is the scheme "EndOfList" and "EndOfFile" types.  These types
act exactly like Undef, except for the two functions (null?) and (eof?)
which check if the type is EndOfList or EndOfFile respectively.  Thus, for
scheme we only really need two dynclasses SchemeEndOfList and
SchemeEndOfFile.  These just derive from Undef, and act exactly like Undef
except for find_type.  Since they are so simple, we might consider making
them standard PMCs.
http://www.cs.wisc.edu/~lenz/parrot/dynclasses.patch

That should be it for the major parrot issues, now come the chicken issues!

Chicken Issues
-------------------------------------------
1) The code currently assumes that on a ##core#bind node, (first params)
is always 1 and subs contains two entries.  Supporting more than 1 bind at
a time makes the code a lot more complicated, and I could not see where
that will ever be more than 1.  If it is more than one, before generating
parrot code we might need to transform it to be two ##core#binds.

2) I also assume that ##core#setlocal always appears directly below a
##core#bind, and that there is only one ##core#setlocal under that bind. 
Thus, the only valid tree is (##core#bind (##core#setlocal expr) (class
expr)).  That is, ##core#setlocal always comes in a pair with a
##core#bind directly above it.  Again, if this isn't always the case, the
easiest way would be to transform the tree before (or as the first step)
when generating parrot code

3) I don't really understand the purpose of the c-platform.scm file... I
just copied c-platform.scm to parrot-platform.scm and it seems to work.

4) FFI does not currently work, but I believe it would be possible and
simple to add it.  Parrot supports what it calls the NCI interface, and we
can easily map FFI to NCI
 http://www.parrotcode.org/docs/pdd/pdd16_native_call.html

5) ##core#inline I am implementing in the parrot-inline.scm file... I
haven't started implementing them, but most of them can be just function
calls into the right parrot library (like all the hash table stuff, io
stuff, etc).  Most of the usages in library.scm are just fine, but if the
##core#inline parameter contains more than just the function name, this
won't work and will need to be changed... see point 6.

6) Some usages of foreign-lambda* won't really be portable between the C
and parrot backends.  Parrot does have a C interface, so for example the
foreign-lambda inside tinyclos.scm could be written in about the same
number of lines of code.  So chicken could export whatever it finds in
there, but the code that gets passed would need to use the parrot API
instead of the chicken API.  We need to look at and define some rules
about embedding c code (that uses the chicken API).  If the code uses just
normal C types, we can support the function just fine using NCI.  (I have
also been thinking about modifying SWIG to work with parrot... that would
be another option above and beyond NCI).

Well, that should be enough issues for one round :)  If most or all of
these get worked out, we can get a full scheme compiler running on parrot,
that can even compile itself!

John

Scheme, Chicken, and Parrot

Reply via email to