Introduction -------------------------------------------- I have a very alpha patch to add a parrot backend to the CHICKEN compiler. CHICKEN is a scheme to C compiler, and parrot is a continuation (rather than stack) based virtual machine. There are several design issues mapping chicken constructs to parrot constructs (do the words "calling conventions" come to mind? :) that need to get worked out. Rather than attach the patches to this email, I have stuck them on my web site at http://www.cs.wisc.edu/~lenz/parrot
http://www.call-with-current-continuation.org http://www.parrotcode.org Background -------------------------------------------- Chicken compiles scheme code into continuation-passing-style. It is heavily influenced by the paper http://home.pipeline.com/~hbaker1/CheneyMTA.html This is a good match to Parrot's continuation passing call/return convention, but there are a few problems. Most of the problems stem from the fact that IMCC and PIR assume that languages are exporting functions that act like traditional functions... they call other functions, and then return. Chicken exported functions are interesting, in that they never return, and every function call never returns; every function knows where it should go next and we never come back to a function after a function call (unless the function is referenced inside the continuation). The advantages are that since both chicken and parrot use a continuation passing style, they are a very good fit for each other. Chicken generated code never has to deal with a stack (the C backend needs to use longjmp() since no function ever returns...). Chicken generated code will take advantage of the just in time compilation that parrot provides, as well as all the libraries. It also will allow scheme code to be called (and to call) perl6, python, tcl, and the other languages that are being ported to run on parrot. Parrot Issues (scroll down a ways for Chicken issues) ------------------------------------------- 1) Calling Conventions. Chicken generates tons of little functions. The compiler (which is written in scheme and compiles itself) generates over 10,000 functions, each one on average 10-15 parrot instructions. Because of that, we can very efficiently manage the passing of arguments. That is, arguments for a call can be generated directly into registers. For example, here is a sample function generated by chicken. The function accepts 3 arguments (P1, P2, P3) , and calls with two arguments (P1, P2). # bar in k24 in k21 in k18 f_28: set P1, P2 new P2, .FixedPMCArray # Closure set P2, 2 set_addr I0, f_30 set P2[0], I0 new P4, .Ref setref P4, P3 set P2[1], P4 jump P1 I have yet to add argument length checking, but you get the idea. As you can see, I am currently just using set_addr and jump so as to avoid the massive overhead of passing arguments, when they are already in the right registers. Since scheme is dynamicly typed, everything is a PMC and we don't really use the other registers. It is fairly trivial to export a real sub that would use PPD3 that just wraps up this call to marshal between the "internal" calling convention to the PPD3. While this would work fine, I would like to support these calling conventions directly in parrot, so that the interpreter knows which sub it is currently running, tracing, etc. As such, I propose a fasttailcall opcode. This opcode acts exactly like the tailcall opcode, except it does not pass any arguments and leaves the registers alone. http://www.cs.wisc.edu/~lenz/parrot/fasttailcall.patch One way to think about and justify the fasttailcall opcode is to think about the difference between functions and chicken exported functions. For example, the scheme function (define (myfunc a b) (let ((c (* a 6))) (print (+ a b c)) (- a b))) might generate 10 function-pieces. Thus, what you would really like is for the entire "myfunc" to be a function that is presented to the rest of parrot... that is the only function that gets registered, it uses the new PPD3 calling conventions, etc. But internally, we split up the execution of that function into 10 function-pieces. Then we can fasttailcall between function pieces, which really just means continue working on the current function. Thus, since you are still working on the same function, leaving the registers and not invoking the entire PPD3 calling conventions makes sense when switching between pieces. 2) PIR doesn't fit so well. First off, we have already done all the register allocation. We never need to save registers across calls (since nothing returns), and the only if-dependent code looks like if I0, truecode ...setup args for false fasttailcall truecode: ..setup args for true fasttailcall There is no register allocation that needs to take place, except for overflow. Chicken generates code that assumes infinite registers. In practice, most functions are using less than 10 registers. From what I can see, the current register allocator in IMCC would die on this type of input with all that set_addr, jump, etc. This issue will probably go away if parrot itself gets variable register frames... then IMCC register allocation is basicly a no-op, espically when it sees the fasttailcall opcode... I'm not so sure about this point, it would need to be investigated more. The output currently can work with PASM, but I would like access to many of the optimizations in IMCC. Lastly, if we do use PIR, we would have the function-fragments outside of actual .Sub decelerations... thus we would have code that looked like f_50: whatever, using register argument passing fasttailcall f_57: aslkf fasttailcall ... .Sub myfunc get_args ... call f_50, passing P1 as the continuation. .end 3) Closures Chicken generates a lot of closures, you can see a closure generated in the above example code. Chicken never actually uses what parrot calls a continuation, since the generated code is completely continuation aware, it just creates and passes around closures. The first issue is that since we are currently using set_addr and jump, we use a FixedPMCArray to represent a closure. Secondly, chicken is a lot smarter than "normal" code, since it is continuation aware. Thus, the closure PMC in parrot does not fit correctly for us. The closure PMC saves the entire local variable stack, but chicken knows which locals will actually be referenced in the function being closed over. Thus, we represent a closure as an array, where when creating the closure we set the elements in the array to all the local variables that are needed, and then inside the function access that array (since every function gets its own closure as the first argument). Thus, I propose a ClosureArray PMC type that ignores the pad_stack. It will inherit from both Sub and FixedPMCArray. It stores the array in the data section just like a FixedPMCArray, but you can call invoke on it. Since chicken generated code does not use the local variable support provided by parrot, we would need to save locals for the express purpose of closing over them... The rational for including the ClosureArray PMC is that for smarter compilers which manage locals themselves, using an array is much faster than the storing every possible local that might be used on the stack. I started trying to work on this PMC, but don't understand the internals well enough yet. Otherwise, I will represent closures as a FixedPMCArray, with a Sub as the first element in the array. 4) Standard PMCs. Almost all scheme types are already covered by PMCs, except for a few missing ones. The first is the pair type. We need a standard pair type rather than the DictionaryEntry type that the current pair is (See my other message on the PMCs: Should We Use Them). The second is the scheme "EndOfList" and "EndOfFile" types. These types act exactly like Undef, except for the two functions (null?) and (eof?) which check if the type is EndOfList or EndOfFile respectively. Thus, for scheme we only really need two dynclasses SchemeEndOfList and SchemeEndOfFile. These just derive from Undef, and act exactly like Undef except for find_type. Since they are so simple, we might consider making them standard PMCs. http://www.cs.wisc.edu/~lenz/parrot/dynclasses.patch That should be it for the major parrot issues, now come the chicken issues! Chicken Issues ------------------------------------------- 1) The code currently assumes that on a ##core#bind node, (first params) is always 1 and subs contains two entries. Supporting more than 1 bind at a time makes the code a lot more complicated, and I could not see where that will ever be more than 1. If it is more than one, before generating parrot code we might need to transform it to be two ##core#binds. 2) I also assume that ##core#setlocal always appears directly below a ##core#bind, and that there is only one ##core#setlocal under that bind. Thus, the only valid tree is (##core#bind (##core#setlocal expr) (class expr)). That is, ##core#setlocal always comes in a pair with a ##core#bind directly above it. Again, if this isn't always the case, the easiest way would be to transform the tree before (or as the first step) when generating parrot code 3) I don't really understand the purpose of the c-platform.scm file... I just copied c-platform.scm to parrot-platform.scm and it seems to work. 4) FFI does not currently work, but I believe it would be possible and simple to add it. Parrot supports what it calls the NCI interface, and we can easily map FFI to NCI http://www.parrotcode.org/docs/pdd/pdd16_native_call.html 5) ##core#inline I am implementing in the parrot-inline.scm file... I haven't started implementing them, but most of them can be just function calls into the right parrot library (like all the hash table stuff, io stuff, etc). Most of the usages in library.scm are just fine, but if the ##core#inline parameter contains more than just the function name, this won't work and will need to be changed... see point 6. 6) Some usages of foreign-lambda* won't really be portable between the C and parrot backends. Parrot does have a C interface, so for example the foreign-lambda inside tinyclos.scm could be written in about the same number of lines of code. So chicken could export whatever it finds in there, but the code that gets passed would need to use the parrot API instead of the chicken API. We need to look at and define some rules about embedding c code (that uses the chicken API). If the code uses just normal C types, we can support the function just fine using NCI. (I have also been thinking about modifying SWIG to work with parrot... that would be another option above and beyond NCI). Well, that should be enough issues for one round :) If most or all of these get worked out, we can get a full scheme compiler running on parrot, that can even compile itself! John