Dan Sugalski <[EMAIL PROTECTED]> wrote:
> Here's some stuff we need to add to the packfile format and the sub header to get things ready for more language work.
First: any changes here imply that assemble.pl/disassemble.pl will cease to work. So the first step would be: grep the tree and remove all traces of these programs.
I'm fine with assemble.pl in its current incarnation going away. I'd prefer to keep disassemble.pl, or at least *some* disassembler, around. I'm not sure if we want to build disassembly functionality into Parrot. (Mainly for size of executable issues--I'm fine with shipping a disassembler as part of the parrot kit, as well as having the disassembler as a loadable library)
> Packfiles need to have a symbol table. A series of name/type/location tuples so we can have global names that map to values in the bytecode, either variables or subroutines.
The successor/extension of the current fixup table, which has name/location.
Yep.
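
A minimal sketch of what one name/type/location entry and a by-name lookup might look like. Python is used purely for illustration; `SymbolType`, `SymbolEntry`, `index_by_name`, and the choice of a bytecode offset as the "location" are all assumptions, not part of any actual packfile layout.

```python
from dataclasses import dataclass
from enum import Enum


class SymbolType(Enum):
    VARIABLE = "variable"
    SUBROUTINE = "subroutine"


@dataclass(frozen=True)
class SymbolEntry:
    name: str          # global name
    type: SymbolType   # what the name refers to
    location: int      # assumed: offset into the segment's bytecode


def index_by_name(entries):
    """Turn a flat series of entries into a name -> entry mapping."""
    return {entry.name: entry for entry in entries}


# Hypothetical contents of one packfile's symbol table.
entries = [
    SymbolEntry("main", SymbolType.SUBROUTINE, 0),
    SymbolEntry("counter", SymbolType.VARIABLE, 128),
]
symbols = index_by_name(entries)
```

Compared with a fixup table that stores only name/location, the extra `type` field lets a loader decide what kind of backing object to construct for each name.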
> ... When the bytecode is loaded, we need to put those symbols in the symbol table and construct the backing PMCs.
Which backing PMCs? The subroutine itself? And how are global symbols related to the global stash?
Sub PMCs and normal data PMCs. This is an extension of the constant PMC issue that we need to address.
> ... When we invoke a sub (or closure, or continuation, or whatever) we need to put that packfile pointer and bytecode start/end pointer in place,
First is: How do we load subs/packages/modules? pdd06 has C<load_bytecode sx>; does that prederef or JIT at load time, or should that happen on the first call into the new bytecode segment? We could set up a dummy entry which generates JIT code and then replace the entry with the real code.
JITting and pre-dereffing probably ought to happen on a per-sub basis, and be done on first entry to a sub. That'll keep us from spending a lot of time JITting libraries when we don't use most of the code in the libraries. If we want we can probably have some sort of runtime setting to JIT entire segments on load, which could be useful for preforking servers.
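
A rough sketch of the per-sub, first-entry approach, with an optional switch to prepare everything at load time. All names here (`LazySub`, `load_segment`, `jit_on_load`, `fake_compile`) are hypothetical, and `fake_compile` only stands in for real JIT or prederef work.

```python
class LazySub:
    """A sub whose compiled form is produced on first entry (hypothetical)."""

    def __init__(self, name, bytecode, compile_fn):
        self.name = name
        self.bytecode = bytecode
        self._compile_fn = compile_fn
        self._compiled = None  # filled in by prepare()

    def prepare(self):
        # Compile at most once; later calls reuse the cached result.
        if self._compiled is None:
            self._compiled = self._compile_fn(self.bytecode)

    def invoke(self, *args):
        self.prepare()
        return self._compiled(*args)


def load_segment(subs, jit_on_load=False):
    """Register subs by name; optionally compile them all up front."""
    if jit_on_load:
        for sub in subs:
            sub.prepare()
    return {sub.name: sub for sub in subs}


compile_log = []


def fake_compile(bytecode):
    # Stand-in for JIT/prederef: record the work and return a callable.
    compile_log.append(bytecode)
    return lambda *args: sum(args)


library = load_segment([
    LazySub("add", "add-body", fake_compile),
    LazySub("unused", "unused-body", fake_compile),
])
first = library["add"].invoke(1, 2)
second = library["add"].invoke(3, 4)
```

After these two calls only `add` has been compiled; `unused` costs nothing until someone enters it. Passing `jit_on_load=True` models the "JIT the whole segment on load" setting, which a preforking server might want so the work is done once in the parent.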
Also implied is:
* constant table is per code segment
Yep.
* fixup table too, and we build a global symbol table when loading these symbols, so we can find them in one place.
Yep.
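
One way to picture the two points above: each segment owns its constant table and fixup table, and loading a segment copies its fixups into a single global table. This is only a sketch; `CodeSegment`, `GlobalSymbols`, and the error-on-clash policy are assumptions, since the thread does not say how name collisions should be handled.

```python
class CodeSegment:
    def __init__(self, name, constants, fixups):
        self.name = name
        self.constants = list(constants)  # constant table private to this segment
        self.fixups = dict(fixups)        # symbol name -> offset within this segment


class GlobalSymbols:
    def __init__(self):
        self._table = {}  # symbol name -> (segment, offset)

    def load(self, segment):
        for name, offset in segment.fixups.items():
            if name in self._table:
                # Assumed policy: a clash is an error.
                raise KeyError(f"symbol {name!r} already defined")
            self._table[name] = (segment, offset)

    def find(self, name):
        return self._table[name]


core = CodeSegment("core", ["hello"], {"greet": 0})
util = CodeSegment("util", [42], {"answer": 16})

table = GlobalSymbols()
table.load(core)
table.load(util)
segment, offset = table.find("answer")
```

A lookup returns the owning segment along with the offset, so the caller can reach the right per-segment constant table without searching.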
* this should play together with threads. We have per thread data inside the bytecode segment (prederef and JIT code). Prederef code could use an addressing scheme like [interpreter + addr], which would be thread independent. For JIT this will not work easily. I'm thinking here of a code segment PMC which has some shared data and some thread local data.
Yep. That's an issue too.
Next is: How do we actually call code in a different bytecode segment? Currently we fall out of the runloop, start/end/pc of the new code segment are set and then we restart there, at the desired address. This is for sure suboptimal.
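
To make the "some shared data and some thread local data" idea concrete, here is a small sketch of a segment object that keeps the bytecode shared and gives each thread its own derived copy. `CodeSegmentData` and its `prederef` method are hypothetical, and doubling each op is just a placeholder for real prederef or JIT output.

```python
import threading


class CodeSegmentData:
    def __init__(self, bytecode):
        self.bytecode = tuple(bytecode)   # shared, read-only across threads
        self._local = threading.local()   # per-thread storage

    def prederef(self):
        """Return this thread's derived code, building it on first use."""
        cache = getattr(self._local, "prederef", None)
        if cache is None:
            # Placeholder transformation standing in for prederef/JIT.
            cache = [op * 2 for op in self.bytecode]
            self._local.prederef = cache
        return cache


segment = CodeSegmentData([1, 2, 3])
results = {}


def worker(key):
    results[key] = segment.prederef()


threads = [threading.Thread(target=worker, args=(k,)) for k in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both threads see the same `bytecode` tuple, but each ends up with its own list in `results`, so one thread's derived code never aliases another's.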
Yep. Luckily it's not actually an issue, which is nice.
Calling into another bytecode segment is simple--you just make a call to a sub/method/function that lives in that segment. The sub PMCs are either held in variables (globals or lexicals) or passed in as parameters, so they're available to use. The bounds-checking runloop needs reworking anyway, since it really needs to check on a per-sub basis rather than a per-segment basis when it's being used. (And the only time it really ought to be used is when we're executing with some form of safety turned on.)
> There's a bit more, but this is a good start. If people want to bat this around some, we can put together a list in the repository and start getting it implemented.
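
A toy illustration of a sub object that carries its own segment and start/end bounds, with a runloop that checks the program counter against the sub rather than the whole segment. The `Sub` record, the three-op instruction set, and `run` are all invented for this sketch and do not reflect Parrot's real ops or runloop.

```python
from dataclasses import dataclass


class BoundsError(Exception):
    pass


@dataclass(frozen=True)
class Sub:
    segment: tuple  # bytecode of the segment the sub lives in
    start: int      # first op belonging to this sub
    end: int        # one past the last op belonging to this sub


def run(sub, checked=True):
    """Toy runloop. Ops: ("add", n), ("jump", target), ("ret",)."""
    acc = 0
    pc = sub.start
    while True:
        if checked and not (sub.start <= pc < sub.end):
            raise BoundsError(f"pc {pc} outside sub [{sub.start}, {sub.end})")
        op = sub.segment[pc]
        if op[0] == "add":
            acc += op[1]
            pc += 1
        elif op[0] == "jump":
            pc = op[1]
        elif op[0] == "ret":
            return acc
        else:
            raise ValueError(f"unknown op {op!r}")


# One library segment holding two subs; the second jumps into the first.
lib = (("add", 1), ("ret",), ("add", 10), ("jump", 0))
first = Sub(lib, 0, 2)
second = Sub(lib, 2, 4)

# The caller only needs the Sub object, e.g. from a global or a parameter.
visible = {"first": first, "second": second}
plain = run(visible["first"])
unchecked = run(visible["second"], checked=False)
```

With `checked=True`, running `second` raises `BoundsError` at the jump to offset 0, even though that offset is valid for the segment as a whole. That is the difference between per-sub and per-segment checking.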
Good idea.
I'll throw something together and see about getting it checked into docs/ and we can go from there.
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk