Since this has gotten to be an issue...

We've definitely got performance issues with continuations. This is Not Good, since we're using them for control flow. Leo's hacked in a performance fix, but it's got its own issues. I think it's time for some thought, and a more unified solution.

Making a continuation conceptually has three steps. One must:

1) Copy the current environment contents to the continuation
2) Note the bytecode address at which the continuation should run
3) Mark the stacks as COW

Step 3 is universally required; *however*, we can skip it *if* we mandate that return continuations can't be used for anything other than returning. I'm not sure that's a good idea, but we can do it. We can also do it transparently if it turns out that it's a bad idea.
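
To make the three steps concrete, here's a rough sketch in C. Every name in it (Env, Stack, Interp, Continuation, make_continuation, the return_only flag) is made up for illustration and isn't the real interpreter structure or API; it only shows where the COW marking could be skipped for a return-only continuation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not the real interpreter structures. */
typedef struct Env {
    void *registers;   /* current register frame */
    void *stack_top;   /* top of the control stack */
} Env;

typedef struct Stack {
    int cow;           /* nonzero: copy before the next write */
} Stack;

typedef struct Interp {
    Env   env;
    Stack stack;
} Interp;

typedef struct Continuation {
    Env           env;         /* step 1: snapshot of the environment */
    const size_t *resume_pc;   /* step 2: bytecode address to resume at */
    int           return_only; /* assumption: flag for return continuations */
} Continuation;

void make_continuation(Continuation *k, Interp *interp,
                       const size_t *pc, int return_only)
{
    assert(k != NULL && interp != NULL);
    memcpy(&k->env, &interp->env, sizeof k->env);  /* step 1 */
    k->resume_pc   = pc;                           /* step 2 */
    k->return_only = return_only;
    if (!return_only)
        interp->stack.cow = 1;                     /* step 3, skipped when return-only */
}
```

With return_only set, the whole thing is one memcpy and two stores.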

Allocating continuations quickly is important enough that I think they warrant a separate arena with specifically sized PMCs. Each allocatable item would be sizeof(PMC)+sizeof(environment), so there's no need for any further memory allocation at continuation allocation time. Creating a continuation, if step 3 above is skipped, then has the cost of a single PMC allocation from a free list and a memcpy of the environment chunk of the interpreter structure.
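
Here's a minimal sketch of what such an arena might look like, assuming placeholder PMC and Env layouts. ContSlot, ContArena, cont_arena_init, cont_alloc and cont_free are all hypothetical names, not existing functions. Each slot is big enough for a PMC header plus the environment, and free slots are threaded into a list through the slot storage itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder layouts; the real PMC and environment are assumptions here. */
typedef struct PMC { void *vtable; void *data; } PMC;
typedef struct Env { void *regs[4]; void *stack_top; } Env;

/* One slot holds PMC header + environment; a free slot reuses
 * the same storage for the free-list link. */
typedef union ContSlot {
    struct { PMC pmc; Env env; } cont;
    union ContSlot *next_free;
} ContSlot;

typedef struct ContArena {
    ContSlot *slots;      /* backing storage, allocated once */
    ContSlot *free_list;  /* head of the list of unused slots */
} ContArena;

int cont_arena_init(ContArena *a, size_t count)
{
    assert(a != NULL && count > 0);
    a->slots = malloc(count * sizeof *a->slots);
    if (!a->slots)
        return -1;
    a->free_list = NULL;
    for (size_t i = 0; i < count; i++) {   /* thread every slot onto the list */
        a->slots[i].next_free = a->free_list;
        a->free_list = &a->slots[i];
    }
    return 0;
}

/* Pop a slot and copy the current environment into it: no malloc here. */
ContSlot *cont_alloc(ContArena *a, const Env *current)
{
    ContSlot *s = a->free_list;
    if (!s)
        return NULL;                       /* arena exhausted */
    a->free_list = s->next_free;
    memcpy(&s->cont.env, current, sizeof *current);
    return s;
}

void cont_free(ContArena *a, ContSlot *s)
{
    s->next_free = a->free_list;
    a->free_list = s;
}
```

A real arena would grow when it runs out rather than return NULL, and would be swept by the collector rather than freed by hand; both are left out to keep the sketch short.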

Part of the problem we're seeing with continuations has to do with the stacks. We've got a chunked stack frame system, with multiple frames fitting into a single stack chunk. This made sense when I first did it, but that was years and a number of significant design decisions ago. At this point I think it's a sub-optimal decision. So it's time to fix it.

What I'd like to do is switch from the current chunked system to a single-frame system, and make 'em PMCs to boot. That is, rather than having the stack allocated in chunks that can hold multiple pushes, we make each push to the stack live in its own independent structure that's linked to the previous top-of-stack element. If we make these PMCs as well then we get it all nicely DOD-able and GC-able without any (well, much) special code. The "this is a buffer of PObj pointers" flag will work nicely for this too, as it'll mean we won't even need to bother with a separate scanning code path for this stuff.
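
As a rough illustration of the single-frame idea, here's a sketch with one heap-allocated frame per push, each linked to the previous top of stack. Frame, frame_push and frame_pop are hypothetical names, and plain malloc stands in for PMC allocation. Pop deliberately doesn't free anything, on the assumption that the collector reclaims frames nobody points at any more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One frame per push; in the proposal this would be a PMC. */
typedef struct Frame {
    struct Frame *prev;     /* previous top-of-stack element */
    void         *payload;  /* whatever was pushed */
} Frame;

/* Push: allocate a new frame linked to the old top; returns the new top. */
Frame *frame_push(Frame *top, void *payload)
{
    Frame *f = malloc(sizeof *f);
    if (!f)
        return NULL;
    f->prev    = top;
    f->payload = payload;
    return f;
}

/* Pop: just step back one link. The popped frame is left alone,
 * since something else may still hold a pointer to it. */
Frame *frame_pop(Frame *top)
{
    assert(top != NULL);
    return top->prev;
}
```

In this sketch, anything that saved the old top-of-stack pointer still sees the same chain of frames after later pushes and pops, since pushing never modifies an existing frame.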

This should all be transparent, as it's API-protected, but it'll mean some internal rework, so now's the time to dig into the discussion on it.
--
Dan


--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
