Table of contents 1. Deep binding is not appropriate. 2. Outline of a shallow-binding solution. 3. Unbinding must work with a general stack-unwinding mechanism. 4. Conclusion (not).
1. Deep binding is not appropriate. It has always been clear that a "save/modify/restore" mechanism for at least some of the functions of Perl5 "local"/Perl6 "temp/let" will be needed. Such a mechanism is likely to be complicated (see below), which is why I had tried to solve part of the problem independently with a relatively lightweight solution. (Indeed, I doubt I grok "temp/let" yet, so it's probably more complicated than I think.) Because of this complexity, and because shallow binding can subsume the special case of binding symbol table values, there is little advantage to having a parallel mechanism solely for binding globals. Indeed, there are enough things that interact with stack-unwinding as it is. So, please consider my previous proposal withdrawn. (Sigh.) That leaves us with shallow binding. 2. Outline of a shallow-binding solution. "Shallow binding" essentially just means the straightforward "save/modify/restore" approach. During execution of a particular stretch of code in a given context, that context's dynamic bindings are in place in the heap, and any alien being who can reach into Parrot's memory (gdb, for one) would find the values established by local/temp. This means that we must ensure that those dynamic bindings are undone when exiting the context, even temporarily, and redone when re-entering. There are a number of cases: A. Sub/method call. This is the simplest; we are creating a new context that needs to inherit the old one, but otherwise the environment stays the same. B. Sub/method return. The current context's dynamic bindings must be undone, effectively popping them off. C. Continuation call. This involves leaving one context (without necessarily exiting it permanently) and entering another. The simplest case is just B, but in general the two contexts, call them X and Y, do not necessarily have any particular relationship. We need to (1) find the common ancestor of X and Y, call it Q; (2) undo the dynamic bindings between X and Q in reverse order, but without throwing them away; and (3) redo dynamic bindings between Q and Y in forward order. Call this operation "rezipping" the dynamic binding stack. D. Coroutine yield. This turns out to be equivalent to C, except that the "from" context is not abandoned. E. Throwing an exception. This is simpler than the general continuation case, since the throw is always to an ancestor, but with an interesting wrinkle. A sub may make bindings before pushing a handler, or afterwards, or both, so the Exception_Handler object must somehow record the dynamic binding state when it is created. Care must be taken to avoid allowing the control stack to get out of phase with respect to the binding stack. One way to do this would be to use the control stack for dynamic bindings, forcing bindings and handlers to take note of each other, but this requires further thought. [I had thought that context switching between threads would be another such case, but the ithreads model saves us the trouble; I didn't understand that at the time I made my previous posts. Dynamic binding of shared variables needs to be handled correctly, especially with respect to locking, but I expect this can be addressed independently.] So most of the complexity boils down to continuations and exceptions. Note that if the common ancestor Q has dynamic bindings in effect when the calls leading to X and Y are made, then we don't want to undo/redo all of Q's bindings; really, we want to find the common ancestor of the dynamic binding stacks, and not that of the contexts. Rezipping can be done in time proportional to the number of bindings that must be undone/redone if we keep a depth counter, similar to the recursion_depth field in struct Parrot_Context, in each dynamic binding record. Let us call this structure a Parrot_DBR (for dynamic binding record). The Parrot_DBR therefore needs the following: typedef struct _parrot_dynamic_binding_record { Parrot_DBR *prev; /* backpointer to previous binding. */ INTVAL depth; /* stack depth, used for rezipping. */ PMC *location; /* a generalized location PMC, not NULL. */ PMC *value; /* the saved value; NULL means unbound. */ } Parrot_DBR; Note that we cannot make this a doubly-linked list; coroutines would make it branch, so there might be multiple forward pointers. However, it would be handy to have a temp_next slot for rezipping, to make it easier to go forward to the destination context from the common ancestor. The saved value slot contains the old value when the binding is in effect, and the newly bound value otherwise. Rezipping past a Parrot_DBR in either direction therefore consists of swapping the heap value with the saved value. Most of the magic is hidden in the location PMC, which records a generalized notion of a place to be modified [0]. There are two basic operations: Fetch the location contents (which is always returned as a PMC [1]), and store a new value into the location (also always a PMC). For a global, the location records the namespace name (or namespace object?) and the global name; for an arbitrary structure reference, the location records the structure and the index/key. That's two new PMC classes so far; Perl5 typeglobs will probably require at least one more (whenever that's defined). More still will be required for temporizing object attributes [2], but the additional complexity in these cases gets swept neatly under the rug provided by the location PMC. OO to the rescue, again. Note that not all locations (e.g. arrays) will support a notion of "unbound" -- it is the location object's job to pitch an exception if somebody asks to store a NULL there, which will happen immediately on binding. Othewise, the location must convert back and forth between the NULL stored in the Parrot_DBR and the appropriate implementation. 3. Unbinding must work with a general stack-unwinding mechanism. A key requirement for shallow-binding (and the reason I had shied away from it at first) is that the interpreter must realize when it needs do additional work when leaving a context in order to restore state. This requirement is shared by "throw" and by the cleanup implied for Perl6 'post', 'undo', and the like [3]: my $f will post { close } will undo { unlink $file } = open ">$file" or die; The curlies contain closures that must be invoked under the right circumstances on block exit. Part of the definition of "right circumstances" is that any dynamic bindings and error handlers be restored to the state they were at the time the closures were taken (i.e. right after $f entered scope), so all such dynamic cleanup daemons must interact cleanly with each other and with all stack-unwinding operations [4]. Fortunately for us, the stack unwinders reduce to 'throw' and continuation calling. Note that, because of rezipping, dynamic binding still requires some extra mechanism. Since a given dynamic binding might be disestablished and reestablished an arbitrary number of times before being finally undone, a pure POST block doesn't cut it. 4. Conclusion (not). If this seems like the right thing, I will pursue it, and publish a detailed design in due course. But please don't hold your breath, as I won't have much time for it; my "Parrot hacking window" is closing, and work is likely to keep me very busy for the next few months. (Sigh.) -- Bob Rogers http://rgrjr.dyndns.org/ Notes [0] Interestingly, this is equivalent to the generalized reference PMC that was discussed in the "morph()ing" thread last April, <http://xrl.us/fwkx> [1] Even though the Parrot_DBR interface requires a PMC, the location object could be set up to fetch/replace I/S/N values internally. [2] About two-thirds of the way through A06 (search for "temporize object attributes"), Larry says that this will be done via closures. In order to support rezipping, such a closure would need to accept a new value to store and return the old value. Or maybe the .TEMP method could just return a location PMC? [3] From A04, about the middle (search for "time passes"). [4] It appears that pushaction, pushmark, and popmark were intended to provide some of this cleanup mechanism, but they don't seem up to the task. More on this later.