Re: Lexical implementation work

Ken Fox Tue, 13 Nov 2001 19:08:03 -0800

Dan Sugalski wrote:
> Nope, not stone tablet at all. More a sketch than anything else,
> since I'm not sure yet of all the things Larry's got in store.


Ok. I've made some more progress. There's a crude picture of
some of the internals at <http://www.msen.com/~fox/parrotguts.png>.
The lexical stuff is at the bottom of the picture.

I've patched the assembler to accept syntax like:

        .begin
          .reserve 1
          fetch_lex I0, 0, 0
        .end

This is "nice" syntax for use in hand-coded assembler. I assume
that a compiler will take control of the scope definitions and
emit the enter/exit scope ops itself. (I'm throwing in the more
complicated sync_scope op too so a compiler can handle weird
jumps between scopes -- IMHO it's simpler to do this than provide
an introspective API that lets the compiler manually adjust
the scope stack.)
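
As a rough illustration of what sync_scope has to work out for a jump between scopes, here is a small C sketch. It assumes each scope records its parent's ID and its nesting depth; the ScopeInfo struct, the sync_plan name, and the "ID 0 means no enclosing scope" sentinel are made up for this example and are not part of the patch.

```c
#include <assert.h>

#define MAX_SCOPES 32

/* Hypothetical per-scope record. Entry 0 is a sentinel with depth 0
   that stands for "no enclosing scope". */
typedef struct {
    int parent;  /* scope ID of the enclosing scope, 0 if none */
    int depth;   /* nesting depth, 1 for an outermost scope */
} ScopeInfo;

/* Plan a move from scope as_is to scope to_be.
   Stores the number of exit_scope steps in *exits, writes the scopes
   to enter (outermost first) into enter[], and returns how many
   scopes have to be entered. */
int sync_plan(const ScopeInfo *scopes, int as_is, int to_be,
              int *exits, int enter[MAX_SCOPES])
{
    int pending[MAX_SCOPES];  /* scopes to enter, innermost first */
    int n = 0;
    int i;

    *exits = 0;

    /* Target is deeper: back up through the target's parents. */
    while (scopes[to_be].depth > scopes[as_is].depth) {
        assert(n < MAX_SCOPES);
        pending[n++] = to_be;
        to_be = scopes[to_be].parent;
    }

    /* Current scope is deeper: exit until the depths match. */
    while (scopes[as_is].depth > scopes[to_be].depth) {
        as_is = scopes[as_is].parent;
        (*exits)++;
    }

    /* Same depth, different scopes: climb both sides in step
       until they meet at a common ancestor. */
    while (as_is != to_be) {
        assert(n < MAX_SCOPES);
        pending[n++] = to_be;
        to_be = scopes[to_be].parent;
        as_is = scopes[as_is].parent;
        (*exits)++;
    }

    /* Reverse so the caller can enter from the outside in. */
    for (i = 0; i < n; i++) {
        enter[i] = pending[n - 1 - i];
    }
    return n;
}
```

With scopes 2 and 3 both nested in scope 1, and scope 4 nested in 3, a jump from 2 to 4 plans one exit followed by entering 3 and then 4.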

There's pseudo-code for the ops attached below. I'm probably
a couple evenings away from having things working.

Unless people really hate the idea, I'm going to put in "v"
operand types in ops2c.pl. They'll look like "c" types (e.g. "nc"),
but reference the current lexical scope instead of the constant
section.

QUESTIONS!

Who owns the bytecode format? How do I propose changes? I need
a "scope definition" section. Each scope is assigned a per-module
id. I'm not sure what info is needed yet, but certainly a size
and a code ref (opcode address) for the DESTROY sub.
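
To make the request concrete, a scope definition entry might look something like the C sketch below. This is only a guess at a layout: the ScopeDef and ScopeSection structs, their field names, and the scope_lookup helper are hypothetical, and the only fields taken from the text above are the frame size and the DESTROY code ref.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout of one "scope definition" entry. */
typedef struct {
    int    frame_size;    /* number of lexical slots the frame needs */
    size_t destroy_addr;  /* opcode address of the DESTROY sub, 0 if none */
} ScopeDef;

/* Hypothetical per-module section holding all scope definitions. */
typedef struct {
    int             count;  /* scope IDs run from 1 to count */
    const ScopeDef *defs;   /* defs[id - 1] describes scope id */
} ScopeSection;

/* Look up a scope by its per-module ID; returns NULL for an unknown ID. */
const ScopeDef *scope_lookup(const ScopeSection *s, int scope_id)
{
    assert(s != NULL);
    if (scope_id < 1 || scope_id > s->count) {
        return NULL;
    }
    return &s->defs[scope_id - 1];
}
```

An enter_scope op would then call something like scope_lookup with its ic operand to find out how much frame space to allocate.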

The control stack isn't used for much and it would simplify my
code a lot if we combine the frame stack with the control stack.
The only down-side I think will be with hand-coded assembler.
There may be a redundant "restore scope" pointer on the stack
when calling a sub in the same scope. (This is impossible with
Perl code BTW.) Well, there's also a "purist" downside too --
mixing opcode addresses with lexical storage. This is the thing
that makes capturing a closure easier, so I see it as an
advantage.

Anybody care if I subsume the control stack?

Lastly, does anybody care if I change how assembler directives
are registered? I think there are going to be a whole bunch of
symbol table and debugging directives going in. The current "if"
tests are kind of clunky.

Here are the op definitions (in pseudo-code form) taken from the
copy of core.ops I'm working on:

--- parrot/core.ops     Wed Nov  7 22:34:10 2001
+++ my-parrot/core.ops  Tue Nov 13 21:13:53 2001
@@ -1930,6 +1930,159 @@
 
 ###############################################################################
 
+=head2 Lexical Variables
+
+These opcodes implement lexical scopes and PMC lexical variables.
+This functionality is only concerned with scoping and storage,
+not with the symbol tables used for translating variable names to
+storage. Compilers often use the same symbol table for managing
+both lexical and dynamic variables.
+
+First, the terminology:
+
+Lexical Scope: A contiguous region of text (code) bounded by
+.begin/.end scope directives. A lexical scope may contain any
+number of non-overlapping interior (child) scopes. The flow
+control of the code does not affect these scoping rules.
+
+Lexical Variable: A variable that is visible only to code in the
+lexical scope that defines it, or in one of that scope's interior
+scopes.
+
+Frame: A group of lexical variables defined in a lexical scope.
+The frame does not include the lexical variables defined in
+interior scopes -- each interior scope requires its own frame.
+
+Frame ID: The position of a frame relative to either the current
+scope (non-positive frame IDs), or the outer-most scope (positive
+IDs).
+
+Variable ID: The non-negative position of a variable relative to
+the start of a frame.
+
+Scope ID: A unique positive number assigned by the assembler or
+compiler to each lexical scope. Information about the scope, such
+as how much space is required for a frame, is retrieved using the
+scope ID.
+
+=over 4
+
+=item B<fetch_lex>(out p variable, i|ic frame_id, i|ic variable_id)
+
+=item B<store_lex>(i|ic frame_id, i|ic variable_id, p variable)
+
+Note: While the PMC code is being developed, lexicals hold integers
+instead of PMCs. This changes the usage of lexicals because PMC
+lexicals will not need to be stored back to the frame.
+
+=item B<enter_scope>(ic scope_id)
+
+=item B<exit_scope>()
+
+=item B<sync_scope>(ic scope_id)
+
+=item B<.begin> [ID]
+
+B<.begin> is a pseudo-op that does two things: it begins a lexical
+scope at the current position in the code and it emits an
+B<enter_scope> op for the current scope. If ID is not provided, the
+assembler will generate one automatically.
+
+=item B<.reserve> variable_count
+
+=item B<.end>
+
+B<.end> is a pseudo-op that does two things: it ends the current
+lexical scope (returning to the enclosing lexical scope) and it
+emits an B<exit_scope> op.
+
+=back
+
+=cut
+
+AUTO_OP fetch_lex(i, i|ic, i|ic) {
+  /*
+     int frame_id = $2;
+     int variable_id = $3;
+
+     if (frame_id <= 0) {
+        frame_id += interpreter->frame_display->used;
+     }
+
+     $1 = interpreter->frame_display->frame[frame_id][variable_id].int_val;
+
+   */
+}
+
+AUTO_OP store_lex(i|ic, i|ic, i) {
+  /*
+     int frame_id = $1;
+     int variable_id = $2;
+
+     if (frame_id <= 0) {
+        frame_id += interpreter->frame_display->used;
+     }
+
+     interpreter->frame_display->frame[frame_id][variable_id].int_val = $3;
+
+   */
+}
+
+AUTO_OP enter_scope(ic) {
+  /*
+     lookup scope info from $1
+
+     unless (static/global frame) {
+        unless (frame fits on interpreter's frame_stack) {
+           allocate a new Frame_Stack_Chunk
+        }
+        allocate frame lexicals on frame_stack
+     }
+
+     push frame restore info on frame_stack
+     push frame on frame_display
+
+   */
+}
+
+AUTO_OP exit_scope() {
+  /* 
+     pop frame off frame_display
+     pop frame restore info off frame_stack
+
+     if (frame lexicals on frame_stack) {
+        release frame lexicals on frame_stack
+        call scope DESTROY on frame
+     }
+
+   */
+}
+
+AUTO_OP sync_scope(ic) {
+  /*
+     lookup to-be scope info from $1
+     lookup as-is scope info from top of frame_stack
+
+     while (to-be depth > as-is depth) {
+        back up to to-be parent
+     }
+
+     while (as-is depth > to-be depth) {
+        exit_scope
+     }
+
+     while (as-is != to-be) {
+        back up to to-be parent
+        exit_scope
+     }
+
+     now enter_scope back into to-be
+
+   */
+}
+
+###############################################################################
+

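For anyone who wants to poke at the frame ID arithmetic used by fetch_lex and store_lex, here is a small standalone C sketch. It assumes display slots are numbered from 1 (so that adding "used" to frame ID 0 lands on the current frame, which is how I read the pseudo-code), and it keeps the integer-only lexicals described in the note. The FrameDisplay layout, MAX_DEPTH, and the push_frame/pop_frame helpers are invented for the example; they are not the interpreter's real structures.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 16

/* Hypothetical lexical slot; only int_val is used for now. */
typedef union {
    long  int_val;
    void *pmc_val;
} Lexical;

/* Hypothetical frame display: slots 1..used hold the active frames,
   outermost first. Slot 0 is unused so that frame IDs are 1-based. */
typedef struct {
    int      used;
    Lexical *frame[MAX_DEPTH + 1];
} FrameDisplay;

/* Translate a frame ID into a display slot. Non-positive IDs count
   back from the current frame; positive IDs count from the outermost. */
int resolve_frame(const FrameDisplay *d, int frame_id)
{
    if (frame_id <= 0) {
        frame_id += d->used;
    }
    assert(frame_id >= 1 && frame_id <= d->used);
    return frame_id;
}

long fetch_lex(const FrameDisplay *d, int frame_id, int variable_id)
{
    return d->frame[resolve_frame(d, frame_id)][variable_id].int_val;
}

void store_lex(FrameDisplay *d, int frame_id, int variable_id, long value)
{
    d->frame[resolve_frame(d, frame_id)][variable_id].int_val = value;
}

/* Make f the current (innermost) frame. */
void push_frame(FrameDisplay *d, Lexical *f)
{
    assert(d->used < MAX_DEPTH);
    d->frame[++d->used] = f;
}

/* Drop the current frame, returning to the enclosing one. */
void pop_frame(FrameDisplay *d)
{
    assert(d->used > 0);
    d->used--;
}
```

With two frames pushed, frame ID 0 and frame ID 2 both name the inner frame, while -1 and 1 both name the outer one.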