Re: [PATCH] Implement local-eval, local-compile, and the-environment

Mark H Weaver Sat, 07 Jan 2012 12:21:30 -0800

Hi Andy, thanks for the code review! :)

Andy Wingo <wi...@pobox.com> writes:
> On Tue 03 Jan 2012 11:52, Mark H Weaver <m...@netris.org> writes:
>
>> +(define-record-type lexical-environment-type
>> +  (make-lexical-environment wrapper names boxes others module-name)
>> +  lexical-environment?
>> +  (wrapper     lexenv-wrapper)
>> +  (names       lexenv-names)
>> +  (boxes       lexenv-boxes)
>> +  (others      lexenv-others)
>> +  (module-name lexenv-module-name))
>
> Why a module-name and not a module?


I guess this could go either way, but the module-name is what gets
embedded into the compiled code for (the-environment), since the module
itself is not serializable.  If we want to put the module itself in the
lexical environment, then (the-environment) would need to generate a
call to `resolve-module' within its compiled code.

>> +(define-syntax-rule (box v)
>> +  (case-lambda
>> +   (() v)
>> +   ((x) (set! v x))))
>
> This is nice.
>
> If you want to make a hack, you can get at the variable object here,
> using program-free-variables.  It is fairly well guaranteed to work.
> Dunno if it is needed, though.

The problem is that primitive-eval does not represent lexical variables
as variable objects.  We could change that, but I'm reluctant to make
the evaluator any slower than it already is.

More importantly, is there any guarantee that mutable lexicals will
continue to be represented as variable objects in future native code
compilers?  Do we want to commit to supporting this uniform
representation in all future compilers?

The nice thing about this strategy is that it places no constraints
whatsoever on how lexical variables are represented.  It allows us to
use primitive-eval to evaluate a local expression within a lexical
environment created by compiled code, or vice versa.

In the more complex patch, the local expression must use the same
execution method as the code that created the lexical environment.

>> +(define-syntax-rule (box-lambda* (v ...) (other ...) e)
>> +  (lambda (v ...)
>> +    (let-syntax
>> +    ((v (identifier-syntax-from-box v))
>> +     ...
>> +     (other (unsupported-binding 'other))
>> +     ...)
>> +      (if #t e))))
>
> I would fix the indentation here.

Sorry, until a couple of days ago I had `indent-tabs-mode' set to `t'
(the default) in Emacs.  I finally fixed that.  The indentation actually
looks correct outside of a patch, but because it contains tabs, the "+"
and "> " prefixes mess it up.  I'll make sure to untabify my code before
committing.

> What's the purpose of the (if #t e) ?

That's to force expression context.  There's no proper way to add new
definitions to an existing local environment anyway.  (the-environment)
is treated like an expression, thus terminating definition context.
Therefore, the form passed to `local-eval' should be constrained to be
an expression.

BTW, I think I want to change (if #t e) to: #f e.  That should require a
less complicated analyzer to optimize away.

Is there a better way to force expression context?

>> +(define-syntax-rule (capture-environment
>> +                 module-name (v ...) (b ...) (other ...))
>> +  (make-lexical-environment
>> +   (lambda (expression)
>> +     #`(box-lambda*
>> +    #,'(v ...)
>
> Why not just (v ...) ?
>
>> +    #,'(other ...)
>
> Likewise.

This is the definition of the wrapper procedure, that wraps
`box-lambda*' around the local expression.  Therefore, I need the list
of source names, not the gensyms.

#,#'(v ...) may be equivalent is (v ...), but #,'(v ...) is quite
different.  '(v ...) strips the wrap from the variable names, resulting
in the source names.

>> +(define-syntax-rule (identifier-syntax-from-box b)
>> +  (let ((trans (identifier-syntax
>> +            (id          (b))
>> +            ((set! id x) (b x)))))
>> +    (set-procedure-property! trans
>> +                         'identifier-syntax-box
>> +                         (syntax-object-of b))
>> +    trans))
>
> Ew, identifier-syntax as a value :)  But OK.

This code is based on `make-variable-transformer' in psyntax.scm, which
does something very similar.  Both set a procedure-property on the
transformer.  In the case of `make-variable-transformer', the property
indicates that the associated syntactic keyword can be used as the first
operand to `set!'.

In this case, we need a way to detect that a given syntactic keyword
represents a simulated variable bound by `box-lambda*'.  Furthermore, we
need to find the associated box so that we can reuse it.  This allows us
to avoid nested boxes, and thus allows us to achieve O(1) overhead for
simulated variable references, as opposed to O(n) where N is the nesting
depth.

>> +;; XXX The returned syntax object includes an anti-mark
>> +;;     Is there a good way to avoid this?
>> +(define-syntax syntax-object-of
>> +  (lambda (form)
>> +    (syntax-case form ()
>> +      ((_ x) #`(quote #,(datum->syntax #'x #'x))))))
>
> This has a bad smell.  Anything that is outside psyntax should not have
> to think about marks.  Perhaps syntax-local-value could serve your
> purpose?

The anti-mark turned out to be harmless so I removed that scary comment,
as well as the `remove-anti-mark' stuff from psyntax.  BTW, the purpose
of this is to store the identifier of the box corresponding to a
simulated variable, so that it can be captured by `the-environment'
without creating a nested box.

>> +    (global-extend 'core 'the-environment
>
> This one is really nasty, and I'd like to avoid it if possible.  Are
> there some short primitives that psyntax could export that would make it
> possible to implement `the-environment' in a module?

I don't see how it could be made much simpler.  At the very least, it
needs to generate the following things:

* The module name
* The list of ordinary variables (these need to be boxed)
* The list of simulated variables (we need to reuse the original box)
* The list of others, i.e. unsupported lexical bindings

And that's what most of that code is for.

More importantly, if I'm going to support capturing locally-bound
syntactic keywords or pattern variables, I'll need to generate more
complex `wrapper' procedures.  At that point, simple lists of variables
will no longer do the job; in general I will need to wrap the local
expression (passed to `local-eval') within multiple levels of binding
constructs.

I don't know of a good way to break this apart into lower-level
primitives.

I could, however, write the code in a more elegant (though less
efficient) way.  For example, instead of the complicated loop that
generates three lists at once, I could instead use `map' and `filter' to
produce each list separately.  I'll take a look.

Though again, this will unfortunately become quite a bit more
complicated if we wish to support capturing local syntax.  The
alternative is to simply capture the psyntax environment structures
directly, but that has its own problems: it would require embedding
references to transformer procedures within compiled code.

BTW, I should note that the question of which strategy to use for
capturing the expander environment (capturing the internal psyntax
structures vs creating wrapper procedures) is orthogonal to the question
of how to implement boxes (using procedures and identifier-syntax, or
using variable objects as is done in my more complex patch).

   Thanks,
     Mark

Re: [PATCH] Implement local-eval, local-compile, and the-environment

Reply via email to