Re: [racket-users] Compilation/Embedding leaves syntax traces

Alexis King Tue, 25 Sep 2018 11:11:28 -0700

(Sorry, Paulo, for the duplicate message; I forgot to Reply All the
first time.)

This is sort of subtle. When we consider a macro-enabled language, we
often imagine that `expand` takes a program with some phase ≥1 code,
expands all the macros in the program by running the phase ≥1 code, and
produces a fully-expanded program with only phase 0 code left. There is
some truth to this, but it doesn’t paint the whole picture.

Let’s start with the things that ARE true:

   1. When a module is compiled, it is fully expanded.

   2. Fully-expanded code contains no macro uses.

   3. Instantiating a compiled module at phase 0 does not normally run
      any phase ≥1 code, unless the module uses reflective operations
      like dynamic-require that may trigger compilation of other
      modules at runtime or explicitly instantiate modules into a
      namespace at phase ≥1.

These three things align with our intuition. If you have the program

   (+ (mac) 1 2)

where `mac` is a macro, then when the module is compiled, the use of
`mac` goes away, and it is replaced with its expansion.
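
As a minimal sketch of this, one can define a stand-in `mac` in a fresh namespace and ask `expand` for the fully expanded form. The definition of `mac` here (it expands to the literal 40) and the helper `mentions?` are hypothetical, made up purely for illustration:

```racket
#lang racket/base

;; Hypothetical helper: does the symbol `sym` occur anywhere in `datum`?
(define (mentions? datum sym)
  (if (pair? datum)
      (or (mentions? (car datum) sym)
          (mentions? (cdr datum) sym))
      (eq? datum sym)))

(define ns (make-base-namespace))

(define expanded
  (parameterize ([current-namespace ns])
    ;; make-base-namespace imports racket/base only at phase 0, so
    ;; import it at phase 1 as well for the macro's RHS.
    (eval '(require (for-syntax racket/base)))
    ;; Illustrative macro: every use expands to the literal 40.
    (eval '(define-syntax (mac stx) #'40))
    (syntax->datum (expand '(+ (mac) 1 2)))))

;; Print the fully expanded form and whether `mac` still occurs in it.
(printf "~s\n" expanded)
(printf "still mentions mac? ~a\n" (mentions? expanded 'mac))
```

The printed form should be a plain application of `+` to literals, with no trace of `mac`.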

Now, let’s add one more true thing to the list that aligns with our
intuition, but hints at something more complicated:

   4. When a module is expanded, all LOCAL macro definitions disappear.

This means that if you define a macro with let-syntax (or, equivalently,
define-syntax in an internal definition context), then all of the code
that implements that macro goes away after expansion. This is consistent
with our intuition, but it raises the question: why does this only happen
for local macros? Shouldn’t this happen for all macros?

Sadly, no. Consider the following module:

   (module m racket
     (provide mac)
     (define-syntax (mac stx)
       ....))

In this case, the RHS of the `mac` definition must remain in the
compiled code, since some other module could require `m` and use `mac`.
Although the RHS of the `mac` definition is not evaluated when `m` is
instantiated at phase 0 (as is specified by rule 3 above), it must be
evaluated during compilation of another module that uses `m`.
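
The contrast between local and module-level macros can be observed with `expand`. The following is only a sketch: the body of `mac` (a literal 40) stands in for the elided `....` above, and the helpers `mentions?` and `expand-datum` are hypothetical names introduced for illustration.

```racket
#lang racket/base

;; Hypothetical helper: does the symbol `sym` occur anywhere in `datum`?
(define (mentions? datum sym)
  (if (pair? datum)
      (or (mentions? (car datum) sym)
          (mentions? (cdr datum) sym))
      (eq? datum sym)))

;; Hypothetical helper: fully expand `form` in a fresh namespace and
;; return the result as a plain S-expression.
(define (expand-datum form)
  (parameterize ([current-namespace (make-base-namespace)])
    ;; Phase-1 bindings are needed for the RHS of the local macro.
    (eval '(require (for-syntax racket/base)))
    (syntax->datum (expand form))))

;; A local macro: its definition should vanish during expansion.
(define local-expansion
  (expand-datum
   '(let-syntax ([mac (lambda (stx) #'40)])
      (+ (mac) 1 2))))

;; A module-level, provided macro: its definition should be retained.
(define module-expansion
  (expand-datum
   '(module m racket/base
      (require (for-syntax racket/base))
      (provide mac)
      (define-syntax (mac stx) #'40))))

(printf "local form mentions mac?            ~a\n"
        (mentions? local-expansion 'mac))
(printf "module keeps a define-syntaxes form? ~a\n"
        (mentions? module-expansion 'define-syntaxes))
```

In the expanded module, the macro survives as a `define-syntaxes` form, which is what ends up in the compiled code.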

(The technical term for this in Racket is “visiting” the module.
This process of evaluating the RHS of define-syntax forms during module
visits also applies to any forms inside begin-for-syntax blocks, and a
visit also instantiates any modules required for-syntax by the visited
module. The nitty-gritty details are subtle, but this explains why code
on the RHS of module-level define-syntax forms or inside
begin-for-syntax blocks must be kept around in compiled code.)

The above explains why Racket retains some phase ≥1 code. However, it
may be unsatisfying: while it’s true that the phase ≥1 code might be
necessary for compilation of other modules, once you have compiled your
whole program, it shouldn’t be necessary to keep that information
around, right? No other modules will ever need to be compiled against
the macro-providing module. However, this is not necessarily true!
Racket provides a set of reflective operations for compiling modules at
runtime, and it makes no assumptions that all modules will be loaded
from compiled code. In this sense, Racket includes an “open-world
assumption” when compiling modules, and it retains any phase ≥1 code
necessary for compiling new modules at any time.
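
To make the open-world point concrete, here is a small sketch of a program that expands new code at runtime against a macro-providing module. The submodule `m`, the body of `mac`, and the name `anchor` are all hypothetical choices for illustration:

```racket
#lang racket/base

;; A macro-providing module, declared here as a submodule so the
;; example is self-contained.
(module m racket/base
  (require (for-syntax racket/base))
  (provide mac)
  (define-syntax (mac stx) #'42))

(require 'm)

;; Obtain a namespace that sees this module's bindings, including `mac`.
(define-namespace-anchor anchor)
(define ns (namespace-anchor->namespace anchor))

;; This use of `mac` is expanded at runtime, long after the enclosing
;; module was compiled, so the RHS of `mac` must still be available.
(define result (eval '(mac) ns))

(printf "~a\n" result)
```

If the phase ≥1 code of `m` had been stripped after compilation, the `eval` call would have no way to expand `(mac)`.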

This sort of thing is necessary to implement tools like DrRacket, which
frequently compile new modules at runtime, but admittedly, most programs
don’t do any such thing. Personally, I would appreciate a way to ask
Racket to strip all phase ≥1 code and phase ≥1 dependencies from a
specified program so that I can distribute the phase 0 code and
dependencies exclusively. However, to my knowledge, Racket does not
currently include any such feature.

For more information on declaring, instantiating, and visiting modules,
and how that relates to compilation, see this very helpful section in
The Racket Guide:

   http://docs.racket-lang.org/guide/macro-module.html

Alexis

> On Sep 25, 2018, at 07:32, 'Paulo Matos' via Racket Users 
> <racket-users@googlegroups.com> wrote:
> 
> 
> Hi,
> 
> I reached a point at which I don't think I am exactly understanding how
> the racket compilation pipeline works.
> 
> My software has several compile-time options that are read from
> environment variables (since I can't think of another way to do it), so I
> define a compile-time variable as:
> 
> (define-for-syntax enable-contracts?
>   (and (getenv "S10_ENABLE_CONTRACTS") #true))
> 
> And then I create a macro to move this compile-time variable to runtime:
> (define-syntax (compiled-with-contracts? stx)
>   (datum->syntax stx enable-contracts?))
> 
> I have a few of these so when I create a distribution, I first create an
> executable with (I use create-embedding-executable but for simplicity,
> let's say I am using raco):
> S10_ENABLE_CONTRACTS=1 raco exe ...
> 
> I have a bunch of other options that don't matter for the moment.
> 
> One of the things I noticed is that in some cases when I run my
> executable, compile-time code living inside begin-for-syntax, which checks
> whether a variable has been defined during compilation, is triggered at a
> point at which I didn't expect any more syntax expansion to occur.
> 
> I can't really reproduce the issue with a small example yet but I
> noticed something:
> 
> main.rkt:
> 
> #lang racket
> 
> (require (file "arch-choice.rkt"))
> 
> (module+ main
>   (printf "arch: ~a~n" (get-path)))
> 
> arch-choice.rkt:
> 
> #lang racket
> 
> (provide get-path)
> 
> (begin-for-syntax
>
>   (define arch-path (getenv "ARCH"))
>
>   (unless arch-path
>     (raise-user-error 'driver "Please define ARCH with a suitable path")))
>
> (define-syntax (get-path stx)
>   (datum->syntax stx arch-path))
> 
> Then just to make sure nothing is compiled I remove my zos:
> $ find . -type f -name '*.zo' -exec rm {} \;
> 
> Then compile it:
> $ ARCH=foo raco exe main.rkt
> 
> In this case if you run ./main you'll get 'arch: foo' back which is fine
> so I can't reproduce what I see in my software which is with some
> combinations of compile time options, I see:
> 'driver: Please define ARCH environment variable'
> 
> which shouldn't even be part of the executable because it's a compile-time
> string (or so I thought).
> 
> So I did on the above example:
> $ strings main | grep ARCH
> PLANET-ARCHIVE-FILTER
> ARCH"
> ''Please define ARCH with a suitable path
> 
> 
> OK, so, this agrees with what I see in my program: compile-time error
> strings still exist in the code. Why is that? I thought that only fully
> expanded code (compiled-code) would make it to the executable file.
> 
> Another thing that might help me understand what's going on, is there a
> way to extract the bytecode from the executable and decompile it?
> 
> Thanks,
> 
> 
> -- 
> Paulo Matos
