[racket] phases

Jon Rafkind Thu, 01 Mar 2012 12:36:03 -0800

Recent problems with phases have led me to investigate how they work in more 
detail. Here is a brief tutorial on what they are and how they work with 
macros. The guide and reference have something to say about phases but I don't 
think they go into enough detail.


Bindings exist in a phase. The link between a binding and its phase is 
represented by an integer. Phase 0 is the phase used for "plain" definitions, so

(define x 5)

will put a binding for 'x' into phase 0. 'x' can just as easily be defined at a 
higher phase:

(begin-for-syntax
  (define x 5))

Now 'x' is defined at phase 1. We can easily mix these two definitions in the 
same module. There is no clash between the two x's because they are defined at 
different phases.

(define x 3)
(begin-for-syntax
  (define x 9))

'x' at phase 0 has a value of 3 and 'x' at phase 1 has a value of 9.
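
To see both values, here is a minimal sketch of the same thing as a complete 
module. It assumes #lang racket/base (my choice, not something the snippets 
above specify), which is why racket/base is also required for-syntax: without 
that, 'define' and 'printf' would not be bound at phase 1. Note that the phase 1 
printf runs while the module is being expanded, not when it is run.

```racket
#lang racket/base
;; Make racket/base's bindings (define, printf, ...) available at phase 1.
(require (for-syntax racket/base))

(define x 3)                        ; 'x' at phase 0

(begin-for-syntax
  (define x 9)                      ; a separate 'x' at phase 1
  (printf "phase 1 x: ~a\n" x))     ; runs during expansion

(printf "phase 0 x: ~a\n" x)        ; runs when the module is instantiated
```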

Syntax objects can refer to these bindings. Essentially, they capture the 
binding as a value that can be passed around.

#'x

Is a syntax object that represents the 'x' binding. But which 'x' binding? In 
the last example there are two x's, one at phase 0 and one at phase 1. Racket 
will imbue #'x with lexical information for all phases, so the answer is both!

Racket knows which 'x' to use when the syntax object is used. I'll use eval 
just for a second to prove a point.

First we bind #'x to a pattern variable so we can use it in a template and then 
just print it.

(eval (with-syntax ([x #'x])
        #'(printf "~a\n" x)))

This will print 3 because x at phase 0 is bound to 3.

(eval (with-syntax ([x #'x])
        #'(begin-for-syntax
            (printf "~a\n" x))))

This will print 9 because we are using x at phase 1 instead of 0. How does 
Racket know we wanted to use x at phase 1 instead of 0? Because of the 
'begin-for-syntax'. So you can see that we started with the same syntax object, 
#'x, and were able to use it in two different ways -- at phase 0 and at phase 1.

When a syntax object is created its lexical context is immediately set up. When 
a syntax object is provided from a module its lexical context will still 
reference the things that were around in the module it came from.

This module will define 'foo' at phase 0 bound to the value 0 and 'sfoo' which 
binds the syntax object for 'foo'.

;; a.rkt
(define foo 0)
(provide (for-syntax sfoo))
(define-for-syntax sfoo #'foo)
;; why not (define sfoo #'foo) ? I will explain later

;; b.rkt
(require "a.rkt")
(define foo 8)
(define-syntax (m stx)
  sfoo)
(m)

The result of the (m) macro will be whatever value 'sfoo' is bound to, which is 
#'foo. The #'foo that 'sfoo' holds knows that 'foo' is bound from the a.rkt 
module at phase 0. Even though there is another 'foo' in b.rkt this will not 
confuse Racket.

Note that 'sfoo' is bound at phase 1. This is because (m) is a macro so its 
body executes at one phase higher than it was defined at. Since it was defined 
at phase 0 it will execute at phase 1, so any bindings it refers to also need 
to be bound at phase 1.
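
If you want to try this without creating two files, the following is a sketch 
of the same pair of modules written as submodules of a single file. The 
single-file layout and the submodule names 'a' and 'b' are my own assumptions 
for illustration. Everything else follows a.rkt and b.rkt above. When run, it 
should print 0, the value of the 'foo' from 'a', not the 8 from 'b'.

```racket
#lang racket/base

;; Plays the role of a.rkt.
(module a racket/base
  (require (for-syntax racket/base))   ; needed for #' at phase 1
  (provide (for-syntax sfoo))
  (define foo 0)                       ; 'foo' at phase 0
  (define-for-syntax sfoo #'foo))      ; 'sfoo' at phase 1

;; Plays the role of b.rkt.
(module b racket/base
  (require (for-syntax racket/base)
           (submod ".." a))            ; brings in 'sfoo' at phase 1
  (define foo 8)                       ; unrelated 'foo' local to b
  (define-syntax (m stx)
    sfoo)                              ; expands to the 'foo' from a
  (m))

;; Instantiate b so that the result of (m) is printed.
(require 'b)
```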

Now really what I want to show is how bindings can be confused when modules are 
imported at different phases. Racket allows us to import a module at an 
arbitrary phase using require.

(require "a.rkt") ;; import at phase 0
(require (for-syntax "a.rkt")) ;; import at phase 1
(require (for-template "a.rkt")) ;; import at phase -1
(require (for-meta 5 "a.rkt" )) ;; import at phase 5

What does it mean to 'import at phase 1'? Effectively it means that all the 
bindings from that module will have their phase increased by one.

;; c.rkt
(define x 0) ;; x is defined at phase 0
(provide x)  ;; x must be exported for d.rkt to see it

;; d.rkt
(require (for-syntax "c.rkt"))

Now in d.rkt there will be a binding for 'x' at phase 1 instead of phase 0.
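
Here is a small sketch that checks this, again using submodules in a single 
file as a stand-in for c.rkt and d.rkt (the single-file layout is an 
assumption). The reference to 'x' only works inside 'begin-for-syntax' because 
the import shifted 'x' to phase 1. A plain reference to 'x' at the module level 
of 'd' would be an unbound identifier.

```racket
#lang racket/base

;; Plays the role of c.rkt.
(module c racket/base
  (provide x)
  (define x 0))                         ; 'x' at phase 0 inside c

;; Plays the role of d.rkt.
(module d racket/base
  ;; racket/base is imported for-syntax so printf exists at phase 1;
  ;; c is imported for-syntax so its 'x' lands at phase 1 as well.
  (require (for-syntax racket/base
                       (submod ".." c)))
  (begin-for-syntax
    (printf "x at phase 1: ~a\n" x)))   ; runs while d is expanded

(require 'd)
```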

So let's look at a.rkt from above and see what happens if we try to create a 
binding for the #'foo syntax object at phase 0.

;; a.rkt
(define foo 0)
(define sfoo #'foo)
(provide sfoo)

Now both 'foo' and 'sfoo' are defined at phase 0. The lexical context of #'foo 
will know that there is a binding for 'foo' at phase 0. In fact it seems like 
things are working just fine: if we try to eval sfoo in a.rkt we will get 0.

(eval sfoo)
--> 0

But now let's use sfoo in a macro.

(define-syntax (m stx)
  sfoo)
(m)

We get an error 'reference to an identifier before its definition: sfoo'. 
Clearly 'sfoo' is not defined at phase 1 so we cannot refer to it inside the 
macro. Let's try to use 'sfoo' in another module by importing a.rkt at phase 1. 
Then we will get 'sfoo' at phase 1.

;; b.rkt
(require (for-syntax "a.rkt")) ;; now we have sfoo at phase 1
(define-syntax (m stx)
  sfoo)
(m)

$ racket b.rkt
compile: unbound identifier (and no #%top syntax transformer is bound) in: foo

Racket says that 'foo' is unbound now. When 'a.rkt' is imported at phase 1 we 
have the following bindings

foo at phase 1
sfoo at phase 1

So the macro 'm' can see sfoo and will return the #'foo syntax object which 
knows that 'foo' was bound at phase 0. But there is no 'foo' at phase 0 in 
b.rkt. There is only a 'foo' at phase 1, so we get an error. That is why 'sfoo' 
needed to be bound at phase 1 in a.rkt. In that case we would have had the 
following bindings after doing (require "a.rkt")

foo at phase 0
sfoo at phase 1

So we can still use 'sfoo' in the macro since it's bound at phase 1 and when the 
macro finishes it will refer to a 'foo' binding at phase 0.

If we import a.rkt at phase 1 we can still manage to use 'sfoo'. The trick is 
to create a syntax object that will be evaluated at phase 1 instead of 0. We 
can do that with 'begin-for-syntax'.

;; a.rkt
(define foo 0)
(define sfoo #'foo)
(provide sfoo)

;; b.rkt
(require (for-syntax "a.rkt"))
(define-syntax (m stx)
  (with-syntax ([x sfoo])
    #'(begin-for-syntax
        (printf "~a\n" x))))
(m)

b.rkt has 'foo' and 'sfoo' bound at phase 1. The output of the macro will be

(begin-for-syntax
  (printf "~a\n" foo))

because 'sfoo' will turn into 'foo' when the template is expanded. Now this 
expression will work because 'foo' is bound at phase 1.

Now you might try to cheat the phase system by importing a.rkt at both phase 0 
and phase 1. Then you would have the following bindings

foo at phase 0
sfoo at phase 0
foo at phase 1
sfoo at phase 1

So just using sfoo in a macro should work

;; b.rkt
(require "a.rkt"
         (for-syntax "a.rkt"))
(define-syntax (m stx)
  sfoo)
(m)

The 'sfoo' inside the 'm' macro comes from the (for-syntax "a.rkt"). For this 
macro to work there must be a 'foo' bound at phase 0, and there is one from the 
plain "a.rkt" imported at phase 0. But in fact this macro doesn't work. It says 
'foo' is unbound. The key is that "a.rkt" and (for-syntax "a.rkt") are 
different instantiations of the same module. The 'sfoo' at phase 1 only knows 
about 'foo' at phase 1. It does not know about the 'foo' bound at phase 0 
from a different instantiation, even from the same file.

So this means that if you have two functions in a module, one that produces a 
syntax object and one that matches on it (say using syntax/parse), the module 
needs to be imported once at the proper phase. The module can't be imported 
once at phase 0 and again at phase 1 and be expected to work.

;; x.rkt
#lang racket

(require (for-syntax syntax/parse)
         (for-template racket/base))
                  
(provide (all-defined-out))

(define foo 0)
(define (make) #'foo)
(define-syntax (process stx)
  (define-literal-set locals (foo))
  (syntax-parse stx
    [(_ (n (~literal foo))) #'#''ok]))

;; y.rkt
#lang racket

(require (for-meta 1 "x.rkt")
         (for-meta 2 "x.rkt" racket/base)
         ;; (for-meta 2 racket/base)
         )
         
(begin-for-syntax
  (define-syntax (m stx)
    (with-syntax ([out (make)])
      #'(process (0 out)))))
    
(define-syntax (p stx)
  (m))

(p)

$ racket y.rkt
process: expected the identifier `foo' at: foo in: (process (0 foo))

'make' is being used in y.rkt at phase 2 and returns the #'foo syntax object 
which knows that foo is bound at phase 0 inside y.rkt, and at phase 2 from 
(for-meta 2 "x.rkt"). The 'process' macro is imported at phase 1 from 
(for-meta 1 "x.rkt") and knows that foo should be bound at phase 1. So when the 
syntax-parse is executed inside 'process' it is looking for 'foo' bound at 
phase 1, but it sees a phase 2 binding and so doesn't match.

To fix this we can provide 'make' at phase 1 relative to x.rkt and just import 
it at phase 1 in y.rkt

;; x.rkt
#lang racket

(require (for-syntax syntax/parse)
         (for-template racket/base))
                  
(provide (all-defined-out))

(define foo 0)
(provide (for-syntax make))
(define-for-syntax (make) #'foo)
(define-syntax (process stx)
  (define-literal-set locals (foo))
  (syntax-parse stx
    [(_ (n (~literal foo))) #'#''ok]))

;; y.rkt
#lang racket

(require (for-meta 1 "x.rkt")
         ;; (for-meta 2 "x.rkt" racket/base)
         (for-meta 2 racket/base)
         )
         
(begin-for-syntax
  (define-syntax (m stx)
    (with-syntax ([out (make)])
      #'(process (0 out)))))
    
(define-syntax (p stx)
  (m))

(p)

$ racket y.rkt
'ok
____________________
  Racket Users list:
  http://lists.racket-lang.org/users
