Patrick R. Michaud wrote:
On Mon, Nov 27, 2006 at 09:20:08PM -0800, Allison Randal wrote:

chromatic's suggestion is to replace the series of manual calls in HLLCompiler's 'compile' method with an iterator over an array of compiler tasks.

I very much agree with chromatic -- indeed, this is mainly why I didn't
go with putting "ostgrammar" methods into the HLLCompiler object
before.  Having HLLCompiler effectively hardcode a sequence
of parser-astgrammar-ostgrammar feels a bit heavy-handed to me,
almost saying that "we really expect you to always have exactly
the sequence source->parse->ast->ost->pir->bytecode, and you're
definitely using TGE for the intermediate steps".

The patch I sent is the first step toward making chromatic's suggestion work. The problem with the current implementation is that each stage decides what the next stage will be. If the PAST-to-POST transformation calls the POST-to-PIR transformation before returning, then you can't easily insert an additional stage between the two.

I guess if we expect a lot of compilers to be making language-specific
derivations or replacements of the ast->ost stage then putting the
ost specifications into HLLCompiler makes some sense, but I
totally agree with chromatic that a more generic approach is
needed here.  And what I had been aiming for in terms of "array
of compiler tasks" was something like "array of compiler stages",
where each compiler stage is itself a "compiler" (in the compreg
and HLL compiler sense) that does the transformation to the
next item in the list.  And each compiler stage knows the
details of how it performs its transformation, whether that's using
TGE or some other method.

I completely agree on the idea of giving each stage its own compiler, and making that compiler aware of everything it needs to know to perform its own stage of compilation. I also completely agree on putting as little code as possible for performing the compilation into the HLLCompiler module.

Where we diverge is that I don't want the compiler for one stage to know anything about the next stage. Each stage should operate independently, and only the HLLCompiler should control the order of stages.

Part of me really wishes that each compiler task would end
up being a standardized 'apply' or 'compile' subroutine
or method of each stage.  In other words, to have compilation
effectively become a sequence like:

    .local pmc code
    # source to parse tree
    $P0 = get_hll_global ['Perl6::Grammar'], 'apply'
    code = $P0(code, adverbs :flat :named)

    # parse tree to ast
    $P0 = get_hll_global ['Perl6::PAST::Grammar'], 'apply'
    code = $P0(code, adverbs :flat :named)

    # ast to ost
    $P0 = get_hll_global ['POST::Grammar'], 'apply'
    code = $P0(code, adverbs :flat :named)

    # ost to result
    $P0 = get_hll_global ['POST::Compiler'], 'apply'
    code = $P0(code, adverbs :flat :named)

Here the 'apply' functions in Perl6::PAST::Grammar and
POST::Grammar are simply imported from TGE and do the steps
of creating the builder object and then applying the grammar.
The 'apply' function in Perl6::Grammar would just be a
standardized start rule for the parser grammar (and can
be directly specified as such in the .pg file).

If we could standardize at this level, then a compiler simply
specifies the sequence of things to be applied, and the above
instructions could be implemented with a simple iterator over
the sequence. This is _really_ what I was attempting to get at by having separate compiler objects for PAST, POST, and friends, except that instead of calling the standard function 'apply' I was using 'compile'.
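
To make the "simple iterator over the sequence" idea concrete, here is a minimal sketch. It is written in Python rather than PIR purely so it can be run standalone. The stage classes (Parse, ToAST, KeepNumbers, Emit), the method name 'transform', and the driver 'compile_with' are all hypothetical stand-ins, not anything that exists in HLLCompiler or TGE. The only point is that the driver owns the ordering, so an extra stage can be dropped in without touching its neighbours.

```python
# Hypothetical toy stages: each one exposes the same 'transform' method
# and knows nothing about which stage runs before or after it.

class Parse:
    """Source text -> 'parse tree' (here just a list of tokens)."""
    def transform(self, code, **adverbs):
        return code.split()

class ToAST:
    """Parse tree -> 'AST' (here a list of (kind, value) pairs)."""
    def transform(self, code, **adverbs):
        return [("num" if tok.isdigit() else "op", tok) for tok in code]

class KeepNumbers:
    """An extra AST -> AST stage, used to show insertion into the sequence."""
    def transform(self, code, **adverbs):
        return [(kind, value) for kind, value in code if kind == "num"]

class Emit:
    """'AST' -> output text; the separator comes from the adverbs."""
    def transform(self, code, **adverbs):
        sep = adverbs.get("sep", " ")
        return sep.join(f"{kind}:{value}" for kind, value in code)

def compile_with(stages, code, **adverbs):
    # The driver alone decides the order; the adverbs are passed to every stage.
    for stage in stages:
        code = stage.transform(code, **adverbs)
    return code

stages = [Parse(), ToAST(), Emit()]
plain = compile_with(stages, "1 + 2")

# Inserting a stage is a change to the list, not to any existing stage.
stages.insert(2, KeepNumbers())
filtered = compile_with(stages, "1 + 2", sep=",")
```

With the three original stages, "1 + 2" comes out as "num:1 op:+ num:2". After the insert, the same input with sep="," gives "num:1,num:2".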

Hm.... actually, I like this a lot better than registering a compiler for POST and retrieving it by 'compreg'. I would push it one step farther, though. Instead of setting 'astgrammar' in HLLCompiler's 'init' method, set 'astcompiler'.

The revised method for a stage (using the parse-tree-to-AST as an example) would be as follows, where the method only performs error checks to make sure that it got a valid class name, creates a compiler object for that stage, and calls 'compile'. (Here I'm using the naming scheme from below.)

.sub 'compile_parse_tree' :method
    .param pmc source
    .param pmc adverbs         :slurpy :named
    .local string ptcompiler_name
    .local pmc ptcompiler
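    # Name of the compiler class configured for the parse tree stage
    ptcompiler_name = self.'ptcompiler'()
    unless ptcompiler_name goto err_no_ptcompiler
    # Look up the class by name and instantiate a compiler for this stage
    $I0 = find_type ptcompiler_name
    ptcompiler = new $I0
    .return ptcompiler.'compile'(source)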

  err_no_ptcompiler:
    $P0 = new .Exception
    $P0['_message'] = 'Missing ptcompiler in compiler'
    throw $P0
.end

For now, we create a separate compiler object for each tree grammar, but ultimately TGE could generate the appropriate 'compile' method in each generated tree grammar class.
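
The shape of that method (check that a class name was configured, instantiate the class, delegate to 'compile') can also be sketched outside PIR. The following is a rough Python analogue, not the HLLCompiler code: the REGISTRY dict stands in for Parrot's class lookup, and ParseTreeCompiler, HLLCompilerSketch, and MissingStageCompiler are names invented for the example.

```python
class MissingStageCompiler(Exception):
    """Raised when no compiler class name was configured for a stage."""

class ParseTreeCompiler:
    """Hypothetical stage compiler: parse tree -> 'AST' (a tagged tuple)."""
    def compile(self, source, **adverbs):
        return ("ast", source)

# Assumption: a plain dict plays the role of looking a class up by name.
REGISTRY = {"ParseTree::Compiler": ParseTreeCompiler}

class HLLCompilerSketch:
    def __init__(self, ptcompiler=None):
        self._ptcompiler = ptcompiler

    def ptcompiler(self):
        # Accessor for the configured class name, as in self.'ptcompiler'().
        return self._ptcompiler

    def compile_parse_tree(self, source, **adverbs):
        name = self.ptcompiler()
        if not name:
            raise MissingStageCompiler("Missing ptcompiler in compiler")
        stage = REGISTRY[name]()          # create the compiler for this stage
        return stage.compile(source, **adverbs)

compiler = HLLCompilerSketch(ptcompiler="ParseTree::Compiler")
ast = compiler.compile_parse_tree("parse-tree")
```

The method itself contains no transformation logic. Everything specific to the stage lives in the class it instantiates, which is what keeps the HLLCompiler side small.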

Part of me thinks that 'apply' and
'compile' are pretty much the same thing, in the sense that both refer to using some sort of transformer "thing" to
change from a source representation into an equivalent target.

Yeah, both good, but neither seems quite right: 'apply' is so generic that it's nearly meaningless, and 'compile' is perfect when the grammar is being used as a stage in the compiler tools, but seems odd when it's being used to transform other kinds of trees. chromatic suggests 'transform', which I like best of all.

At any rate, even if we go with the approach outlined in the
patch, I have to say that I'm not at all keen on the method
names 'astcompile', 'ostcompile', etc. in the patch. When I read 'astcompile' it sounds to me like it's a method to compile an ast into something else, when in fact the method in the patch is compiling some source into an ast. (By analogy, we speak of "Perl 6 compiler" and "PIR compiler" as being things that consume Perl 6 and PIR, not the things that produce Perl 6 or PIR.)

So at the very least I'd prefer to have those methods called
'get_ast' or 'make_ast' or something much less likely to
cause confusion.  Indeed, the reason why I went with simple
'parse' and 'ast' method names in the original is because the method name tells me what it is that I'm getting back, much like an accessor.

Yeah, I had the same problem. The reason I changed the method name from 'ast' is that I initially thought it was transforming the AST, when it was actually generating the AST (and even after I knew what it was doing the name confused me a couple times). We have the same problem in the modules too. POST::Grammar is the grammar that creates a POST tree, but POST::Compiler is the compiler that transforms a POST tree to something else.

So, when we name a particular stage, are we naming it by what it produces, or naming it by the input it takes? When we visualize the compiler stages, it's all about completed constructs: the parse tree, the AST, the OST, the PIR source, but the code is all about the transitions between the stages. How about we standardize around your concept of naming the stage by what it consumes (e.g. "Perl 6 compiler")? That would give us:

Parsing stage: method named 'parse', grammar is (for example) 'Perl6::Grammar', output is a parse tree.

Parse tree stage: method named 'compile_parse_tree', compiler is 'ParseTree::Compiler', grammar is 'ParseTree::Grammar', output is an AST.

AST stage: method named 'compile_ast', compiler is 'AST::Compiler', grammar is 'AST::Grammar', output is OST.

OST stage: method named 'compile_ost', compiler is 'OST::Compiler', grammar is 'OST::Grammar', output is PIR.

PIR stage: simple method named 'run_pir' that compiles and runs PIR code. (Could call the method 'compile_pir', it's more standard, but less clear.)

---

For Punie, I'm thinking of standardizing on:
Punie::Grammar
Punie::Compiler::ParseTree
Punie::Compiler::AST
Punie::Compiler::OST

(After adding the 'compile' or 'transform' method to TGE's generator so we only need one class for each stage, instead of separate 'Compiler' and 'Grammar' classes.) Or maybe 'Punie::Grammar' should be 'Punie::Compiler::Parser' instead. 'Punie::Compiler' would be a subclass of HLLCompiler if Punie needed one, but it doesn't need one at this point.

Allison
