Re: Initial feedback on PAST-pm, or Partridge

Allison Randal Mon, 27 Nov 2006 01:14:15 -0800

I'll split my replies into separate threads to make it easier to wrap our brains around individual chunks.


Patrick R. Michaud wrote:

Clear boundaries between components: (Fuzzy boundaries of abstraction make it difficult to allow for other implementations of the AST/OST or customization of the compiler object.)
- The 'compile' method doesn't belong in the PAST object, it belongs in HLLCompiler.
...


After a lot of thought and false starts, I ended up taking a
different approach to compilation than the "HLLCompiler specifies
the complete sequence of transformations".  Essentially I've taken
the approach that a "compiler" is simply something that transforms
a source data structure into a target data structure, and so
what we really have is a sequence of "compilers".  To this end,
I really wanted to call my compiler base class 'Compiler'
and not 'HLLCompiler', but unfortunately that classname is already
used by Parrot for something else and so 'HLLCompiler' is what I
chose until that could be resolved.  The 'HLL' probably implies
more than I intended to imply.

So, the 'Abc' compiler really is just something that converts the
'bc' language into a PAST structure, after doing that it simply
hands the result off to the 'PAST-pm' compiler.  Similarly,
the 'PAST' compiler translates into POST and hands the result
off to the POST compiler, and POST simply does its thing and
returns a PIR or executable result.
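
The "sequence of compilers" idea described above can be sketched roughly as follows. This is a toy model in Python rather than PIR so that it runs standalone. The `Compiler` class, its `stop_after` parameter, and the tagged tuples standing in for PAST/POST/PIR are all assumptions made for illustration, not part of Parrot or PAST-pm.

```python
# Illustrative model only: these names are hypothetical, not Parrot's API.
class Compiler:
    """Transforms a source structure into a target structure, then hands
    the result to the next compiler in the chain, if there is one."""

    def __init__(self, name, transform, next_compiler=None):
        self.name = name
        self.transform = transform
        self.next_compiler = next_compiler

    def compile(self, source, stop_after=None):
        result = self.transform(source)
        # Return this stage's output if the caller asked to stop here,
        # or if there is no further stage to hand off to.
        if stop_after == self.name or self.next_compiler is None:
            return result
        return self.next_compiler.compile(result, stop_after)


# Toy stand-ins for the real stages: each wraps its input in a tuple
# tagged with the name of the structure it is pretending to produce.
post = Compiler("POST", lambda ost: ("PIR", ost))
past = Compiler("PAST", lambda ast: ("POST", ast), next_compiler=post)
abc = Compiler("abc", lambda src: ("PAST", src), next_compiler=past)

full = abc.compile("1 + 2")                      # runs every stage
partial = abc.compile("1 + 2", stop_after="abc")  # stops after the first stage
```

In this model each stage knows only about its immediate successor, which is the property the paragraph above relies on: there is no central object that lists the whole sequence of transformations.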

Let's take a couple steps back. The compiler module is really like Test::Builder. It's the infrastructure code that provides standard functionality to all compiler writers. Standardization is good, it means we don't have 500 incompatible implementations of 'ok'. (Actually, we still have non-standard implementations of 'ok' floating around, and they're a major headache. All the more reason to standardize the compiler tools early on.)

With tests, each test file does one thing (tests a chunk of code, says 'ok' or 'not ok' multiple times). The individual tests don't need to each duplicate the infrastructure code. Test::Harness provides the infrastructure, progresses through all the tests, maintains meta-information as it goes, and summarizes at the end.

With compiler modules, the individual PGE and TGE modules each do one thing, take in the "source code" in one form and output it in another form. There's no need to re-write the infrastructure code into the syntax tree modules for every stage of compilation. Let Compiler::Builder (or Compiler::Harness, or whatever we call it) handle the infrastructure.

- The 'compile' method also doesn't belong in the main compiler executable, it belongs in HLLCompiler.
- Merge them into one 'compile' method in HLLCompiler.
- Customization of HLLCompiler should be handled by creating a subclass of HLLCompiler. (The current 'register' strategy is somewhat fragile.)
I don't have any problem with having each language subclass
HLLCompiler and override the 'compile' method in each, I'll
work on that soon.  Of course, the method still ends up one way
or another in the main compiler executable, it may simply change
the namespace.

The point is that 99% of compiler writers shouldn't need to write any code for the 'compile' method at all.

- Provide an 'init' method for HLLCompiler that lets the compiler writer set which modules HLLCompiler will use for each stage of compilation. This will cover the majority of compilers without requiring each compiler writer to define their own 'compile' routine.
Because of the multi-stage approach I've taken, the compile
routines are already fairly short, and to me they're not at all
onerous for a compiler writer to create.  For each of languages/abc/,
languages/APL/, and languages/perl6/ the 'compile' method is
less than 30 lines of PIR.  (And it will only require a couple
of lines of code to abstract the existing call to 'compile' methods
of PAST/POST to instead use PAST/POST compilers.)

a) Most compilers will simply cut-n-paste an existing 'compile' routine from an existing compiler. Cut-n-paste programming is a "code smell" and a maintenance headache.

b) Why require the compiler writer to write 30 lines of code when they could write one? The entire core executable for a compiler could consist of nothing but:


.sub '__onload' :load :init
    # load your modules
    $P1 = new [ 'HLLCompiler' ]
    $P1.'init'('language'=>'punie', 'parse_grammar'=>'Punie::Parser', 'ast_grammar'=>'Punie::AST::Grammar')
.end

.sub 'main' :main
    .param pmc args
    $P0 = compreg 'punie'
    $P1 = $P0.'command_line'(args)
    .return ($P1)
.end

That's a great selling point to new compiler writers. (And I'd be even happier if we could export the 'main' routine from HLLCompiler instead of cut-n-pasting it.)

I also think that many compilers may end up with compiler-specific
option flags or other items that need to be taken care of, and it
seems to me that this is more easily handled by a method definition
than a module specification.

Some will, but subclassing Compiler::Builder is a familiar and straightforward process, and will give them all the flexibility they need to customize its behavior, not just the 'compile' routine. Optimize for the common case, be flexible enough for the complex case.

(If the parser grammar module was specified in HLLCompiler's 'init', then the compiler object would know where to look for the optable.)
I'm thinking this is really a parameter to the AST compiler...

It's infrastructure code. Any stage of compilation may need access to the optable, so the information on where to find it belongs in the meta-object that is governing all the compilation stages. (Generating the optable I'll leave for a different thread.)

- In HLLCompiler, split the 'compile' method out into independent methods for each compilation stage ('compile_ast', 'compile_ost', 'compile_pir', etc.), all called from 'compile'.
Again, I tend to think of this as being all separate compilers,
each of which automatically calls its default next stage until
compiler options tell it to do otherwise.


Standardized infrastructure code good. Make Ogg-itect happy. :)

Once we have a standardized infrastructure, it opens up lots of possibilities. Like, how about a subclass of Compiler::Builder that accumulates statistics about the time spent on each stage of compilation and reports it at the end of the compile? Or language smoke-testing reports on the website broken down by compile stage? ("This test was successful through the POST stage, but this one never made it through the parse.")
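
As a rough sketch of the per-stage timing idea, here is a toy model in Python (not PIR, so that it runs standalone). It assumes a hypothetical `CompilerBuilder` base class whose `compile` calls one method per stage, using the stage names suggested earlier in this message, and a subclass that times each stage. None of these names are existing Parrot APIs, and the stage bodies are placeholders.

```python
import time


# Illustrative model only: class and method names are hypothetical.
class CompilerBuilder:
    """Shared infrastructure: 'compile' runs each stage method in order."""

    stages = ("compile_ast", "compile_ost", "compile_pir")

    def compile(self, source):
        result = source
        for stage in self.stages:
            result = self.run_stage(stage, result)
        return result

    def run_stage(self, stage, data):
        # Single hook that subclasses can wrap to observe every stage.
        return getattr(self, stage)(data)

    # Placeholder stage bodies: each just tags its input.
    def compile_ast(self, source):
        return ("AST", source)

    def compile_ost(self, ast):
        return ("OST", ast)

    def compile_pir(self, ost):
        return ("PIR", ost)


class TimedCompilerBuilder(CompilerBuilder):
    """Subclass that records how long each stage took."""

    def __init__(self):
        self.timings = {}

    def run_stage(self, stage, data):
        start = time.perf_counter()
        try:
            return super().run_stage(stage, data)
        finally:
            # Recorded even if the stage raises, so a failed compile
            # still shows how far it got.
            self.timings[stage] = time.perf_counter() - start

    def report(self):
        return [f"{stage}: {seconds:.6f}s"
                for stage, seconds in self.timings.items()]


compiler = TimedCompilerBuilder()
output = compiler.compile("print 1")
```

Because the subclass only overrides the `run_stage` hook, individual language compilers would not need to change for the statistics to be collected; the same hook could record which stage a smoke test failed in.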


Allison
