Overall, the POST implementation is usable and I really like the new HLL
compiler module. I've got Punie working with the new toolchain to the
point that it's generating valid PIR code for many low-level constructs,
but some of the high-level constructs that worked under the previous
toolchain I still don't have working. I've done everything I can do with
straightforward translations of the existing code, and am now to the
point where I'll have to do major conceptual refactors to fit with the
new toolchain.
I've already accumulated a good quantity of feedback for Patrick, so I
figured I'd go ahead and send it out now. (Especially since some of my
comments may result in changes that will make it much easier to finish
porting Punie.)
I had to poke into the guts of HLLCompiler, the new PAST, and the new
POST a fair bit in the process of getting Punie to work with them, so my
comments here are a mixture of user experience and implementation
details. I've grouped my comments into general categories.
------
Available node types:
- There's no PAST::Stmt node type? I only see PAST::Stmts and PAST::Op.
But statements are composed of multiple ops. So, everything is an op? I
was using PAST::Stmt and PAST::Exp for a similar purpose to what
POST::Ops performs. I've hacked it to use PAST::Stmts for this purpose,
but it doesn't quite work.
- There's no PAST::Label node type? How do you represent labels in the
HLL source?
- Is there no way to indicate what type of variable a PAST::Var is?
Scalar/Array/Hash? (high-level types, not low-level types)
---
Meaningful naming: (Be kind to your compiler writers.)
- In the PAST nodes, I grok 'name' as the operator/function name of a
PAST::Op and as the HLL variable name of a PAST::Var, but making it the
value of a PAST::Val is going to far. It was 'value' in the old PAST,
which makes more sense. You're passing named parameters into 'init', so
I can't see a reason not to use a more meaningful name for the attribute.
- In PAST nodes, the attribute 'ctype' isn't actually storing a C
language type. Better name?
- The attribute 'vtype' is both variable type in POST::Var and value
type in POST::Val. Handy generalization, but it's not clear from the
name that 'vtype' is either of those things.
- The values for both 'ctype' and 'vtype' are obscure. Better to
establish a general system for representing types, than to include raw
Parrot types or 1-letter codes in the AST.
- In PAST nodes, consider the audience when choosing attribute names
like 'ismy' (PAST::Var). Something like 'islexical' or 'isdeclaration'
(I'm not sure which you mean), is friendlier to non-Perl users, and
actually clearer even for Perl users.
- In PAST nodes again, I'm not clear on what 'pirop' (PAST::Op)
represents. Is it the literal name of a PIR opcode, or a generic
representation of standard low-level operations? I'm more in favor of
the latter. Better still, give compiler-writers a standard format lookup
table they can write to allow the PAST to POST tranformation to select
the right PIR operation from the HLL op name. (See the comments on
boundaries of abstraction.)
- In PAST nodes, the 'clone' method is now 'init'.'clone' was a terrible
name, I agree, but 'init' isn't quite right either.
- In PAST nodes, the 'add_child' method is now 'push'. I liked
'add_child' better, but, maybe what we really want is not a method at
all, but a :vtable entry for an array push? Seems likely, since there's
really not any other array-like behavior the syntax-tree nodes need to have.
- On module naming, I quickly regretted the naming of past2post.tg and
past2post_gen.pir (and all the related names) and changed them to
POSTGrammar.tg, POSTGrammar.pir, etc. in Punie. The .tg files are
modules, they're just modules written in a different language, so we
should standardize on module-style naming. Consider names like
POST/Grammar.tg and POST/Grammar.pir, or Partridge/Compiler/AST.tg and
Partridge/Compiler/AST.pir (looking at it from the perspective of the
compilation source rather than the compilation result).
---
Clear boundaries between components: (Fuzzy boundaries of abstraction
make it difficult to allow for other implementations of the AST/OST or
customization of the compiler object.)
- The 'compile' method doesn't belong in the PAST object, it belongs in
HLLCompiler.
- The 'compile' method also doesn't belong in the main compiler
executable, it belongs in HLLCompiler.
- Merge them into one 'compile' method in HLLCompiler.
- Provide an 'init' method for HLLCompiler that lets the compiler writer
set which modules HLLCompiler will use for each stage of compilation.
This will cover the majority of compilers without requiring each
compiler writer to define their own 'compile' routine.
- Customization of HLLCompiler should be handled by creating a subclass
of HLLCompiler. (The current 'register' strategy is somewhat fragile.)
- It would be easier to maintain (and create) the list of HLL to PIR
operator associations in something like a YAML file than embedded in the
parser grammar file. The associations aren't needed until the
PAST-to-POST transformation stage, leaving us with an AST free of
Parrot-isms so, a) introspection into the AST keeps the user purely on
the HLL level, rather than immediately plunging them into Parrot
internals, and b) the AST could be passed along to some other compiler
backend (like Pugs).
At the very least, the 'pirop' property on parser rules could be handled
by the PAST-to-POST transformation, so the compiler writer doesn't have
to manually pull those values out of the parser grammar's optable when
creating the AST. (If the parser grammar module was specified in
HLLCompiler's 'init', then the compiler object would know where to look
for the optable.)
---
Refactor into smaller units: (Easier to test, easier to maintain, easier
to debug, and easier to create subclasses.)
- In HLLCompiler, instead of a monolithic 'command_line' method, split
the components into independent methods all called from the
'command_line' method. Good candidates are the file-reading and
source-preparation code, the interactive interpretation code, and the
code to compile a source file. (Comment: avoid 'bsr' and 'ret' anywhere
and everywhere you can avoid them.)
- In HLLCompiler, split the 'compile' method out into independent
methods for each compilation stage ('compile_ast', 'compile_ost',
'compile_pir', etc.), all called from 'compile'.
---
Useful error detection: (Be kind to your compiler writers.)
- Returning the source code string from the 'compile' method in
HLLCompiler when no compiler is registered isn't helpful. Throw an
exception or give an error. Give the compiler writer a clue to what's wrong.
- Provide distinct errors/exceptions for failures at each stage of
compilation to make it easy to figure out which stage is failing.
---
Side comments:
- In PGE grammars, what is the "{ ... }" at the end of every proto
declaration supposed to do? It seems like a dead weight. Is that where
the Perl 6 code defining the operator supposed to go? Can we allow
'proto' declarations to end in a semi-colon when there is no Perl 6 code?
------
Allison