Overall, the POST implementation is usable and I really like the new HLL compiler module. I've got Punie working with the new toolchain to the point that it's generating valid PIR code for many low-level constructs, but some of the high-level constructs that worked under the previous toolchain I still don't have working. I've done everything I can do with straightforward translations of the existing code, and am now to the point where I'll have to do major conceptual refactors to fit with the new toolchain.

I've already accumulated a good quantity of feedback for Patrick, so I figured I'd go ahead and send it out now. (Especially since some of my comments may result in changes that will make it much easier to finish porting Punie.)

I had to poke into the guts of HLLCompiler, the new PAST, and the new POST a fair bit in the process of getting Punie to work with them, so my comments here are a mixture of user experience and implementation details. I've grouped my comments into general categories.

------

Available node types:

- There's no PAST::Stmt node type? I only see PAST::Stmts and PAST::Op. But statements are composed of multiple ops. So, everything is an op? I was using PAST::Stmt and PAST::Exp for a similar purpose to what POST::Ops performs. I've hacked it to use PAST::Stmts for this purpose, but it doesn't quite work.

- There's no PAST::Label node type? How do you represent labels in the HLL source?

- Is there no way to indicate what type of variable a PAST::Var is? Scalar/Array/Hash? (high-level types, not low-level types)

---

Meaningful naming: (Be kind to your compiler writers.)

- In the PAST nodes, I grok 'name' as the operator/function name of a PAST::Op and as the HLL variable name of a PAST::Var, but making it the value of a PAST::Val is going to far. It was 'value' in the old PAST, which makes more sense. You're passing named parameters into 'init', so I can't see a reason not to use a more meaningful name for the attribute.

- In PAST nodes, the attribute 'ctype' isn't actually storing a C language type. Better name?

- The attribute 'vtype' is both variable type in POST::Var and value type in POST::Val. Handy generalization, but it's not clear from the name that 'vtype' is either of those things.

- The values for both 'ctype' and 'vtype' are obscure. Better to establish a general system for representing types, than to include raw Parrot types or 1-letter codes in the AST.

- In PAST nodes, consider the audience when choosing attribute names like 'ismy' (PAST::Var). Something like 'islexical' or 'isdeclaration' (I'm not sure which you mean), is friendlier to non-Perl users, and actually clearer even for Perl users.

- In PAST nodes again, I'm not clear on what 'pirop' (PAST::Op) represents. Is it the literal name of a PIR opcode, or a generic representation of standard low-level operations? I'm more in favor of the latter. Better still, give compiler-writers a standard format lookup table they can write to allow the PAST to POST tranformation to select the right PIR operation from the HLL op name. (See the comments on boundaries of abstraction.)

- In PAST nodes, the 'clone' method is now 'init'.'clone' was a terrible name, I agree, but 'init' isn't quite right either.

- In PAST nodes, the 'add_child' method is now 'push'. I liked 'add_child' better, but, maybe what we really want is not a method at all, but a :vtable entry for an array push? Seems likely, since there's really not any other array-like behavior the syntax-tree nodes need to have.

- On module naming, I quickly regretted the naming of past2post.tg and past2post_gen.pir (and all the related names) and changed them to POSTGrammar.tg, POSTGrammar.pir, etc. in Punie. The .tg files are modules, they're just modules written in a different language, so we should standardize on module-style naming. Consider names like POST/Grammar.tg and POST/Grammar.pir, or Partridge/Compiler/AST.tg and Partridge/Compiler/AST.pir (looking at it from the perspective of the compilation source rather than the compilation result).

---

Clear boundaries between components: (Fuzzy boundaries of abstraction make it difficult to allow for other implementations of the AST/OST or customization of the compiler object.)

- The 'compile' method doesn't belong in the PAST object, it belongs in HLLCompiler.

- The 'compile' method also doesn't belong in the main compiler executable, it belongs in HLLCompiler.

- Merge them into one 'compile' method in HLLCompiler.

- Provide an 'init' method for HLLCompiler that lets the compiler writer set which modules HLLCompiler will use for each stage of compilation. This will cover the majority of compilers without requiring each compiler writer to define their own 'compile' routine.

- Customization of HLLCompiler should be handled by creating a subclass of HLLCompiler. (The current 'register' strategy is somewhat fragile.)

- It would be easier to maintain (and create) the list of HLL to PIR operator associations in something like a YAML file than embedded in the parser grammar file. The associations aren't needed until the PAST-to-POST transformation stage, leaving us with an AST free of Parrot-isms so, a) introspection into the AST keeps the user purely on the HLL level, rather than immediately plunging them into Parrot internals, and b) the AST could be passed along to some other compiler backend (like Pugs).

At the very least, the 'pirop' property on parser rules could be handled by the PAST-to-POST transformation, so the compiler writer doesn't have to manually pull those values out of the parser grammar's optable when creating the AST. (If the parser grammar module was specified in HLLCompiler's 'init', then the compiler object would know where to look for the optable.)

---

Refactor into smaller units: (Easier to test, easier to maintain, easier to debug, and easier to create subclasses.)

- In HLLCompiler, instead of a monolithic 'command_line' method, split the components into independent methods all called from the 'command_line' method. Good candidates are the file-reading and source-preparation code, the interactive interpretation code, and the code to compile a source file. (Comment: avoid 'bsr' and 'ret' anywhere and everywhere you can avoid them.)

- In HLLCompiler, split the 'compile' method out into independent methods for each compilation stage ('compile_ast', 'compile_ost', 'compile_pir', etc.), all called from 'compile'.

---

Useful error detection: (Be kind to your compiler writers.)

- Returning the source code string from the 'compile' method in HLLCompiler when no compiler is registered isn't helpful. Throw an exception or give an error. Give the compiler writer a clue to what's wrong.

- Provide distinct errors/exceptions for failures at each stage of compilation to make it easy to figure out which stage is failing.

---

Side comments:

- In PGE grammars, what is the "{ ... }" at the end of every proto declaration supposed to do? It seems like a dead weight. Is that where the Perl 6 code defining the operator supposed to go? Can we allow 'proto' declarations to end in a semi-colon when there is no Perl 6 code?

------

Allison

Reply via email to