Initial feedback on PAST-pm, or Partridge

Allison Randal Sun, 26 Nov 2006 20:30:55 -0800

Overall, the POST implementation is usable and I really like the new HLLcompiler module. I've got Punie working with the new toolchain to thepoint that it's generating valid PIR code for many low-level constructs,but some of the high-level constructs that worked under the previoustoolchain I still don't have working. I've done everything I can do withstraightforward translations of the existing code, and am now to thepoint where I'll have to do major conceptual refactors to fit with thenew toolchain.

I've already accumulated a good quantity of feedback for Patrick, so Ifigured I'd go ahead and send it out now. (Especially since some of mycomments may result in changes that will make it much easier to finishporting Punie.)

I had to poke into the guts of HLLCompiler, the new PAST, and the newPOST a fair bit in the process of getting Punie to work with them, so mycomments here are a mixture of user experience and implementationdetails. I've grouped my comments into general categories.


------

Available node types:

- There's no PAST::Stmt node type? I only see PAST::Stmts and PAST::Op.But statements are composed of multiple ops. So, everything is an op? Iwas using PAST::Stmt and PAST::Exp for a similar purpose to whatPOST::Ops performs. I've hacked it to use PAST::Stmts for this purpose,but it doesn't quite work.

- There's no PAST::Label node type? How do you represent labels in theHLL source?

- Is there no way to indicate what type of variable a PAST::Var is?Scalar/Array/Hash? (high-level types, not low-level types)


---

Meaningful naming: (Be kind to your compiler writers.)

- In the PAST nodes, I grok 'name' as the operator/function name of aPAST::Op and as the HLL variable name of a PAST::Var, but making it thevalue of a PAST::Val is going to far. It was 'value' in the old PAST,which makes more sense. You're passing named parameters into 'init', soI can't see a reason not to use a more meaningful name for the attribute.

- In PAST nodes, the attribute 'ctype' isn't actually storing a Clanguage type. Better name?

- The attribute 'vtype' is both variable type in POST::Var and valuetype in POST::Val. Handy generalization, but it's not clear from thename that 'vtype' is either of those things.

- The values for both 'ctype' and 'vtype' are obscure. Better toestablish a general system for representing types, than to include rawParrot types or 1-letter codes in the AST.

- In PAST nodes, consider the audience when choosing attribute nameslike 'ismy' (PAST::Var). Something like 'islexical' or 'isdeclaration'(I'm not sure which you mean), is friendlier to non-Perl users, andactually clearer even for Perl users.

- In PAST nodes again, I'm not clear on what 'pirop' (PAST::Op)represents. Is it the literal name of a PIR opcode, or a genericrepresentation of standard low-level operations? I'm more in favor ofthe latter. Better still, give compiler-writers a standard format lookuptable they can write to allow the PAST to POST tranformation to selectthe right PIR operation from the HLL op name. (See the comments onboundaries of abstraction.)

- In PAST nodes, the 'clone' method is now 'init'.'clone' was a terriblename, I agree, but 'init' isn't quite right either.

- In PAST nodes, the 'add_child' method is now 'push'. I liked'add_child' better, but, maybe what we really want is not a method atall, but a :vtable entry for an array push? Seems likely, since there'sreally not any other array-like behavior the syntax-tree nodes need to have.

- On module naming, I quickly regretted the naming of past2post.tg andpast2post_gen.pir (and all the related names) and changed them toPOSTGrammar.tg, POSTGrammar.pir, etc. in Punie. The .tg files aremodules, they're just modules written in a different language, so weshould standardize on module-style naming. Consider names likePOST/Grammar.tg and POST/Grammar.pir, or Partridge/Compiler/AST.tg andPartridge/Compiler/AST.pir (looking at it from the perspective of thecompilation source rather than the compilation result).

---

Clear boundaries between components: (Fuzzy boundaries of abstractionmake it difficult to allow for other implementations of the AST/OST orcustomization of the compiler object.)

- The 'compile' method doesn't belong in the PAST object, it belongs inHLLCompiler.

- The 'compile' method also doesn't belong in the main compilerexecutable, it belongs in HLLCompiler.


- Merge them into one 'compile' method in HLLCompiler.

- Provide an 'init' method for HLLCompiler that lets the compiler writerset which modules HLLCompiler will use for each stage of compilation.This will cover the majority of compilers without requiring eachcompiler writer to define their own 'compile' routine.

- Customization of HLLCompiler should be handled by creating a subclassof HLLCompiler. (The current 'register' strategy is somewhat fragile.)

- It would be easier to maintain (and create) the list of HLL to PIRoperator associations in something like a YAML file than embedded in theparser grammar file. The associations aren't needed until thePAST-to-POST transformation stage, leaving us with an AST free ofParrot-isms so, a) introspection into the AST keeps the user purely onthe HLL level, rather than immediately plunging them into Parrotinternals, and b) the AST could be passed along to some other compilerbackend (like Pugs).

At the very least, the 'pirop' property on parser rules could be handledby the PAST-to-POST transformation, so the compiler writer doesn't haveto manually pull those values out of the parser grammar's optable whencreating the AST. (If the parser grammar module was specified inHLLCompiler's 'init', then the compiler object would know where to lookfor the optable.)

---

Refactor into smaller units: (Easier to test, easier to maintain, easierto debug, and easier to create subclasses.)

- In HLLCompiler, instead of a monolithic 'command_line' method, splitthe components into independent methods all called from the'command_line' method. Good candidates are the file-reading andsource-preparation code, the interactive interpretation code, and thecode to compile a source file. (Comment: avoid 'bsr' and 'ret' anywhereand everywhere you can avoid them.)

- In HLLCompiler, split the 'compile' method out into independentmethods for each compilation stage ('compile_ast', 'compile_ost','compile_pir', etc.), all called from 'compile'.


---

Useful error detection: (Be kind to your compiler writers.)

- Returning the source code string from the 'compile' method inHLLCompiler when no compiler is registered isn't helpful. Throw anexception or give an error. Give the compiler writer a clue to what's wrong.

- Provide distinct errors/exceptions for failures at each stage ofcompilation to make it easy to figure out which stage is failing.


---

Side comments:

- In PGE grammars, what is the "{ ... }" at the end of every protodeclaration supposed to do? It seems like a dead weight. Is that wherethe Perl 6 code defining the operator supposed to go? Can we allow'proto' declarations to end in a semi-colon when there is no Perl 6 code?


------

Allison

Initial feedback on PAST-pm, or Partridge

Reply via email to