Stas,

On Thu, Sep 6, 2012 at 5:25 AM, Stas Malyshev <smalys...@sugarcrm.com> wrote:
> Hi!
>
> > Well, apart from perhaps leaving them with a simpler language that
> > doesn't have the inconsistencies and corner cases that currently exist
> > (and documented ad nauseam) not because of any design decision but
> > "because the parser is written that way".
>
> If you think writing new parser gets rid of all corner cases you are in
> for a big surprise. AST is not magic and parser will always be written
> exactly the way it is written - so if somebody won't implement certain
> feature in a consistent way, it won't be implemented in consistent way,
> AST or not.

Actually, that's not true. Right now, the parser is handling both syntax and a good bit of grammar. That's why we have so many reserved words. The compiler step implements some of the grammar, but the parser takes care of a significant amount of it.

With a move to AST-based parsing, the parser can be greatly simplified, with a very significant reduction in reserved words. This has a few benefits:

1. A reduced number of first-class tokens makes parsing the syntax potentially much more efficient. This comes at the expense of a more complicated compile step (building and processing the AST).

2. It also removes the need for the parser to worry about precedence. It parses for syntax only, and then lets the AST compiler step worry about operator precedence...

3. It allows the grammar to be extended without modifying the syntax. That means that PECL extensions could theoretically add compiler steps to extend not only functionality, but grammar as well. For example, it may be possible to add language rules (such as an inline keyword for functions, or pre-processor macros) that extend the language without modifying the parser (I say "may" because it depends strongly on the design of the parser and AST).

4. Since the parser doesn't directly emit opcodes, syntax errors (parse errors) could be 100% recoverable.
Compiler errors would be just as difficult to recover from, though.

5. It opens the door to leveraging third-party systems. For example, the Zend VM could hypothetically be replaced by an LLVM-based VM. That would allow for JIT-compiled PHP code. Note that this isn't HipHop (which supports only a limited subset of PHP), but full PHP running on a JIT VM. This could be implemented as a PECL extension, using the core parser and runtime environment and just swapping out the executor step... Obviously this would not be trivial to build, but right now, if you wanted to build it, you'd need to fork PHP to do it (which is why the existing compilers for PHP all use a different parser).

> And it's a bit late to take design decisions on existing PHP language,
> it seems to me.

It will never be easier to do than today. As time goes on, the language will continue to grow, and the syntax and grammar will only get more complicated from here on out. So the easiest time to do it is now...

Anthony
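
To make point 2 concrete, here is a minimal sketch (in Python, not PHP internals) of a syntax-only parse followed by a separate pass that applies operator precedence. Everything in it is hypothetical and for illustration only: the names `parse_flat`, `build_tree`, `to_ast` and the `PRECEDENCE` table are assumptions, and it handles nothing beyond integers and four binary operators.

```python
import re

# Hypothetical precedence table. It belongs to the later pass, not the parser.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse_flat(source):
    """Syntax-only pass: check that operands and operators alternate.

    Returns a flat list such as [1, "+", 2, "*", 3]; it knows nothing
    about which operator binds tighter.
    """
    tokens = re.findall(r"\d+|[-+*/]", source)
    if "".join(tokens) != source.replace(" ", ""):
        raise SyntaxError("unexpected character")
    if len(tokens) % 2 == 0:
        raise SyntaxError("expression must start and end with an operand")
    for i, tok in enumerate(tokens):
        if tok.isdigit() != (i % 2 == 0):
            raise SyntaxError(f"unexpected token {tok!r}")
    return [int(t) if t.isdigit() else t for t in tokens]


def build_tree(flat, min_prec=1, pos=0):
    """Precedence pass: fold the flat list into nested (op, lhs, rhs) tuples.

    Uses precedence climbing; returns the subtree and the next position.
    """
    lhs = flat[pos]
    pos += 1
    while pos < len(flat) and PRECEDENCE[flat[pos]] >= min_prec:
        op = flat[pos]
        # Requiring a strictly higher precedence on the right-hand side
        # makes operators of equal precedence left-associative.
        rhs, pos = build_tree(flat, PRECEDENCE[op] + 1, pos + 1)
        lhs = (op, lhs, rhs)
    return lhs, pos


def to_ast(source):
    """Run both passes: syntax first, precedence second."""
    tree, _ = build_tree(parse_flat(source))
    return tree


print(to_ast("1 + 2 * 3"))  # ('+', 1, ('*', 2, 3))
print(to_ast("8 - 3 - 2"))  # ('-', ('-', 8, 3), 2)
```

In this sketch, changing an entry in `PRECEDENCE` changes the resulting tree without touching `parse_flat`, which is the kind of separation points 2 and 3 describe. Whether a real PHP parser could be split this cleanly depends on the design, as noted above.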