Re: [PHP-DEV] Re: Moving to an AST-based parsing/compilation process

Anthony Ferrara Thu, 06 Sep 2012 05:02:02 -0700

Stas,

On Thu, Sep 6, 2012 at 5:25 AM, Stas Malyshev <smalys...@sugarcrm.com> wrote:


> Hi!
>
> > Well, apart from perhaps leaving them with a simpler language that
> > doesn't have the inconsistencies and corner cases that currently exist
> > (and documented ad nauseum) not because of any design decision but
> > "because the parser is written that way".
>
> If you think writing new parser gets rid of all corner cases you are in
> for a big surprise. AST is not magic and parser will always be written
> exactly the way it is written - so if somebody won't implement certain
> feature in a consistent way, it won't be implemented in consistent way,
> AST or not.
>

Actually, that's not true. Right now, the parser handles both the syntax and
a good bit of the grammar, which is why we have so many reserved words. The
compiler step implements some of the grammar, but the parser takes care of
a significant amount of it.
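
To make the reserved-word point concrete, here is a minimal sketch in Python of a toy lexer (not PHP's actual lexer or grammar). The keyword table, the token names and the helper functions are all hypothetical. It only shows that when the lexer turns a word into its own token kind everywhere, a rule like "function NAME" can no longer accept that word as a name, whereas a smaller reserved set leaves the decision to a later, context-aware step:

```python
import re

# Hypothetical keyword table for illustration; not PHP's real list.
RESERVED = {"function", "list", "print"}
WORD = re.compile(r"[A-Za-z_]\w*|\S")


def kind_of(tok, reserved):
    """Classify one token; reserved words get a dedicated token kind."""
    if tok in reserved:
        return "T_" + tok.upper()
    if tok[0].isalpha() or tok[0] == "_":
        return "IDENT"
    return "PUNCT"


def lex(source, reserved):
    """Split the source into (kind, text) pairs using the given reserved set."""
    return [(kind_of(t, reserved), t) for t in WORD.findall(source)]


def function_name(tokens):
    """Toy grammar rule: T_FUNCTION must be followed by an IDENT token."""
    (kind0, _), (kind1, text1) = tokens[0], tokens[1]
    if kind0 != "T_FUNCTION" or kind1 != "IDENT":
        raise ValueError(f"unexpected {kind1} after 'function'")
    return text1
```

With the full table, `function_name(lex("function list()", RESERVED))` raises, because `list` arrives as `T_LIST`. With only `{"function"}` reserved, the same input is accepted and what `list` means is left to a later stage.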

With a move to AST-based parsing, the parser can be greatly simplified,
with a very significant reduction in reserved words. This has a few
benefits:

1. Reduced number of first-class tokens makes parsing the syntax
potentially much more efficient. This is at the expense of a more
complicated compiling step (building and processing the AST).

2. It also removes the need for the parser to worry about precedence. It's
parsing for syntax only, and then lets the AST compiler step worry about
operator precedence...

3. It provides the ability for the grammar to be extended without modifying
the syntax. That means that PECL extensions could theoretically add
compiler steps to not only extend functionality, but grammar as well. For
example, it may be possible to add language rules (such as an inline
keyword for functions, or pre-processor macros) that allow for extension of
the language without modifying the parser (I say may, because it depends
strongly on the design of the parser and AST).

4. Since the parser doesn't directly emit opcodes, syntax errors (parse
errors) could be 100% recoverable. Compiler errors would be just as
difficult to recover from, though.

5. It opens the door to leveraging 3rd-party systems. For example, the Zend
VM could hypothetically be replaced by an LLVM-based VM. That would allow
for JIT-compiled PHP code. Note that this isn't HipHop (which is a limited
subset of PHP), but full PHP running on a JIT VM. This could be implemented
as a PECL extension, utilizing the core parser and runtime environment, just
swapping out the executor step... Obviously this would not be trivial to
build, but right now if you wanted to build it you'd need to fork PHP to do
it (which is why the existing compilers for PHP all use a different parser).
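
The points above can be sketched as a tiny pipeline. This is an illustration in Python over a made-up arithmetic language, under the assumption that the design looks roughly like "parse, build AST, run passes, hand off to a backend". None of the names (`parse`, `build_ast`, `fold_constants`, `compile_source`, the `PUSH`/`ADD`/`MUL` instructions) come from PHP or the Zend engine:

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, BinOp]
PRECEDENCE = {"+": 1, "*": 2}  # lives in the compile step, not the parser


class ParseError(Exception):
    """Raised before anything is generated, so callers can just catch it."""


def parse(source: str) -> List[Union[Num, str]]:
    """Syntax only: check operand/operator alternation, return a flat list."""
    tokens = source.split()
    if len(tokens) % 2 == 0:
        raise ParseError(f"malformed expression: {source!r}")
    flat: List[Union[Num, str]] = []
    for i, tok in enumerate(tokens):
        if i % 2 == 0:
            if not tok.isdigit():
                raise ParseError(f"expected number, got {tok!r}")
            flat.append(Num(int(tok)))
        elif tok in PRECEDENCE:
            flat.append(tok)
        else:
            raise ParseError(f"expected operator, got {tok!r}")
    return flat


def build_ast(flat: List[Union[Num, str]]) -> Node:
    """Compile step: apply operator precedence while building the tree."""
    operands: List[Node] = []
    operators: List[str] = []

    def fold_top() -> None:
        right, left = operands.pop(), operands.pop()
        operands.append(BinOp(operators.pop(), left, right))

    for item in flat:
        if isinstance(item, Num):
            operands.append(item)
        else:
            while operators and PRECEDENCE[operators[-1]] >= PRECEDENCE[item]:
                fold_top()
            operators.append(item)
    while operators:
        fold_top()
    return operands[0]


def apply_op(op: str, a: int, b: int) -> int:
    return a + b if op == "+" else a * b


def fold_constants(node: Node) -> Node:
    """An add-on pass: collapse subtrees whose operands are both constants."""
    if isinstance(node, BinOp):
        left, right = fold_constants(node.left), fold_constants(node.right)
        if isinstance(left, Num) and isinstance(right, Num):
            return Num(apply_op(node.op, left.value, right.value))
        return BinOp(node.op, left, right)
    return node


def interpret(node: Node) -> int:
    """Backend A: walk the tree directly."""
    if isinstance(node, Num):
        return node.value
    return apply_op(node.op, interpret(node.left), interpret(node.right))


def to_stack_code(node: Node) -> List[str]:
    """Backend B: emit instructions for a hypothetical stack machine."""
    if isinstance(node, Num):
        return [f"PUSH {node.value}"]
    op = "ADD" if node.op == "+" else "MUL"
    return to_stack_code(node.left) + to_stack_code(node.right) + [op]


def compile_source(source: str, passes=(), backend=interpret):
    """Parse, build the AST, run any registered passes, then hand off."""
    tree = build_ast(parse(source))
    for run_pass in passes:
        tree = run_pass(tree)
    return backend(tree)
```

Here `compile_source("1 + 2 * 3")` evaluates the tree, while passing `backend=to_stack_code` emits instructions from the same tree, and `passes=[fold_constants]` adds a step without touching `parse`. A `ParseError` from input such as `"1 +"` can be caught by the caller because nothing has been emitted yet.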

> And it's a bit late to take design decisions on existing PHP language,
> it seems to me.
>

It will never be easier to do than today. As time goes on, the language
will continue to grow, and the syntax and grammar will only get more
complicated from here out. So the easiest time to do it will be now...

Anthony
