Re: [PHP-DEV][RFC][VOTING] Context Sensitive Lexer

Nikita Popov Wed, 04 Mar 2015 00:53:22 -0800

On Sun, Mar 1, 2015 at 2:11 AM, Marcio Almada <marcio.w...@gmail.com> wrote:


> Hi,
>
> Since no more issues appeared on discussion, the voting for the "Context
> Sensitive Lexer" is now open. The voting will close in exactly 14 days
> counting from now:
>
> RFC: https://wiki.php.net/rfc/context_sensitive_lexer#votes
>
> Since so few people participated on discussions, if you decide to vote "no"
> please share your motives on the mailing lists by replying to this thread.
> This will collaborate to the RFC process and can lead to improvements on
> possible follow up RFCs.
>
> Acknowledgments to Bob Weinand for all the advice and cooperation.
>
> Thanks,
> Márcio.


After reviewing the implementation, I've decided to vote "no" on this RFC.

I had originally assumed that if this proposal is limited to method names
and class constants only the implementation should be pretty simple and
robust. However it turned out that there are a number of complications, the
two primary ones being:

a) We allow to define multiple class constants in one declaration. I.e. you
can write not only

    const FOO = 42;

but also

    const FOO = 42, BAR = 24, BAZ = 0;

This is further complicated by the fact that we allow many types of (static
scalar) expressions to be used as values for constants, so a declaration
might actually look like

    const FOO = 42, BAR = [24, BAZ];

as well.

b) The trait adaptations syntax allows using method names without a generic
prefix like "::" or "->" in a number of ways. E.g. in

    use A { A::foo as bar; }

foo and bar are method names. On the other hand in

    use A { A::foo as public; }

the "public" is not a method name, it is used for trait method visibility
aliasing. Trait aliasing (but not precedence!) adaptations also allow
non-absolute method references on the left hand side, so

    use A { foo as bar; }

without the leading A:: is possible as well.

In order to support these kinds of things, the lexer has to track a bunch
of state (e.g. to know if we're currently in a trait adaptation list as
opposed to a namespace or closure "use" list) and also make extensive use
of lexer lookahead. We also have to accurately track nesting of braces,
which is easy to get wrong when considering details like string
interpolation syntax.

After a number of issues was fixed by Marcio, the patch still contains a
few lexer rules that I don't really understand, but which seem to be
required to avoid regressions in some edge case.

So, tl;dr: I think the patch is too risky. Even if we can make sure that
we've covered all the current edge-cases and don't regress anything, I'm
afraid that this will cause complications with future changes. This ends up
replicating too many parsing aspects in the lexer in a rather ad-hoc manner.

Nikita

Re: [PHP-DEV][RFC][VOTING] Context Sensitive Lexer

Reply via email to