Adam Kennedy wrote:

Frankly, as the only person who has managed to get together a "guessing lexer" that is sufficiently accurate to be something other than useless,

Hmmmmm. I must confess that I don't consider Text::Balanced's C<extract_codeblock> subroutine to be entirely useless. And presumably neither do the thousands of people who use it every day to parse all kinds of Perl code inside their Parse::RecDescent grammars, and in many other embedded code applications as well.
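
For a sense of what that kind of heuristic extraction looks like, here is a minimal sketch using C<extract_codeblock> from Text::Balanced. The input string is made up for illustration. In list context the subroutine returns the extracted block, then the remaining text, then any skipped prefix.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_codeblock);

# Made-up input: a block whose string literal contains a closing brace,
# so naive brace counting would stop too early.
my $text = q[{ print "a } inside a string"; } and the rest];

# List context: (extracted block, remaining text, skipped prefix).
my ($block, $remainder) = extract_codeblock($text, '{}');

print "block:     $block\n";
print "remainder: $remainder\n";
```

The extraction works by recognising quote-like constructs and nested brackets heuristically, not by running the real Perl parser, which is why it counts as "guessing".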


So maybe you're not the *only* person to have built a useful "guessing lexer"? Though I'm happy to concede that yours might well be the best.


I see nothing in Perl 6 that makes it any easier than Perl 5.

What? Not even the runtime-accessible Perl parsing grammar embedded directly in the Perl 6 language itself?


When the language can accurately parse itself, a "guessing lexer" would seem no longer to be necessary. And even if it is still desirable, it will be vastly easier to construct in Perl 6, by deriving a simplified parser from the standard Perl grammar.


As a side note, I may have used "parse" erroneously. What PPI attempts to do is to be a tokenizer and a lexer, without understanding its function as code.

So perhaps a "syntax lexer" is a closer term. The ability to read in and work with code based purely on syntax, without needing to know what it means.

...is simply not possible with Perl. Any more than it's possible to create a tokenizer and lexer for English text, without understanding that text's function as language.

Perl is sufficiently complex that you just cannot lex it without the contextual information provided by parsing it (and sometimes evaluating it). We're not going to "fix" that...because it's not broken. It's an enormously powerful *feature*.
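
A small sketch may make that concrete. The same line of source text tokenizes differently depending on how a sub was declared earlier, which a purely static lexer cannot know. The sub name C<whatever>, the package names, and the return value 100 are all hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# One line of source text. How it tokenizes depends on how the
# (hypothetical) sub "whatever" was declared before the line is compiled.
my $line = q{whatever / 25 ; # / ; die "regex branch\n";};

$_ = '';    # give the pattern match in the second case something to match against

# Empty prototype: "whatever" takes no arguments, so "/" is division
# and everything after "#" is a comment.
my $nullary       = eval "package Nullary; sub whatever() { 100 } $line";
my $nullary_error = $@;

# No prototype: "whatever" is a list operator, so "/ 25 ; # /" is a
# pattern match passed as its argument, and the die() after it is live code.
my $listop       = eval "package ListOp; sub whatever { 100 } $line";
my $listop_error = $@;

print "nullary: ", (defined $nullary ? $nullary : "died: $nullary_error"), "\n";
print "listop:  ", (defined $listop  ? $listop  : "died: $listop_error");
```

In the first case the line is a division followed by a comment; in the second it is a function call with a regex argument, followed by a C<die>. Nothing in the line itself distinguishes the two readings.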

Unfortunately it's a feature that, by its very nature, precludes the feature that you want (i.e. simple static lexing). I'm sorry about that.

But not *very* sorry, because whenever you optimize for static lexability you end up with a language with all the nuanced expressive power and syntactic flexibility of, say, Pascal or Java. :-(

Damian
