On Wed, Oct 24, 2001 at 09:06:14AM -0400, Aaron Sherman wrote:
> On Tue, Oct 23, 2001 at 02:53:19PM +0200, Nadim Khemir wrote:
> > > > Don't we already have that in Perl 5?
> > >
> > > if    ( /\G\s+/gc )      { }   # whitespace
> > > elsif ( /\G[*\/+-]/gc )  { }   # operator
> > > elsif ( /\G\d+/gc )      { }   # term
> > > elsif ( /\G.+/gc )       { }   # unrecognized token
> > >
> > > Tad McClellan
> >
> > The answer is NO; regexes and a lexer are totally different. I would
> > recommend Tad study a bit more what parsing is before thinking it's just
> > about writing regexes. Having a lexer would allow Perl to do some kinds
> > of text processing (raw lexing and parsing) at a much faster speed. If
> > it is of any interest, I could benchmark a simple example.
>
> So, aren't you saying, "yes, but it would be slow"? I can't think of
> anything a lexer is capable of that I can't (and probably haven't) done
> in Perl with relative ease.
>
> Now, if you want a PARSER, that's a different matter, but a simple
> lexical scanner is trivial to write in Perl with logic and regular
> expressions.
>
> In terms of speed, this is particularly ideal because you can identify
> what parts of your Perl code slow the lexer down, and re-code those
> using C/XS. The best of all 2,384 worlds... that's Perl!
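For anyone following along, here is a minimal, runnable sketch of the cascaded \G/gc style Tad describes. The input string and token names (TERM, OP, UNK) are invented for illustration; the point is that a failed match with /c leaves pos() alone, so each elsif retries from the same spot:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $input = "12 + 34*5 - 6";    # made-up input for the sketch
my @tokens;

for ($input) {                  # alias $_ to the string so pos() persists
    while ( (pos() // 0) < length ) {
        if    (/\G\s+/gc)       { next }                        # skip whitespace
        elsif (/\G([*\/+-])/gc) { push @tokens, [ OP   => $1 ] }
        elsif (/\G(\d+)/gc)     { push @tokens, [ TERM => $1 ] }
        elsif (/\G(.)/gcs)      { push @tokens, [ UNK  => $1 ] } # always advances
    }
}
```

The final catch-all branch consumes one character at a time, so the loop can never stall on unrecognized input.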
I have always found that the Perl output from byacc (with a few tweaks) generates a sufficient parser. The addition of a switch statement will hopefully make it more efficient.

For a lexer I try to use a single regex with /g, but that does require the text being parsed to all be in a single scalar, although that could be worked around if needed.

For an example, take a look at Convert::ASN1.

Graham.
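The single-regex-with-/g approach might look something like the following sketch (the alternation, input, and token names are my own invention, not taken from Convert::ASN1): one big alternation, iterated in a scalar-context while loop, with capture groups telling you which branch fired.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $input = "12 + 34*5";        # made-up input for the sketch
my @tokens;

# One anchored alternation, walked across the scalar by /g.
while ( $input =~ / \G (?: (\s+) | ([*\/+-]) | (\d+) | (.) ) /gcxs ) {
    next if defined $1;         # whitespace: consume and move on
    push @tokens,
          defined $2 ? [ OP  => $2 ]
        : defined $3 ? [ NUM => $3 ]
        :              [ UNK => $4 ];
}
```

Because the last alternative matches any single character, every position in the scalar is consumed by exactly one branch, and \G guarantees the matches are contiguous.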