On Wed, Oct 24, 2001 at 09:06:14AM -0400, Aaron Sherman wrote:
> On Tue, Oct 23, 2001 at 02:53:19PM +0200, Nadim Khemir wrote:
>
> > > Don't we already have that in Perl 5?
> > >
> > > if ( /\G\s+/gc ) { # whitespaces }
> > > elsif ( /\G[*\/+-]/gc ) { # operator }
> > > elsif ( /\G\d+/gc ) { # term }
> > > elsif ( /\G.+/gc ) { # unrecognized token }
> > >
> > > Tad McClellan
> >
> > The answer is NO, regexes and a lexer are totally different things. I would
> > recommend that Tad study a bit more of what parsing is before assuming it's
> > just about writing regexes. Having a real lexer would allow Perl to do some
> > kinds of text processing (raw lexing and parsing) much faster. If it is of
> > any interest, I could benchmark a simple example.
>
> So, aren't you saying, "yes, but it would be slow"? I can't think of
> anything a lexer is capable of that I can't (and probably haven't) done
> in Perl with relative ease.
>
> Now, if you want a PARSER, that's a different matter, but a simple
> lexical scanner is trivial to write in Perl with logic and regular
> expressions.
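> A sketch of the kind of scanner being described, built only from \G-anchored
> matches with /gc and a dispatch chain (the token names and grammar here are
> illustrative, not from any particular module):
>
>     #!/usr/bin/perl
>     use strict;
>     use warnings;
>
>     # Repeatedly match at pos() with \G; /c keeps pos() in place when an
>     # alternative fails, so the next elsif can try from the same spot.
>     sub lex {
>         my ($text) = @_;
>         my @tokens;
>         pos($text) = 0;
>         until (pos($text) >= length $text) {
>             if    ($text =~ /\G\s+/gc)       { next }                        # skip whitespace
>             elsif ($text =~ /\G(\d+)/gc)     { push @tokens, ['NUM', $1] }   # number
>             elsif ($text =~ /\G([*\/+-])/gc) { push @tokens, ['OP',  $1] }   # operator
>             else  { $text =~ /\G(.)/gc; push @tokens, ['UNKNOWN', $1] }      # anything else
>         }
>         return @tokens;
>     }
>
>     my @t = lex("3 + 42*7");   # NUM 3, OP +, NUM 42, OP *, NUM 7
>
> Adding a token type is just another elsif, which is the "logic and regular
> expressions" point above.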
>
> In terms of speed, this is particularly ideal because you can identify
> what parts of your Perl code slow the lexer down, and re-code those
> using C/XS. The best of all 2,384 worlds... that's Perl!
I have always found that the Perl output from byacc (with a few tweaks)
generates a sufficient parser. The addition of a switch statement
will hopefully make it more efficient.
For a lexer I try to use a single regex with /g, but that does require the
text being parsed to be all in a single scalar, although that could be
worked around if needed.
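A sketch of that single-regex style, assuming the whole input is in one
scalar (the token names are illustrative, not from Convert::ASN1):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # One alternation, applied repeatedly with /g in scalar context: each
    # pass advances pos(), and exactly one capture group is defined,
    # telling us which token type matched.
    sub tokenize {
        my ($text) = @_;
        my @tokens;
        while ($text =~ /\G(?:\s+|(\d+)|([*\/+-])|(.))/g) {
            if    (defined $1) { push @tokens, ['NUM',     $1] }
            elsif (defined $2) { push @tokens, ['OP',      $2] }
            elsif (defined $3) { push @tokens, ['UNKNOWN', $3] }
            # whitespace set no group, so it is silently skipped
        }
        return @tokens;
    }

    my @t = tokenize("10 - 2");   # NUM 10, OP -, NUM 2

Because the trailing (.) alternative always matches something, the regex
only fails at end of string, which cleanly terminates the loop.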
For an example, take a look at Convert::ASN1.
Graham.