On Jun 27, Luke Palmer said:

>Jeff 'japhy' Pinyan writes:
>> I am currently completing work on an extensible regex-specific parsing
>> module, Regexp::Parser.  It should appear on CPAN by early July
>> (hopefully under my *new* CPAN ID "JAPHY").
>>
>> Once it is completed, I will be starting work on writing a subclass
>> that matches Perl 6 regexes, Regexp::Perl6 (or Perl6::Regexp, or
>> Perl6::Regexp::Parser).
>
>Or Regexp::Parser::Perl6 :-)

I wasn't sure where in the hierarchy of modules it should go.

>The grammar for Perl 6 is going to be specified with Perl 6 patterns.
>That presents us a little bootstrapping problem.  So the original goal
>of Damian's Perl6::Rules was to transform this grammar back into Perl 5
>patterns so they can parse the simplified Perl 6 code for Perl 6 and
>compile a bootstrap.
>
>My personal, nondivine plan would be to use your module to create a
>driver-based parser.  That could then be used for the bootstrap instead.

If you mean what I think you mean by driver-based, then my module is a
perfect fit.  To subclass it, you do this:

  package Regexp::NoCode;
  use base 'Regexp::Parser';

  sub init {
    my $self = shift;
    $self->SUPER::init();          # install the standard handlers first

    # then remove the handlers for the code-executing constructs
    $self->del_handler('(?{');
    $self->del_handler('(??{');
    $self->del_handler('(?p{');    # older spelling of (??{
  }

  1;

Now you have a parser that refuses to acknowledge (?{ ... }) and
(??{ ... }) assertions (resulting in an error).  Another example would be
to support the '&' metacharacter, which is the "AND" equivalent of '|':

  package Regexp::AndBranch;
  use base 'Regexp::Parser';

  sub init {
    my $self = shift;
    $self->SUPER::init();

    # when the parser sees '&', build an 'and' node
    $self->add_handler('&' => sub {
      my ($S) = @_;
      return $S->object('and');
    });
  }

Then you create Regexp::AndBranch::and like so:

  package Regexp::AndBranch::and;
  our @ISA = ('Regexp::Parser::or');  # it behaves like an 'or' branch...

  sub new {
    my ($class, $rx, $lhs, $rhs) = @_;
    my $self = bless {
      rx    => $rx,
      flags => $rx->{flags}[-1],
      class => 'branch',
      type  => 'and',
      data  => [$lhs, $rhs],
      raw   => '&',
    }, $class;
    return $self;
  }

It'll inherit the other methods it needs from the 'or' class.  Then, when
you want to use it, you convert it to an existing construct
(specifically, /x&y&z/ would become /(?=x)(?=y)z/, like in vim).

>A driver-based parser has a couple of advantages over regexes and even
>Parse::RecDescent.  First, the parsing algorithm can be easily
>customized, so we can play with hybrid models and see how the time
>complexity works out.
>Also, you can suspend the parsing in the middle
>of execution, go somewhere else, and continue, which the Perl 6 parser
>might just want to do (something like simulated coroutines).

Yeah, you can parse node-by-node:

  my $p = Regexp::Parser->new($regex);
  while (my $n = $p->next) {
    # ...
  }

When I finish writing it to work with the current set, I'll post it to
CPAN and alert the group.

-- 
Jeff "japhy" Pinyan         [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
CPAN ID: PINYAN    [Need a programmer?  If you like my work, let me know.]
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.