Damian wrote:
>
> I once wrote a C++-based regex engine (much simpler than Perl's!)
> just like this.
>
> Knowing why a regex failed *is* invaluable when matching regexes
> against file streams, but there are more possibilities than you
> mentioned:
>
> "Failed" Did not match because of illegal transition
>
> "Short" Did not match: did not reach acceptor state
>
> "Exact" Matched and finished in an acceptor state
>
> "Long" Passed through an acceptor state, continued to
> match, but did not finish in an acceptor state
>
> "LongFailed" Passed through an acceptor state, continued to
> match, but then found an illegal transition
>
> Ultimately, I decided that what was needed wasn't insight into the cause
> of failure, but rather the chance to provide more data to "feed" the
> engine so it doesn't have to fail "Short" or "Long". That's why I
> proposed RFC 93 (http://dev.perl.org/rfc/93.html) instead of a mechanism
> such as you have suggested.
>
> Damian
>
Good points, Damian.

I read your RFC 93. It mentions using a sub to read from the string, but I
think it uses the sub in two conflicting ways: once to request more data
from the stream, and once to signal that there was a match. I also think
that requiring it to return _exactly_ the number of characters requested
goes against the convention of most Unix syscalls (such as read), which
are asked to read _at most_ that number of characters.
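
To make the "at most" point concrete, here is a minimal sketch (in Python, purely for illustration; TrickleStream and read_exactly are made-up names, not part of RFC 93 or of any Perl API). It assumes a source that may return fewer characters than requested, and shows that "exactly n" can be layered on top of "at most n" by the caller:

```python
class TrickleStream:
    """Hypothetical stream that never returns more than max_chunk characters
    per call, the way a pipe or socket may return less than was asked for."""

    def __init__(self, data, max_chunk=3):
        self._data = data
        self._pos = 0
        self._max_chunk = max_chunk

    def read(self, n):
        """Return at most n characters; an empty string means end of data."""
        n = min(n, self._max_chunk)
        chunk = self._data[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk


def read_exactly(stream, n):
    """Build 'exactly n' on top of 'at most n' by looping.

    The result is shorter than n only when the stream runs out of data.
    """
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # end of data: give back what we have
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return "".join(chunks)
```

The reverse layering is the awkward one: a callback that must return exactly n characters has to block or buffer on its own, even when the engine could already make progress with fewer.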

What I think is that this could be handled by an OO module. Suppose there
were a way to hook into the regexp engine's guts and get responses like the
ones you listed above. One could then write an OO module with methods for
reading more data, checking for end of data, and acknowledging a failed or
successful match. The module could overload the =~ operator, making the
regexp engine call the module's methods instead of its own.
Then what you proposed in RFC 93 as

    sub { ... } =~ m/.../;

could be handled by

    my $mymatch = MyClassForMatchingFromFileHandles->new($myhandle);
    $mymatch =~ m/.../;

What I mean is that, by exposing the guts of the regexp engine, we could
implement everything RFC 93 asks for with a cleaner interface, and even do
more, because we could hook into every call to the regexp engine!
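
A rough sketch of the kind of hook object I have in mind, again in Python just for illustration. Everything here is an assumption: the method names (read_more, at_end, acknowledge), the ChunkSource class, and the toy hand-built DFA for the pattern ab(cb)* stand in for a real engine. The engine pulls data through the object and reports one of the five outcomes from the list quoted above:

```python
from enum import Enum


class Outcome(Enum):
    FAILED = "Failed"            # illegal transition, no acceptor seen
    SHORT = "Short"              # data ended before any acceptor state
    EXACT = "Exact"              # data ended in an acceptor state
    LONG = "Long"                # passed an acceptor, ended outside one
    LONG_FAILED = "LongFailed"   # passed an acceptor, then illegal transition


class ChunkSource:
    """Hypothetical hook object; the engine calls these methods."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self.result = None

    def read_more(self):
        return self._chunks.pop(0) if self._chunks else ""

    def at_end(self):
        return not self._chunks

    def acknowledge(self, outcome, consumed):
        self.result = (outcome, consumed)


# Toy DFA for ab(cb)*: 0 -a-> 1 -b-> 2 -c-> 3 -b-> 2, state 2 accepts.
TRANSITIONS = {(0, "a"): 1, (1, "b"): 2, (2, "c"): 3, (3, "b"): 2}
ACCEPTING = {2}


def run(source):
    """Drive the toy DFA from the source and report the outcome to it."""
    state, consumed, seen_accept = 0, 0, False
    while not source.at_end():
        for ch in source.read_more():
            nxt = TRANSITIONS.get((state, ch))
            if nxt is None:  # illegal transition
                outcome = Outcome.LONG_FAILED if seen_accept else Outcome.FAILED
                source.acknowledge(outcome, consumed)
                return outcome
            state = nxt
            consumed += 1
            seen_accept = seen_accept or state in ACCEPTING
    if state in ACCEPTING:
        outcome = Outcome.EXACT
    elif seen_accept:
        outcome = Outcome.LONG
    else:
        outcome = Outcome.SHORT
    source.acknowledge(outcome, consumed)
    return outcome


print(run(ChunkSource(["ab", "c"])).value)  # prints "Long"
```

Note that "Short" and "Long" are exactly the cases where at_end could answer "not yet" and supply more data, which is how the same object would cover what RFC 93 is after.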

BTW, if you have a C++-based regexp engine with a clean design, couldn't we
use it as the basis for a new regexp engine that supports Perl's current (or
new) regexp syntax and features and has its guts exposed?

Branden.