On Wed, 22 Nov 2000, Dan Sugalski wrote:

> Probably the easiest thing is to implement some sort of file-tied scalar or
> something that can provide bytes to the regex engine until it stops asking
> for them. Some magic or other, though, will get us what we need.

That might be the easiest thing for us - as internals programmers - but
does it answer the general need?  Everyone writing regex-based parsers
faces this problem.  Maybe this is something to toss to perl6-language and
get some RFC'd Larry-fried syntax?

Also, a nagging question - how does a regex-based parser work without
ending up reading the entire file into memory most of the time?  Even with
an intelligent tied-scalar reading bytes there's going to be failing cases
where the regex has to walk to the end of the "string" to find out it
failed.  Presumably it would also need to seek back to the start, which
means we'd have to buffer as we go.
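
To make the failing case concrete, here is a rough sketch of the
buffer-as-you-go idea. It is in Python rather than Perl, purely for
illustration, and it is not anything proposed in this thread: the
function name search_stream, the chunk_size parameter and the "stop once
the match ends before the end of the buffer" rule are all assumptions of
mine. That stopping rule is only a heuristic and is wrong for some
patterns.

```python
import io
import re


def search_stream(pattern, stream, chunk_size=4096):
    """Search a byte stream, pulling in chunks only while we still need them.

    Returns (match_or_None, bytes_buffered). Hypothetical helper, not a
    real library API.
    """
    regex = re.compile(pattern)
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream: whatever we have buffered is everything
        buf += chunk  # we must keep old bytes so the engine can backtrack
        m = regex.search(buf)
        # Heuristic: accept a match that ends before the end of the buffer.
        # More input could still change the answer for some patterns
        # (alternations, lookahead), so this is not correct in general.
        if m and m.end() < len(buf):
            return m, len(buf)
    # Either the match touches the end of the data or there is no match.
    return regex.search(buf), len(buf)


if __name__ == "__main__":
    hit = io.BytesIO(b"x" * 10000 + b"needle" + b"y" * 10000)
    miss = io.BytesIO(b"x" * 10000)
    print(search_stream(rb"needle", hit)[1])   # bytes buffered on success
    print(search_stream(rb"needle", miss)[1])  # bytes buffered on failure
```

On a successful match the sketch stops reading early, but on a failed
match it ends up holding the whole input in buf, which is exactly the
worry above.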

Perhaps we really need a new kind of regex that works by-design against
streams of bytes?

-sam

