At 12:29 AM 11/25/00 -0500, Sam Tregar wrote:
>On Fri, 24 Nov 2000, Nicholas Clark wrote:
>
> > I think Dan was suggesting that the (user side) regex doesn't change at all
> > (so that's no new syntax there)
> > It's just that the innards of perl gains a tied scalar that doesn't actually
> > read in and buffer the file immediately, but defers it as long as it can get
> > away with. And that the regex engine knows about these lazy scalars and
> > provokes the read-more when needed.
>
>Right.  And I was suggesting that while this might solve our problem it
>wouldn't do much for all the other people that have to solve the same
>problem.  I'd like to see a general solution accessible from Perl.  If
>that solution is some tied-scalar magic, fine.  If it's more involved than
>that (and I think it will be) then we'll need to think about the syntax a
>bit.

If it solves our problem, that's just fine. And since the parser will be 
mostly in perl, it means that this particular solution will be available to 
everyone.

    my file_tie $file_data : name "source.pl";

or something like that.
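
To make the idea concrete, here is a rough sketch in today's Perl 5 of a tied scalar that does no I/O until something actually reads it. The class name LazyFile and the helper count_subs are made up for illustration. Note the limit of the sketch: the first FETCH slurps the whole file, because the Perl 5 regex engine has no way to ask a scalar for "more"; the incremental read-more discussed above would need support inside the engine.

```perl
use strict;
use warnings;

# Hypothetical class for illustration only; not part of any perl distribution.
package LazyFile;

sub TIESCALAR {
    my ($class, $path) = @_;
    # Remember the path only. No file I/O happens at tie time.
    return bless { path => $path, data => undef }, $class;
}

sub FETCH {
    my ($self) = @_;
    unless (defined $self->{data}) {
        # First real access: open and read the file now.
        open my $fh, '<', $self->{path}
            or die "Cannot open $self->{path}: $!";
        local $/;                 # slurp mode: read the whole file at once
        $self->{data} = <$fh>;
        close $fh;
    }
    return $self->{data};
}

sub STORE {
    my ($self, $value) = @_;
    # Assigning to the scalar replaces the buffered contents.
    $self->{data} = $value;
}

package main;

# Hypothetical helper: count lines that look like sub declarations.
sub count_subs {
    my ($path) = @_;
    tie my $source, 'LazyFile', $path;     # nothing is read yet
    my $text  = $source;                   # FETCH fires here and reads the file
    my $count = () = $text =~ /^\s*sub\s+\w+/mg;
    return $count;
}
```

The user-side regex is unchanged; only the scalar behind it is special, which is the property being argued for above.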

> > I don't think that this differs from the current parser. If it encounters
> > open " but never a close ", it will read and buffer to the end of file
> > before realising that there's a problem. (because strictly there isn't
> > a problem until EOF is encountered before the closing ")
> >
> > I'm not certain there's anything that can actually be done to avert the need
> > to buffer a lot of script in these situations. You mustn't attempt to seek
> > the script file handle as it might be from something unseekable such as a
> > pipe (or socket. BEGIN {socket STDIN...})
>
>I suppose that's true.  I was imagining something less extreme than the
>absolute failure case of missing a closing ".  I'm imagining a failure
>that is recoverable but still requires running the regex to the end of the
>"string" to find that out.  Are there any like this?  Perhaps not.

I don't think there are, really. You might be able to recover from the loss 
of a closing parenthesis if you can assume that the statement close would 
close up a paren. I'm thinking something like:

    $foo = bar(12;

where perl could guess that the missing paren's just before the semi-colon. 
Anything past that would probably be really dodgy--if there were multiple 
parens there, where would you put the missing one?
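
A minimal sketch of that guess, assuming a hypothetical helper called guess_missing_paren that sees one statement at a time. It is deliberately naive: it counts raw parenthesis characters and ignores strings, comments and regex literals, and it declines to guess whenever the statement has anything other than a single unclosed paren.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; not how any real perl parser recovers.
# Returns the repaired statement, or undef when it declines to guess.
sub guess_missing_paren {
    my ($stmt) = @_;

    # Count raw paren characters (naive: quoted parens are counted too).
    my $opens  = () = $stmt =~ /\(/g;
    my $closes = () = $stmt =~ /\)/g;

    # Only the simplest case is safe: exactly one '(' and no ')'.
    # With several parens there is no telling where the missing one goes.
    return unless $opens == 1 && $closes == 0;

    # Assume the statement terminator would have closed the paren.
    (my $fixed = $stmt) =~ s/\s*;\s*\z/);/ or return;
    return $fixed;
}
```

With this, guess_missing_paren('$foo = bar(12;') gives back '$foo = bar(12);', while something like '$foo = bar(baz(12, 3);' is refused, since the missing paren could go after the 12 or after the 3.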

>Perhaps this just isn't a reasonable criticism of regex parsers since
>normal parsers do it all the time anyway!

It's certainly a reasonable criticism of parsers in general, and a good one 
to keep in mind with regex based parsers. It's easier to overdo it with a 
bad regex than it is to crock your character-by-character hardcoded state 
machine.

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
