Re: The external interface for the parser piece

Dan Sugalski Tue, 28 Nov 2000 10:35:22 -0800
At 09:48 AM 11/28/00 -0800, Steve Fink wrote:
>Dan Sugalski wrote:
> >
> >    int perl6_parse(PerlInterp *interp,
> >                    void *source,
> >                    int flags,
> >                    void *extra_pointer);
>
>Given that other things may want to be streamable in similar fashion (eg
>the regular expression engine), why not have a PerlDataSource union or
>somesuch that encapsulates all of the possibilities of the final three
>arguments? Or put all possibilities into a PerlIO*? That gives direct
>support for compressed source, source streamed over a network socket,
>etc., with a more common framework than PERL_GENERATED_SOURCE.

Embedding is the big reason. This interface should be simple for embedding
programs, most of which will pass in either a C filehandle or a plain char*
with source in it. That's why no fancy structures go in. (Well, besides the
PerlInterp structure, but that's pretty much a magic cookie as far as
programs are concerned.)
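
To make the embedding argument concrete, here is a rough sketch of how the two common cases might be dispatched behind the one entry point. Only the perl6_parse signature and the name PERL_CHAR_SOURCE come from this thread. PERL_FILE_SOURCE, the flag values, the PerlInterp field, and the stub body (which only counts bytes instead of parsing) are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag values; only the name PERL_CHAR_SOURCE is from the thread. */
#define PERL_CHAR_SOURCE 0x01  /* source is a nul-terminated char* */
#define PERL_FILE_SOURCE 0x02  /* assumed name: source is a C FILE* */

/* Opaque to embedders in the proposed design; a stand-in here. */
typedef struct PerlInterp {
    size_t bytes_seen;  /* hypothetical field: amount of source consumed */
} PerlInterp;

/* Stub with the proposed signature. It does not parse anything; it only
   consumes the source so the dispatch on flags can be exercised.
   Returns 0 on success, -1 on a missing source or unknown source kind. */
int perl6_parse(PerlInterp *interp, void *source, int flags, void *extra_pointer)
{
    (void)extra_pointer;     /* unused for these two source kinds */
    assert(interp != NULL);  /* the cookie must come from the caller */
    if (source == NULL)
        return -1;

    if (flags & PERL_CHAR_SOURCE) {
        interp->bytes_seen += strlen((const char *)source);
        return 0;
    }
    if (flags & PERL_FILE_SOURCE) {
        char buf[256];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, (FILE *)source)) > 0)
            interp->bytes_seen += n;
        return 0;
    }
    return -1;
}
```

Under these assumptions an embedding program needs nothing beyond a call such as perl6_parse(&interp, src, PERL_CHAR_SOURCE, NULL), with no extra structure to build first.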

>Things like PERL_CHAR_SOURCE meaning nul-terminated char* sound
>unnecessarily specific.

Well, it is the most common type of string that perl's going to see, which 
is why it's in there. UTF-8's the next most likely one, hence the flag.

>Also, you gave two options: nul-terminated and length-first. What about
>a "chunked" encoding, where you get multiple length-first chunks of the
>input (as in HTTP/1.1's Transfer-Encoding: chunked, for one example of
>many)? Or are nuls explicitly forbidden in source code?

Nulls aren't explicitly forbidden, but they're really inconvenient in C-style
strings, hence the length option. (Plus we might be able to do Clever
Things if we know the length.) I'm not sure how UTF-8 jammed into C strings
works either, since IIRC there can be null bytes in a UTF-8 data stream.

Nulls are OK in the source on disk, though they're still annoying inside a 
C program. (Like, say, perl... :)
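
The inconvenience is easy to show with a source buffer that contains a NUL: strlen() stops at it, while an explicit length covers the whole buffer. A small sketch follows. The PerlSourceBuf struct and the helper names are hypothetical and not part of the proposed interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-first buffer: the length travels with the bytes. */
typedef struct {
    size_t      len;
    const char *bytes;
} PerlSourceBuf;

/* Length as a C-string consumer sees it: stops at the first NUL. */
size_t visible_as_c_string(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* Length as a length-first consumer sees it: embedded NULs included. */
size_t visible_with_length(const PerlSourceBuf *buf)
{
    assert(buf != NULL);
    return buf->len;
}

/* Counts embedded NUL bytes, which only the length-first view can reach. */
size_t count_embedded_nuls(const PerlSourceBuf *buf)
{
    size_t i, n = 0;
    assert(buf != NULL);
    for (i = 0; i < buf->len; i++)
        if (buf->bytes[i] == '\0')
            n++;
    return n;
}
```

With a NUL in the middle of a string literal, the C-string view silently drops everything after it, which is exactly the case the length option is meant to cover.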

>And, in a related question, the above interface appears to assume you call
>perl6_parse once. Will this be good enough, or do you want to have a
>PerlParseState* in/out parameter that allows restarting a parse once you
>get more of the input available? (With this, you don't need an explicit
>chunked encoding, since the caller can deal with that without being
>required to buffer the whole thing in memory before calling
>perl6_parse.) Or would that go into the PerlInterp too?

What I was thinking, but didn't say, is that for the PERL_GENERATED_SOURCE 
case we'd just call the function provided over and over until it returns 
NULL, at which point we assume it's all done. So for the chunked text case, 
each call to the function would return a chunk, and the function would 
return NULL when it's run out of chunks.
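
A minimal sketch of that call-until-NULL loop, assuming a callback that returns one nul-terminated chunk per call. The thread does not give the callback's signature, so perl_source_fn, ChunkList, next_chunk, and drain_generated_source are all hypothetical names. The driver copies chunks into a buffer only so the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed callback shape: returns the next nul-terminated chunk, or NULL
   when the source is exhausted. */
typedef const char *(*perl_source_fn)(void *state);

/* Hypothetical caller-side state: a fixed list of chunks and a cursor. */
typedef struct {
    const char *const *chunks;
    size_t count;
    size_t next;
} ChunkList;

/* Example generator: hands out one chunk per call, then NULL. */
const char *next_chunk(void *state)
{
    ChunkList *list = (ChunkList *)state;
    assert(list != NULL);
    if (list->next >= list->count)
        return NULL;
    return list->chunks[list->next++];
}

/* Stand-in for the parser side of PERL_GENERATED_SOURCE: keep calling fn
   until it returns NULL, appending each chunk to out. Returns the number
   of chunks consumed, or -1 if out is too small. */
int drain_generated_source(perl_source_fn fn, void *state,
                           char *out, size_t out_size)
{
    const char *chunk;
    size_t used = 0;
    int calls = 0;

    assert(fn != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    while ((chunk = fn(state)) != NULL) {
        size_t len = strlen(chunk);
        if (used + len + 1 > out_size)
            return -1;
        memcpy(out + used, chunk, len + 1);  /* copies the terminator too */
        used += len;
        calls++;
    }
    return calls;
}
```

A real parser would presumably consume each chunk as it arrives rather than buffering, which is what lets the caller handle chunked input without holding the whole source in memory.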

>And finally, how do I get the output out of the PerlInterp? Is it stored
>under some variable name, or does the PerlInterp start out empty and
>gain the parsed syntax tree as its only syntax tree, or ? (The latter
>sounds messy if the PerlInterp is also running code, code that wants to
>call some standard utility functions implemented in Perl.) Maybe I'm not
>making sense.

It's stored in the PerlInterp structure. Exactly where, I don't know yet,
but that decision can be put off for later.

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk