Re: To get things started...

Dan Sugalski Tue, 21 Nov 2000 11:47:12 -0800
At 01:04 PM 11/21/00 +0000, David Grove wrote:

>Dan Sugalski <[EMAIL PROTECTED]> wrote:
>
>  > At 07:36 AM 11/21/00 -0500, David Grove wrote:
>  > >However, one thing is seriously lacking in this theory... if the
>  > >parser is perl, how does the perl parse? (Sort of a woodchuck
>  > >chucking wood type of thing.) Somehow, the external parser API
>  > >thingy has to know enough perl (through the chosen language) to be
>  > >able to handle the parsing.
>  >
>  > Nope. We do it in two phases. The end result will not actually parse
>  > perl code to build the parser (we'll provide bytecode for that) but
>  > to start we can run the parser through perl 5 to get a syntax tree
>  > until the perl 6 engine's capable of doing it itself.
>
>Hmmm, that sounds familiar...

Sure. Compilers have been doing it for decades.

>  > >To quote my perl elders, whatever can be done without regexen should
>  > >be done with index() (within limits, since some regexen can be quite
>  > >optimized).
>  >
>  > No, not really. regexes are generally easier to comprehend than their
>  > index counterparts, and often faster. (There's a lot of code that
>  > needs to go into backtracking...) While index might be better
>  > sometimes we can't force folks to use it. Almost all of perl is up
>  > for grabs.
>
>I won't argue the point as long as it works, the point being that we do it
>with whatever method is capable of the greatest efficiency.

As long as everyone understands that efficiency doesn't necessarily mean 
the code that executes the fastest. While I want the parser fast, it is 
generally a one-shot thing, and if it takes an extra millisecond or twelve 
that probably doesn't make much difference. Cutting a day or twelve off of 
the preliminary development time, though, does matter rather a lot more.
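
To make the index-versus-regex tradeoff concrete, here is a minimal sketch. It is written in Python rather than perl purely for illustration, with str.find() standing in for index(); the function names and the "key = value" input format are assumptions made up for this example and are not part of any parser design discussed here.

```python
import re

def split_with_find(line):
    """Split 'key = value' using plain substring search (the index() style)."""
    pos = line.find("=")          # position of the first '=', or -1 if absent
    if pos == -1:
        return None
    return line[:pos].strip(), line[pos + 1:].strip()

# Hypothetical pattern: a word, an '=', then the rest with outer spaces trimmed.
_PAIR = re.compile(r"\s*(\w+)\s*=\s*(.*?)\s*$")

def split_with_regex(line):
    """Split 'key = value' with one regex that states the whole line shape."""
    m = _PAIR.match(line)
    return (m.group(1), m.group(2)) if m else None
```

Both versions agree on simple input such as "name = parser". The regex version states the expected shape of the line in one place, which is the readability argument; which one runs faster depends on the engine and the input, so that part would have to be measured rather than assumed.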

>  > >The parser API needs to know both regexen and index() in order to
>  > >work.
>  >
>  > The parser will have a fully-functional interpreter to work with. All
>  > of perl will likely be there for it. (Modules and threads might not,
>  > but that's still up in the air)
>
>But that "interpreter" will be in the form of API, right?

No. The API is just a set of functions. I mean an interpreter, a real entity 
that can do something. Pretty much the same as an interpreter instance in 
perl 5.

>  > >  > * The parser will have an active interpreter structure handy
>  > >
>  > >Is this the perl that parses the perl?
>  >
>  > Yup. In fact we might have two--the interpreter structure for the
>  > interpreter running the parser, and the structure for the end-result
>  > parsed program. Or we might just use one and squirrel all the
>  > interpreter bits in a private (and deletable) namespace somewhere.
>
>It's pretty clear that we're to purposely put in a distinct separation
>between the two, unless I misunderstood Larry on this.

You probably misunderstood a little. I don't think Larry really cares how 
it works as long as it does. If the parser leaves a lot of cruft in the 
_Parser namespace it likely matters not.

>I'm cautious about
>dual-purposing anything here, since he said that this is a major problem
>in Perl 5 today (the lack of flexibility between either end).
>
>I'd like to ask for a clarification of the following terms as they apply
>here:
>
>1. External API

The functions presented to the world at large, including other parts of 
perl. (The bytecode compiler, the optimizer, and the interpreter, specifically)

>2. Internal API

The functions, hooks, and spots for hooks presented to the code inside the parser.

>3. Parser

The piece of perl that takes a stream of source and emits a syntax tree.

>4. Interpreter

The piece of perl that takes a chunk of bytecode and executes it.

>5. What seems to be my "toplevel" parser (the creole parser)

Got me there.

>6. Bytecode

Perl's machine code. The stuff that gets fed to the interpreter.

>7. Syntax Tree

The parsed, tokenized, and cleaned-up version of the source. See the dragon 
book (or any good compiler book) for more details.
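
As a rough illustration of how the terms above fit together (source goes to the parser, which emits a syntax tree; the tree is compiled to bytecode; the interpreter executes the bytecode), here is a toy sketch. It is in Python, not perl, and handles only integers with + and *. The function names, the tuple-based tree, and the PUSH/ADD/MUL instruction set are all invented for this example and say nothing about the real perl 6 design.

```python
import re

def parse(source):
    """Parser: take source text, emit a syntax tree (nested tuples)."""
    tokens = re.findall(r"\d+|[+*]", source)
    pos = 0

    def term():
        # term := number ('*' number)*
        nonlocal pos
        node = ("num", int(tokens[pos]))
        pos += 1
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("mul", node, ("num", int(tokens[pos])))
            pos += 1
        return node

    def expr():
        # expr := term ('+' term)*
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("add", node, term())
        return node

    return expr()

def compile_tree(tree):
    """Bytecode compiler: flatten the tree into stack-machine instructions."""
    if tree[0] == "num":
        return [("PUSH", tree[1])]
    return compile_tree(tree[1]) + compile_tree(tree[2]) + [(tree[0].upper(), None)]

def run(bytecode):
    """Interpreter: execute the instruction list on a small value stack."""
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
    return stack.pop()
```

With these definitions, run(compile_tree(parse("2 + 3 * 4"))) walks all three stages in order. The point of the split is that each stage only needs to understand the output of the one before it, so the parser can be swapped or bootstrapped without touching the interpreter.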

>And what language they should be in (if Larry's undefined language, just
>say C-Larry or something)

The parser should be mostly perl. The rest will be in a mix of something 
Cish and perl. (The Cish stuff will likely be run through a perl filter to 
produce real C, though it'll hopefully have features in it that'll rein in 
some of C's more error-prone features)


                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk