On Tue, Jul 05, 2005 at 12:45:11PM -0500, Patrick R. Michaud wrote:
: On Tue, Jul 05, 2005 at 08:51:39AM -0700, Larry Wall wrote:
: > 
: > For languages that cannot do one-pass parsing, it would be saner in
: > the long run for rules to delimit such code with delimiters that
: > are unlikely to occur in the target language.  Double curlies, or
: > here docs, or some such.  That's hacky, but counting brackets is
: > also hacky.  
: 
: Any chance we could identify such a set of delimiters and standardize
: them within the rules language, or at least within PGE?  

Sure.  For the moment it's still our language to do however we like.

: In the general case, Parrot's "compile" opcode doesn't (yet?) have
: the interface we'd want in order to "parse a program up to a
: closing delimiter and return the result".  And if a target language
: is more of
: an interpreter and not a compiler, then it might need a flag that
: indicates "just parse this, don't execute it yet".  Maybe we
: need a "parse" opcode, or a standard parse function-call interface
: that language parsers use to interface with PGE.

Some languages could have difficulty with that, I suppose.

: It could also be that one might wish to avoid parsing and/or 
: compiling a code block until it's actually encountered during 
: execution of the rule.

Hmm.  That's probably more in the province of { FooLang.eval("...") },
assuming FooLang doesn't have its own eval.

: Perl 6 rightfully grabs { ... } for its code blocks, and it's reasonable
: to expect the rules engine to have some intimate knowledge of the
: Perl 6 parser (and vice versa).  But from a more general "tool"
: perspective it seems like it'd be nice to have a set of delimiters 
: available for codeblocks that the rules parser could use without 
: having to communicate with another parser to handle it.  Even if 
: the chosen delimiters aren't a 100% solution, if they manage to cover
: a wide swath of the most common language syntaxes I think it'd be
: a win for most language and tool developers.
: 
: Technically, one could conceivably use the string-argument form of
: subrules (mentioned in A05) to achieve this:
: 
:      \d+  <tcl: ...tcl-code-here...>
: 
: but we'd have to have some mechanism to escape any '>' characters
: in the string argument.  And that could get pretty nasty, and we
: haven't really spec'd out the string delimiters here yet (do all
: backslash-escapes get processed?).

Yes, avoiding that kind of ambiguity is precisely why the Perl call
variant requires parens: <foo(...)>.  It would be yucky to reintroduce
it on behalf of other languages.  But, hey... :-)

: In the more general case it seems like it'd be really nice to
: have a generalized delimiter available, such as double-curlies,
: angle+double-curlies, or something that could be extracted without
: resorting to a language-specific parser to find the end of the code
: block:
: 
:     rx :code('tcl')  / \d+ {{ ...codeblock source... }} /
:     rx :code('tcl')  / \d+ <{{ ...codeblock source... }}> /
:     rx / \d+ <tcl{{ ...codeblock source... }}> /
: 
: Some other ideas are at the end of this message.

I think I'd rather see a :lang('tcl') option since :c is taken and
:l isn't.  But mostly people will want to put

    use rule :lang<tcl>;

or some such at the beginning of the file, since all the rule actions
are likely to be in the same language.

: It might also be worth mentioning that the format of the opening
: delimiter isn't terribly important -- it's just a closing delimiter token
: that PGE or a rules engine needs to scan for; and hopefully that closing
: delimiter is something that is unlikely to ever appear in a code block
: for a target language (or can be easily worked around when it does).

But from a human point of view it would be nice if it feels "nesty".

: Anyway, if this is really chasing down a dead-end, just say so and
: we'll go with the other approach.  But I feel like it simplifies things
: a lot (both for PGE and for compiler authors) if we can have a 
: generalized code block delimiter available.

We need to pursue both options.  Languages with left-to-right parsers
aren't going to want to put {{...}} everywhere.

: Pm
: 
: Some other random syntax possibilities--using chars not yet defined
: for angles:
: 
:     <| ... code block ... |>
:     <* ... code block ... *>
:     <` ... code block ... `>
:     <^ ... code block ... ^>
:     <~ ... code block ... ~>
: 
: Thus far I think I like {{...}}, <{{...}}>, or <`...`> the best.

I'd lean more toward the infinitely extensible PODly technique:

    {...}
    {{...}}
    {{{...}}}
    {{{{...}}}}
    ...

In any case, we probably want the outermost delimiters to be curlies
so that we can use these anywhere we allow Perl closures, such as
\d**{{ (range 1 5) }}.  Though perhaps Lisp is a bad example of a
bad language.  :-)
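
To make the N-curly idea concrete, here is a rough sketch in Python
(not PGE code; the name extract_code_block is made up for illustration)
of how a rules engine might pull out such a block by counting the
opening curlies and scanning for an equal run of closers, without ever
consulting the target language's parser:

```python
def extract_code_block(text, start=0):
    """Return (source, end) for a block opened by N curlies at text[start].

    Hypothetical helper: 'end' is the index just past the closing run.
    """
    # Count the run of opening curlies: '{', '{{', '{{{', ...
    n = 0
    while start + n < len(text) and text[start + n] == "{":
        n += 1
    if n == 0:
        raise ValueError("no opening curly at position %d" % start)

    closer = "}" * n
    body_start = start + n

    # The block ends at the first run of n closing curlies. Nothing
    # inside the block is parsed, and no brackets are counted.
    close_at = text.find(closer, body_start)
    if close_at < 0:
        raise ValueError("unterminated block: missing %r" % closer)
    return text[body_start:close_at], close_at + n


if __name__ == "__main__":
    rule = r"\d**{{ (range 1 5) }} rest"
    body, end = extract_code_block(rule, rule.index("{"))
    print(repr(body), repr(rule[end:]))
```

The obvious limitation is that a body containing a run of N closing
curlies (or ending in a curly right up against the closer) ends the
block early; the workaround is to step up to N+1 curlies or to add
whitespace.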

Larry
