On Oct 28, 2008, at 10:06 AM, Patrick R. Michaud wrote:

On Tue, Oct 28, 2008 at 01:50:42AM -0500, Chris Dolan wrote:

My goal is to build arbitrarily complex data structures from closures
fired in my grammar. Specifically, I'm trying to write a PDF parser -- my grammar is parsing correctly now, but I'd rather not have to write the
closures in PIR if I can help it.

Would it make sense to use action methods (the "{*}" tokens) for the
closures instead of embedding them directly into the grammar?

Pm

Taking the {*} approach with the actions written in Perl6 instead of NQP (or PIR) turned out to be significantly harder than I expected, largely because I'm using a straight Perl6 grammar instead of PCT and HLLCompiler. I encountered three problems, all solved temporarily: 1) how to tell Perl6 which action class to use, 2) how to get my $/.result_object back, and 3) how to keep <ws> from blowing away my overridden match object.

=== begin gory details ===

Neither the Grammar.ACCEPTS nor the Code.ACCEPTS method passes an action adverb to the PGE subrule, and there doesn't seem to be a way to set a default action on a grammar. So, I changed my test code from this:

      '1 0 R' ~~ PDF__Grammar::pdf_reference;
      is(~$($/).WHAT, 'PDF__Syntax__Reference');  # fails, got 'Str' because the action is not invoked

to the following, to force the action class:

      my $method = PDF__Grammar::pdf_reference;
      my $m = '1 0 R'.$method(action => PDF__Grammar__Actions.new);
      is(~$m.WHAT, 'PDF__Syntax__Reference');  # succeeds

where my action looks like this:

    class PDF__Grammar__Actions {
       method pdf_reference($m) {
          my $ref = PDF__Syntax__Reference.new(objnum => +$m<objnum>, gennum => +$m<gennum>);
          $m.result_object($ref);
          return;
       }
    }
    grammar PDF__Grammar {
       rule pdf_reference { <objnum> <gennum> R {*} };
    }

I found that a bit confusing because a) since I'm invoking the subrule directly, $/ is not getting set, and b) assigning the result back to $m apparently calls get_scalar() on the Match instance, so it is dereferenced to get my result_object instead. Unexpected, but convenient. I might wrap it in a superfluous $() in case that get_scalar() behavior is ever changed.

In retrospect, if I had been matching against the whole grammar instead of the subrule, my second problem would not have happened because Grammar.ACCEPTS sets $/.

The third problem arose because I had whitespace between the {*} and the terminal } of the rule. That <ws> overrode $/ until I cuddled my rule like "rule pdf_reference {<objnum> <gennum> R {*}};". Is there an easy way to make my whitespace non-capturing? I'm sure I'm just overlooking something...

=== end gory details ===

So, I have two change proposals. I'm not sure if either of them is a good idea... 1) Make grammar rules be type Rule instead of Method, and add a custom ACCEPTS that behaves like Grammar.ACCEPTS. 2) Add a Parrot-specific adverb to the Perl6 "grammar" declaration to allow programmers to specify a default action class for the whole grammar.

And bug reports:
 1) Perl6 mangles Match instances when they are assigned to scalars
 2) The rules incorrectly require a closing ";" to avoid a syntax error

I'm happy to write up a concise ticket for any of those if they aren't insane or already known.


Chris
