Edward Peschko writes:
> Ok, fair enough.. although I'm not sure that I'm all that sure I'm completely
> happy-with/understand the syntax described in that article. It works for the trivial 
> cases, but what about complex grammars? 

It works for anything.  It gets pretty inefficient in the case of code
assertions, but there's no way around that.  Testing your assertions
isn't going to be the useful thing about this, anyway.

I just defined it for Rule::Group, which was a simple concatenation of
elements.  The others follow from that.  At the end of this message I've
rewritten Group and written a few more just to show how it's done.  I've
annotated them, too, so maybe someone will be able to understand it.

> The reason for the modifier (or even a new operator, g// for example) is that 
> you can easily test your regular expressions. The interface is trivial - all you have
> to do is switch your m/ out for g/, and sit back and see how your patterns translate
> into strings.  

Yeah, that looks pretty easy.  Until you see what that looks like in a
program.  What does it mean to change an m// to a g//?

    if $str ~~ m/foo bar/ {...}

Changes to:

    say for g/foo bar/;

Which isn't all that different from:

    say for generate(/foo bar/);

> Eyeballing and fixing the regular expression then becomes trivial (or relatively 
> trivial).

I definitely see the use.

> If you need to match the regex engine in reverse, in a totally unattached way 
> via subroutine, then I would think the chance for subtle mistakes and errors 
> would be exceedingly great.

I don't understand how.

> Or, I could be missing something. How would you generalize 
> 
>       multi generate(Rule::Group $group: Int $n) ..
> 
> to work with 
> 
>       (( <nonquote> | <slashchar> )* ")
> 
> as input?

I'll show you.  Here are some of the generators.  This is very dense,
functional code.  Read at your own risk (but I'm certainly not writing
it to be executed!).

    use Permutations <<compositions outer>>;

    # compositions($length, $n) gives all lists of length $n whose
    # elements sum to $length.

    # outer(@ary1, @ary2, ...) gives the cartesian product of its
    # arguments.  That is, outer([1,2], [3,4]) gives ([1,3], [1,4],
    # [2,3], [2,4]).  Also note that outer([1,2], [], [3,4]) gives
    # simply (), since an empty argument leaves nothing to pick from.

    # Generate all strings of length $length that $group matches.
    multi generate(Rule::Group $group: Int $length) {

        # For each assignment of lengths to each of $group's children
        # such that they sum to $length...
        compositions($length, +$group.children) ==> 
        map -> @comp {
            @comp ¥ $group.children ==> 
            map -> $n, $pat {
                
                # Generate every string of length $n that the subpattern
                # matches
                [ $pat.generate($n) ]
            } ==>

            # Join each combination of pieces into a single string
            outer ==> map -> @parts { join '', @parts }
        }
    }

    # Generate all strings of length $length that $const matches.
    multi generate(Rule::Constant $const: Int $length) {
        
        # This is nice and easy
        if $length == $const.chars {
            $const
        }
        else {
            ()
        }
    }

    # Generate all strings of length $length that any of the
    # subpatterns match.
    multi generate(Rule::Alternation $alt: Int $length) {
        $alt.children ==> 
        map -> $child {
            $child.generate($length)
        }
    }

Etc.  There would need to be some context added so that captures could
be made to work, but it wouldn't be that hard.  It would stop being so
elegant, but it wouldn't be hard.  
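
For anyone who wants to poke at the idea in something that runs today,
here is a rough Python approximation of the three generators above.  It
is only a sketch: the class names Constant, Alternation and Group and
the helper compositions() are stand-ins I made up to mirror the Perl 6,
and itertools.product plays the part of outer.

```python
from itertools import product


def compositions(length, n):
    """Yield every tuple of n non-negative ints that sums to length."""
    if n == 0:
        if length == 0:
            yield ()
        return
    for first in range(length + 1):
        for rest in compositions(length - first, n - 1):
            yield (first,) + rest


class Constant:
    def __init__(self, text):
        self.text = text

    def generate(self, length):
        # A literal matches only itself, so only its own length works.
        return [self.text] if length == len(self.text) else []


class Alternation:
    def __init__(self, *children):
        self.children = children

    def generate(self, length):
        # Anything that any branch can produce at this length.
        return [s for child in self.children for s in child.generate(length)]


class Group:
    def __init__(self, *children):
        self.children = children

    def generate(self, length):
        results = []
        # Try every way of splitting the length among the children.
        for comp in compositions(length, len(self.children)):
            parts = [child.generate(n) for n, child in zip(comp, self.children)]
            # product() stands in for outer: if any child has no match
            # of its assigned length, no combinations are produced.
            results.extend("".join(combo) for combo in product(*parts))
        return results


# Roughly /foo [ bar | quux ]/
pattern = Group(Constant("foo"), Alternation(Constant("bar"), Constant("quux")))
print(pattern.generate(6))  # ['foobar']
print(pattern.generate(7))  # ['fooquux']
```

Asking for a length the pattern can't match, such as
pattern.generate(5), just gives back an empty list.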

And then to generate all possibilities of your regular expression:

>       (( <nonquote> | <slashchar> )* ")

You just pass it to generate.  It recursively calls itself until it's
down to constant strings, which are always a well-defined match (or
character classes, which are tricky and computationally intractable in
the presence of Unicode).
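
To make the recursion concrete for a pattern with a star in it, here is
a small runnable Python sketch.  It is not a translation of any real
Rule:: class.  The combinators const, alt, seq and star are hypothetical
names, and since <nonquote> and <slashchar> aren't defined anywhere in
this thread I've assumed a two-letter alphabet for the former and a
backslash followed by a quote for the latter.  Each pattern is just a
function from a length to the set of strings of that length it matches.

```python
def const(text):
    # A literal only matches at exactly its own length.
    return lambda n: {text} if n == len(text) else set()


def alt(*pats):
    # Union of whatever each branch matches at this length.
    return lambda n: set().union(*(p(n) for p in pats))


def seq(*pats):
    # Concatenation: give k characters to the first pattern and the
    # remaining n - k to the rest, for every possible k.
    def gen(n):
        if not pats:
            return {""} if n == 0 else set()
        head, tail = pats[0], seq(*pats[1:])
        return {a + b
                for k in range(n + 1)
                for a in head(k)
                for b in tail(n - k)}
    return gen


def star(pat):
    # Zero or more repetitions.  Each repetition must consume at least
    # one character, otherwise the recursion would never terminate.
    def gen(n):
        if n == 0:
            return {""}
        return {a + b
                for k in range(1, n + 1)
                for a in pat(k)
                for b in gen(n - k)}
    return gen


# Assumed stand-ins for the undefined subrules:
nonquote = alt(const("a"), const("b"))   # pretend alphabet of two letters
slashchar = const('\\"')                 # backslash followed by a quote

# (( <nonquote> | <slashchar> )* ")
pattern = seq(star(alt(nonquote, slashchar)), const('"'))

print(sorted(pattern(2)))  # ['a"', 'b"']
print(len(pattern(3)))     # 5
```

Using sets means a string that can be derived in more than one way only
shows up once.  Nothing is cached here, so it gets slow quickly as the
length grows.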

I love programming in theoretical Perl 6.

Luke
