On Fri, May 13, 2005 at 03:36:50PM +0000, Luke Palmer wrote:
> I'm basically saying that you should treat your:
>     $str ~~ /abc :: def | ghi :: jkl | mn :: op/;
> As:
>     $rule = rx/abc :: def | ghi :: jkl | mn :: op/;
>     $str ~~ /^ .*? <$rule>/;
> Which means that you fail the rule, your .*? advances to the next
> character and tries the rule again.

Taking this explanation literally, this would mean that

    $rule = rx/abc :: def | ghi :: jkl | mn :: op/;
    $rule = rx/abc ::: def | ghi ::: jkl | mn ::: op/;

both succeed against "xyzabc---ghijkl".  But even just considering
the :: instance, this interpretation doesn't match what you said
in your original message, namely that :: would fail the rule without
further advancing:

Pm> $rule = rx :w / plane :: (\d+) | train :: (\w+) | auto :: (\S+) / ;
Pm> "travel by plane jet train tgv today" ~~ $rule

LP> When you fail over the :: after plane, it skips out of the alternation
LP> looking for something to backtrack before it.  Since there is nothing,
LP> the rule fails.

> Maybe I'm misunderstanding your interpretation (when in doubt, explain
> with code).

One of us is misunderstanding the other.  I'll explain with code, 
but first let's clarify the difference.  I read your first message as 
claiming that

    $r1 = rx / abc :: def | ghi :: jkl | mn :: op /;
    $r2 = rx / abc ::: def | ghi ::: jkl | mn ::: op /;
    $r3 = rx / [ abc :: def | ghi :: jkl | mn :: op ] /;

are equivalent.  I'm quite certain that $r2 and $r3 are not
equivalent.  Let's avoid subrules in the comparison, since they don't
provide the auto-advance of unanchored patterns that forms the crux
of my question.

For illustration, let's first look at a slightly different variation:

    $q2 = rx / \w [ abc ::: def | ghi ::: jkl | mn ::: op ] /;
    $q3 = rx / \w [ [ abc :: def | ghi :: jkl | mn :: op ] ]/;

    "xyzabc---xyzghijklmno" ~~ $q2     # fails after seeing "zabc"
    "xyzabc---xyzghijklmno" ~~ $q3     # matches "zghijkl"

The difference is precisely the difference between ::: and :: --
the former fails the rule entirely, while the latter simply fails
the current group (of alternations) and tries again.  
With :::, an unanchored rule should also stop its process of 
"advancing to the next character and trying again".  
(Otherwise,  "abefgh" ~~ rx / [ ab ::: cd | ef ::: gh ] / succeeds.)

So, by analogy

    $r2 = rx / abc ::: def | ghi ::: jkl | mn ::: op /;
    $r3 = rx / [ abc :: def | ghi :: jkl | mn :: op ] /;

    "xyzabc---xyzghijklmno" ~~ $r2     # fails after seeing "abc"
    "xyzabc---xyzghijklmno" ~~ $r3     # matches "ghijkl"

The :: in $r3 doesn't cause the entire rule to fail, just the
group, so the match is free to backtrack and continue its
"advance to the next character and try again".  (What the "::"
in $r3 *does* do is to tell the matching engine to not bother 
trying the remaining alternatives once it has seen an "abc" at
this point.)
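
To make the two readings concrete, here is a rough model in Python
(not Perl 6, and not how any real rule engine is implemented).  It
assumes each alternative is just a literal prefix, a cut, and a
literal suffix.  The name match_alternation and the "group"/"rule"
labels are made up for this sketch: "group" is the $r3 reading (the
cut fails only the bracketed group, so the scan keeps advancing) and
"rule" is the $r2 reading (the cut fails the whole match).

```python
def match_alternation(text, alternatives, cut):
    """Unanchored scan over (prefix, suffix) alternatives with a cut between them.

    cut == "group": a failed suffix abandons the remaining alternatives at
                    this start position, but the scan advances (like $r3).
    cut == "rule":  a failed suffix fails the whole match at once (like $r2).
    """
    for start in range(len(text) + 1):
        for prefix, suffix in alternatives:
            if not text.startswith(prefix, start):
                continue  # cut not reached yet, so try the next alternative
            # The prefix matched, so we have now crossed the cut.
            end = start + len(prefix)
            if text.startswith(suffix, end):
                return text[start:end + len(suffix)]
            if cut == "rule":
                return None  # no more advancing to the next character
            break  # skip the other alternatives at this position only
    return None


ALTS = [("abc", "def"), ("ghi", "jkl"), ("mn", "op")]

print(match_alternation("xyzabc---xyzghijklmno", ALTS, "rule"))   # None
print(match_alternation("xyzabc---xyzghijklmno", ALTS, "group"))  # ghijkl
```

With the alternatives ab/cd and ef/gh, the same function returns
"efgh" for "abefgh" under "group" and None under "rule", which is the
parenthetical case mentioned above.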

So, going back to the original

    $r1 = rx / abc :: def | ghi :: jkl | mn :: op /;

does it work like $r2 or $r3?  My gut feeling is that it should 
work like $r2 -- i.e., that once we find an "abc" we'll fail the rule
if there's not a "def" following.  This also accords with what 
others have written in reply, when they say that all three of my
expressions fail in the same way (even though they do not).

However, *if* we say that :: at the top level fails the rule, that
means that as things currently stand

    $z1 = rx :w /foo/;
    $z2 = rx /:w::foo/;
    $z3 = rx /[:w::foo]/;

can be a little surprising:

    "hello foo" ~~ $z1         # matches "foo"
    "hello foo" ~~ $z2         # fails immediately upon the 'h' != 'f'
    "hello foo" ~~ $z3         # matches "foo"

which was the point of my original post.  And as I said there, I don't
have a problem with this; I just wanted to make sure this result didn't
surprise too many others.
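
The $z2 surprise can be modeled the same way.  This is only a Python
sketch under the assumption that a top-level :: fails the rule; it
ignores what :w actually does to whitespace, and the function name
scan_with_leading_cut is invented for illustration.  Because the cut
sits before the literal, it is crossed on the very first attempt.

```python
def scan_with_leading_cut(text, literal, cut_fails_rule):
    """Unanchored scan for a literal that is preceded by a cut.

    cut_fails_rule=True models a top-level :: that fails the rule ($z2);
    cut_fails_rule=False models a cut confined to a group ($z3).
    """
    for start in range(len(text) + 1):
        # The cut comes first, so it has already been crossed here.
        if text.startswith(literal, start):
            return text[start:start + len(literal)]
        if cut_fails_rule:
            return None  # backtracking over the cut ends the whole match
    return None


print(scan_with_leading_cut("hello foo", "foo", cut_fails_rule=True))   # None
print(scan_with_leading_cut("hello foo", "foo", cut_fails_rule=False))  # foo
```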

I hope this was clear enough -- if not, explain with counterexamples
in code.  :-)

Pm
