At 12:13 PM -0400 on 8/24/00, Mark-Jason Dominus wrote:
>The big problem I see that you didn't address is that you didn't say
>what would happen when the target string contains mismatched
>parentheses.
>
>Your example was:
>
> $string = "([b - (a + 1)] * 7)";
> $string =~ /\g.*?\G/;
>
>Now here \g matches the "(" and sets up \G so that \G will only match
>the corresponding ")". Then .*? matches "[b - (a + 1)] * 7" and \G
>matches the ")".
>
>Now suppose the string were
>
> $string = "(b - a + 1] * 7)";
> $string =~ /\g.*?\G/;
>
>Now what happens here? \g matches "(" and sets up \G so that \G will
>only match the corresponding ")". Then what? I'm not sure from your
>proposal.
>
>Your later example (in the 'implementation' section) suggests that '['
>and ']' are ignored once \g matches a '('. If that is true, then in
>the example above, the .*? would match "b - a + 1] * 7". I think
>this won't be what people will want from \g...\G. We are still going
>to get a lot of questions from people asking how to tell if the
>delimiters in a string are balanced.

I think having .*? in the above example match "b - a + 1] * 7" is
reasonable, useful behavior. The simple algorithm outlined in the
Implementation Section of the RFC suffices for the normal case of
extracting information from strings with correctly balanced
delimiters. Detecting mismatched delimiters is a separate, probably
more difficult problem.
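
To make the less-strict rule concrete, here is a rough sketch in
ordinary Perl 5. It is only an illustration, not the RFC's
implementation: the helper name find_partner_lenient and the delimiter
table %CLOSER are made up for this example. Starting from an opening
delimiter, it counts only delimiters of that same species and ignores
all others.

```perl
use strict;
use warnings;

# Assumed delimiter set for the illustration; the RFC may define more.
my %CLOSER = ('(' => ')', '[' => ']', '{' => '}');

# Hypothetical helper: given a string and the offset of an opening
# delimiter, return the offset of its partner, or undef if none exists.
# Only delimiters of the same species as the opener are counted.
sub find_partner_lenient {
    my ($string, $start) = @_;
    my $open = substr($string, $start, 1);
    return undef unless exists $CLOSER{$open};
    my $close = $CLOSER{$open};
    my $depth = 0;
    for my $pos ($start .. length($string) - 1) {
        my $char = substr($string, $pos, 1);
        if    ($char eq $open)  { $depth++ }
        elsif ($char eq $close) { return $pos if --$depth == 0 }
    }
    return undef;    # ran off the end without finding the partner
}

my $string = "(b - a + 1] * 7)";
my $end    = find_partner_lenient($string, 0);

# The stray "]" is ignored, so this prints: b - a + 1] * 7
print substr($string, 1, $end - 1), "\n" if defined $end;
```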

An alternative would be to have the algorithm keep closer track of
_all_ nested delimiters which don't correspond to a \g (instead of
merely the ones of the same species as the currently active \g, as
proposed in the RFC). The algorithm would fail to match if there
were unbalanced delimiters. Better, one could have the option of
choosing between the more-strict and less-strict matching rules.
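
The more-strict rule could be sketched along the following lines in
ordinary Perl 5. Again this is only an illustration under my own
assumptions, not anything the RFC specifies: the helper name
find_partner_strict and the delimiter table are hypothetical. It keeps
a stack of every closing delimiter still awaited and gives up as soon
as a closer of the wrong species turns up.

```perl
use strict;
use warnings;

# Assumed delimiter set for the illustration; the RFC may define more.
my %CLOSER    = ('(' => ')', '[' => ']', '{' => '}');
my %IS_CLOSER = map { $_ => 1 } values %CLOSER;

# Hypothetical helper: given a string and the offset of an opening
# delimiter, return the offset of its partner. Returns undef if the
# delimiters in between are unbalanced or no partner exists.
sub find_partner_strict {
    my ($string, $start) = @_;
    my $open = substr($string, $start, 1);
    return undef unless exists $CLOSER{$open};
    my @expected;    # stack of closing delimiters still awaited
    for my $pos ($start .. length($string) - 1) {
        my $char = substr($string, $pos, 1);
        if (exists $CLOSER{$char}) {
            push @expected, $CLOSER{$char};
        }
        elsif ($IS_CLOSER{$char}) {
            my $want = pop @expected;
            # A closer of the wrong species means the string is unbalanced.
            return undef unless defined $want && $want eq $char;
            return $pos unless @expected;    # the opener's partner
        }
    }
    return undef;    # ran off the end with delimiters still open
}

for my $string ("([b - (a + 1)] * 7)", "(b - a + 1] * 7)") {
    my $end = find_partner_strict($string, 0);
    print defined $end ? "partner at offset $end" : "no match", "\n";
}
```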

Strict matching is less useful than it might at first seem, however,
because one would not know without additional work whether the
expression failed to match because of unbalanced delimiters or for
some other reason. (A regex has only one way of failing: it simply
does not match.)