The new range quantifier syntax has been bothering me. For reference, here's the bit of S5 that talks about it:
> The repetition specifier is now **{...} for maximal matching, with a
> corresponding or **{...}? for minimal matching.  Space is allowed on
> either side of the asterisks.  The curlies are taken to be a closure
> returning a number or a range.
>
>     / value was (\d ** {1..6}?) with ([\w]**{$m..$n}) /
>
> It is illegal to return a list, so this easy mistake fails:
>
>     / [foo]**{1,3}

Now for the bothersome parts and some questions and some suggestions in
no particular order:

- for minimal matching the ? is too far away from the operator that it
  applies to.  It looks like it's doing something to the closure (and
  maybe it is).  Should that be [foo]**?{$m..$n} instead?

- Must the closure take the exact form of stuff in curlies?  What would
  these do?

      $c = sub { 0..5 };
      /[foo]**$c/;          # error?
      /[foo]**&somesub/;    # error?

- Is the rationale behind making [foo]**{1,3} illegal strictly to catch
  the semantic error of those migrating from perl 5?  Because it
  certainly seems like it could be a useful thing otherwise.

- because the closure is executed first, you have to read ahead to the
  end of the closure and then look back to see what you were quantifying
  when trying to grok the code.  This isn't such a big deal if you just
  have a range, but it's a closure so all sorts of things can be in
  there!

- Bringing a closure into the picture seems to put too much power in
  such a simple construct.

      [foo]**{ destroy_the_world; 0... }

- I've always viewed the minimal matching ? as a kind of modifier on
  the quantifiers.  If that illusion is to remain true in Perl6, I'd
  want an optional colon:

      [foo]*:?

  Whitespace would disambiguate the "modifier colon" from the "no
  backtrack" or "cut" operator (it would parse as [ foo ] * :? and I
  also seem to recall that we already have a whitespace disambiguation
  rule for ::).

  And if we apply this idea to the range quantifier, that would give us
  something like these:

      [foo]*:5                   # match exactly 5 times
      [foo]*:{0...}              # verbose [foo]*
      [foo]*:{1...}              # verbose [foo]+
      [foo]*:{1..5}              # match from 1 to 5 times
      [foo]*:{[1,3,5]}           # match exactly 1, 3, or 5 times
      [foo]*:[EMAIL PROTECTED]   # treat each element of @foo as a
                                 # number and only match that
                                 # many times. (same as previous
                                 # basically)
      [foo]*:{&foo}              # match based on the return value of &foo
      [foo]*:{%foo}              # ???

  Those last few suddenly make me want junctioned ranges, though I
  don't know what I'd use them for :)

- An alternate syntax was proposed on IRC yesterday.  I'm not sure if I
  remember the specifics right, but the gist of it is to use a ~
  character to offset the ranges, so ...

      [foo]~5                    # match exactly 5 times
      [foo]~{0...}               # verbose [foo]*
      [foo]~{1...}               # verbose [foo]+
      [foo]~{1..5}               # match from 1 to 5 times
      [foo]~{[1,3,5]}            # match exactly 1, 3, or 5 times
      [EMAIL PROTECTED]          # treat each element of @foo as a
                                 # number and only match that
                                 # many times. (same as previous
                                 # basically)
      [foo]~{&foo}               # match based on the return value of &foo
      [foo]~{%foo}               # ???

  And surely these can be made to work:

      [foo]~[0...]               # [foo]:[0...]
      [foo]~[1,3,5]              # [foo]:[1,3,5]
      [EMAIL PROTECTED]          # [foo]:@foo

Yes, I realize that the "bag" variants (e.g., /[foo]*:[EMAIL PROTECTED]/)
could be nightmarish for optimization (e.g. you can't assume
monotonically increasing values).  And would "minimal match" mean stop
when you've reached the first number in the list or do you have to
evaluate the whole thing and literally find the minimum value?  (Similar
reasoning and questions apply for the regular greedy version.)  These
may be really good arguments for not including that particular variant,
but I don't know that :-)

----

On the whole, I liked the simplicity of the old <$m..$n> (or even
<$m,$n>) and would like something just like it only without the
ambiguity of <$m>.
I'd even suggest <+$m> as a disambiguating mechanism if we weren't
using + and - for "character" classes.

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]