[perl #130612] [BUG] LTM doesn't use text order for tie break as expected

Sam S. via RT Mon, 11 Sep 2017 14:03:06 -0700

Actually...

Rakudo *does* generally follow interpretation (b):

    ➜  'x' ~~ / .* { say '*' } | .? { say '?' } /;  # *
    ➜  'x' ~~ / .? { say '?' } | .* { say '*' } /;  # ?

The observed bug is specifically with character classes:
    
    ➜  '1' ~~ /<digit>  { say 'digit' } | <[0..9]> { say '0..9' }  /;  # 0..9
    ➜  '1' ~~ /<[0..9]> { say '0..9' }  | <digit>  { say 'digit' } /;  # 0..9

Following some more experimentation, here are various atoms for matching the 
digit '1', sorted into three categories based on how much LTM favors them in 
current Rakudo:

    tier 1:  '1'
    tier 2:  .   \d   \w   <[0..9]>
    tier 3:  <digit>   <alnum>   <:Number>   <:Decimal>   etc.

That the literal `1` is preferred over everything else by LTM is to be expected 
("longest literal prefix" tie-breaker).

However, it seems strange that the character classes are split into two tiers, 
with the syntactic ones being preferred over the named and Unicode-property ones.
At least I don't see anything in S05 to back that up.
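
For anyone who wants to re-check the tiers, here is a minimal sketch in the 
same style as the examples above. Each pair of atoms is tried in both textual 
orders: if LTM treats the two atoms as equals, the first alternative should win 
each time, and if the same atom wins in both orders it sits in a higher tier. 
The choice of pairs is only an illustration, and results may differ between 
Rakudo versions.

```raku
# Syntactic class vs. syntactic class, in both textual orders.
'1' ~~ / \d       { say '\d' }       | <[0..9]> { say '<[0..9]>' } /;
'1' ~~ / <[0..9]> { say '<[0..9]>' } | \d       { say '\d' }       /;

# Syntactic class vs. named class, in both textual orders.
'1' ~~ / \d      { say '\d' }      | <digit> { say '<digit>' } /;
'1' ~~ / <digit> { say '<digit>' } | \d      { say '\d' }      /;

# Named class vs. Unicode-property class, in both textual orders.
'1' ~~ / <digit>   { say '<digit>' }   | <:Number> { say '<:Number>' } /;
'1' ~~ / <:Number> { say '<:Number>' } | <digit>   { say '<digit>' }   /;
```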

http://design.perl6.org/S05.html#Overview says that LTM is transitive through 
subrules, so even if the named character classes are treated as subrule calls 
they shouldn't be disfavored, right?
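
One way to probe the subrule question is to wrap a syntactic class in a lexical 
regex and race it against the bare class. This is only a sketch: `mydigit` is a 
made-up name, and I'm assuming that a lexical regex called as `<mydigit>` goes 
through the same subrule path as the built-in named classes, which may not be 
how Rakudo actually implements them.

```raku
# Hypothetical wrapper: a subrule whose whole body is a syntactic class.
my regex mydigit { \d }

# Race the subrule against the bare class, in both textual orders.
'1' ~~ / <mydigit> { say 'subrule' } | \d        { say '\d' }      /;
'1' ~~ / \d        { say '\d' }      | <mydigit> { say 'subrule' } /;
```

If `\d` wins in both orders, that would suggest subrule calls as such are being 
disfavored; if text order decides, the problem is more likely specific to how 
the named classes are handled.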
