# New Ticket Created by  Bruce Gray 
# Please include the string:  [perl #121024]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org/Ticket/Display.html?id=121024 >


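Interpolating a string as a regex with /<$var>/ produces incorrect matches 
when $var contains an alternation (|): the reported match has the correct 
length but is taken from the start of the target string.
Wrapping the pattern string in [ ] works around the problem.
TimToady weighs in during the last section of the log.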

2014-01-17 15:22-15:45
< Util> r: my $s = "abcdefghtttaaccta"; my @pats = /ttta<[agt]>cct/, 
/z|ttta<[agt]>cct/; for @pats -> $pat { say $s.comb(  /$pat/  ); }
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: 
OUTPUT«tttaacct␤tttaacct␤»
< Util> r: my $s = "abcdefghtttaaccta"; my @pats = "ttta<[agt]>cct", 
"z|ttta<[agt]>cct"; for @pats -> $pat { say $s.comb( /<$pat>/ ); }
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: 
OUTPUT«tttaacct␤abcdefgh tttaacct␤»

< Util> Adding 'z|' to the front of the pattern causes crazy matches, but only 
when the string pattern is interpolated into a re.
< Util> Is this a bug in <$var> interpolation? If so, is it a known bug? If 
not, how am I mis-reading the results?

< ingy> Util: can you serialize the regex after string interp?
< Util> ingy: .perl method on a compiled regex does not output anything useful. 
I welcome new knowledge on how else to serialize a regex.
< ingy> no clue. just guessing
< Util> r: my $re = /abc/; say $re.perl;
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: 
OUTPUT«regex(Mu : Mu *%_) { ... }␤»
< Util> I would love to be able to see what is actually in the $re.

< FROGGS> p: my $r = "z|ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefgh」␤␤»
< FROGGS> p: my $r = "ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「tttaacct」␤␤»
< FROGGS> p: my $r = "ttta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「tttaacct」␤␤»
< FROGGS> p: my $r = "y|ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefgh」␤␤»

< FROGGS> it is like it still matches "tttaacct", but then forgets the position 
where the match started
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「ttaacct」␤␤»
< FROGGS> p: my $r = "e|tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefg」␤␤»
< FROGGS> see, it always has the same length
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<e|$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«===SORRY!=== Error while compiling 
/tmp/4mFkdy5iGB␤Unable to parse expression in metachar:sym<assert>; couldn't 
find final '>' ␤at /tmp/4mFkdy5iGB:1␤------> t]>cct"; say "abcdefghttttaaccta" 
~~ /<e⏏|$r>/␤   …»
< FROGGS> in theory it should explode like that
< Util> FROGGS: I observed that as well, but do not know what to make of it.

< FROGGS> p: my $r = "e||tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefg」␤␤»
< Util> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /e|<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「e」␤␤»
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /z|<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「ttaacct」␤␤»
< FROGGS> that works well
< Util> FROGGS: FYI, I think that /<e|$var>/ is incorrect syntax, no matter 
what is in $var
< FROGGS> Util: correct
< FROGGS> and that is why /<$var>/ should explode too it there is just a | in it
< FROGGS> if*
< Util> FROGGS: Are you asserting that alternation (|) is never valid in a 
interpolated regex?
< FROGGS> Util: it would think that you have to add [ ]
< FROGGS> a similar question would be if a + in such an assertion should be 
valid, and if it should be a quantifier for the thing on the left of the 
assertion 
< Util> I p: my $r = "z|abc"; say "abcde" ~~ /<$r>/
< Util> p: my $r = "z|abc"; say "abcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abc」␤␤»
< FROGGS> p: my $r = "z|abc"; say "peterabcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「pet」␤␤»
< Util> I would expect your "peterabce" example to have worked correctly.
< Util> S05 says that /$var/ no longer interpolates like it did in Perl 5. 
/<$var>/ is how you get the old behavior.
< FROGGS> p: my $r = "[z|abc]"; say "peterabcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abc」␤␤»
< FROGGS> see
< Util> I don't see anything *restricting* the old behavior when done as 
/<$var>/
< Util> (in S05)

< FROGGS> it fails because it needs a group
< Util> FROGGS: Thanks! That may be the breakthrough thought that I needed.
< Util> Just like S05 says that /moose*/ matches multiple 'e', but /'moose'*/ 
matches multiple 'moose'.

< Util> That might give me a workaround, but the current behavior is still a 
bug, IMO.
< FROGGS> report it
< Util> Will do. Thanks again!
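
The workaround discussed above can be sketched as follows. This is a minimal illustration, not part of the IRC log: the variable names $s, $r, and $g are chosen here for the example, and the string is wrapped in [ ] by hand before it is interpolated, mirroring FROGGS's "[z|abc]" run, which camelia reported as matching 「abc」.

```raku
# Workaround sketch: group the alternation explicitly before interpolating.
my $s = "peterabcde";          # target string from FROGGS's example
my $r = "z|abc";               # pattern string containing a bare alternation
my $g = '[' ~ $r ~ ']';        # same pattern wrapped in a non-capturing group

# Interpolate the grouped string as a regex via the <$var> assertion.
my $m = $s ~~ /<$g>/;
say $m;
```

Per TimToady's comments later in the log, the explicit [ ] should not be required, so this is only a stopgap for affected builds.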

2014-01-17 18:49-18:52
< TimToady> on / <$foo> /, the assertion is supposed to match as a subgroup, so 
it should not be necessary to supply [], and bare z|abc should work
< TimToady> we do not provide any way to interpolate a string as a regex 
without it being a submatch, unless you use EVAL
< japhb> TimToady: I read that discussion as resolving to: <$foo>'s 
implementation should just implicitly wrap the contents in [].
< TimToady> that would be...unhygienic
< TimToady> that's what a submatch means
< japhb> Fair enough.
< TimToady> well, more like (), in the sense that it hides inner ()
< TimToady> but not in the sense of supplying an outer ()
< TimToady> unless you bind it explicitly
< TimToady> just as if you'd called <.foo>
< japhb> gotcha.
< TimToady> nothing is captured, and the () inside foo are hidden

-- 
Thank you,
Bruce Gray (Util of PerlMonks)
