# New Ticket Created by Bruce Gray
# Please include the string: [perl #121024]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/Ticket/Display.html?id=121024 >

Interpolation of /<$var>/ causes incorrect matches when $var contains alternation.
TimToady weighs in during the last section of the log.

2014-01-17 15:22-15:45
< Util> r: my $s = "abcdefghtttaaccta"; my @pats = /ttta<[agt]>cct/, /z|ttta<[agt]>cct/; for @pats -> $pat { say $s.comb( /$pat/ ); }
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: OUTPUT«tttaaccttttaacct»
< Util> r: my $s = "abcdefghtttaaccta"; my @pats = "ttta<[agt]>cct", "z|ttta<[agt]>cct"; for @pats -> $pat { say $s.comb( /<$pat>/ ); }
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: OUTPUT«tttaacctabcdefgh tttaacct»
< Util> Adding 'z|' to the front of the pattern causes crazy matches, but only when the string pattern is interpolated into a re.
< Util> Is this a bug in <$var> interpolation? If so, is it a known bug? If not, how am I mis-reading the results?
< ingy> Util: can you serialize the regex after string interp?
< Util> ingy: .perl method on a compiled regex does not output anything useful. I welcome new knowledge on how else to serialize a regex.
< ingy> no clue. just guessing
< Util> r: my $re = /abc/; say $re.perl;
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: OUTPUT«regex(Mu : Mu *%_) { ... }»
< Util> I would love to be able to see what is actually in the $re.
< FROGGS> p: my $r = "z|ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefgh」»
< FROGGS> p: my $r = "ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「tttaacct」»
< FROGGS> p: my $r = "ttta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「tttaacct」»
< FROGGS> p: my $r = "y|ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefgh」»
< FROGGS> it is like it still matches "tttaacct", but then forgets the position where the match started
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「ttaacct」»
< FROGGS> p: my $r = "e|tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefg」»
< FROGGS> see, it always has the same length
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<e|$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«===SORRY!=== Error while compiling /tmp/4mFkdy5iGBUnable to parse expression in metachar:sym<assert>; couldn't find final '>' at /tmp/4mFkdy5iGB:1------> t]>cct"; say "abcdefghttttaaccta" ~~ /<e⏏|$r>/ …»
< FROGGS> in theory it should explode like that
< Util> FROGGS: I observed that as well, but do not know what to make of it.
< FROGGS> p: my $r = "e||tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefg」»
< Util> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /e|<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「e」»
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /z|<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「ttaacct」»
< FROGGS> that works well
< Util> FROGGS: FYI, I think that /<e|$var>/ is incorrect syntax, no matter what is in $var
< FROGGS> Util: correct
< FROGGS> and that is why /<$var>/ should explode too it there is just a | in it
< FROGGS> if*
< Util> FROGGS: Are you asserting that alternation (|) is never valid in a interpolated regex?
< FROGGS> Util: it would think that you have to add [ ]
< FROGGS> a similar question would be if a + in such an assertion should be valid, and if it should be a quantifier for the thing on the left of the assertion
< Util> I p: my $r = "z|abc"; say "abcde" ~~ /<$r>/
< Util> p: my $r = "z|abc"; say "abcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abc」»
< FROGGS> p: my $r = "z|abc"; say "peterabcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「pet」»
< Util> I would expect your "peterabce" example to have worked correctly.
< Util> S05 says that /$var/ no longer interpolates like it did in Perl 5. /<$var>/ is how you get the old behavior.
< FROGGS> p: my $r = "[z|abc]"; say "peterabcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abc」»
< FROGGS> see
< Util> I don't see anything *restricting* the old behavior when done as /<$var>/
< Util> (in S05)
< FROGGS> it fails because it needs a group
< Util> FROGGS: Thanks! That may be the breakthrough thought that I needed.
< Util> Just like S05 says that /moose*/ matches multiple 'e', but /'moose'*/ matches multiple 'moose'.
< Util> That might give me a workaround, but the current behavior is still a bug, IMO.
< FROGGS> report it
< Util> Will do. Thanks again!
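
The grouping workaround FROGGS arrived at above can be sketched roughly as follows. This is only a minimal illustration, not a fix: the variable name $grouped is hypothetical and does not appear in the log, while the input string and pattern are the ones from the "peterabcde" exchange. It assumes a Rakudo that accepts string interpolation through <$var>, as the builds in the log do.

```raku
# Sketch of the grouping workaround, using the strings from the log.
my $s = "peterabcde";
my $r = "z|abc";                 # bare alternation: mis-matches via /<$r>/ in the log

# Hypothetical helper variable: the same pattern wrapped in a
# non-capturing group, i.e. "[z|abc]" as FROGGS typed it by hand.
my $grouped = '[' ~ $r ~ ']';

my $m = $s ~~ /<$grouped>/;
say $m;                          # the log shows 「abc」 for "[z|abc]" on rakudo-parrot 82f2fd
```

Building the group by concatenation rather than by hand keeps the original pattern string untouched, so the wrapper can be dropped once bare alternation inside <$var> behaves as expected.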

2014-01-17 18:49-18:52
< TimToady> on / <$foo> /, the assertion is supposed to match as a subgroup, so it should not be necessary to supply [], and bare z|abc should work
< TimToady> we do not provide any way to interpolate a string as a regex without it being a submatch, unless you use EVAL
< japhb> TimToady: I read that discussion as resolving to: <$foo>'s implementation should just implicitly wrap the contents in [].
< TimToady> that would be...unhygienic
< TimToady> that's what a submatch means
< japhb> Fair enough.
< TimToady> well, more like (), in the sense that it hides inner ()
< TimToady> but not in the sense of supplying an outer ()
< TimToady> unless you bind it explicitly
< TimToady> just as if you'd called <.foo>
< japhb> gotcha.
< TimToady> nothing is captured, and the () inside foo are hidden

--
Thank you,
Bruce Gray (Util of PerlMonks)