On Mon, Jul 21, 2003 at 04:40:19PM -0400, Jeff 'japhy' Pinyan wrote:
> On Jul 21, Jeff 'japhy' Pinyan said:
>
> >This is the issue.  Why are the $DIGIT variables bound to the block
> >they're in IN TOTALITY, rather than for the life of the execution of the
> >block?
>
> It's actually slightly more complex than that.
>
> Here's a piece of code like what Steve wrote:
>
>   sub match {
>     ($1 + 1) =~ /(\d+)/;
>     print $1;
>     match() if $1 < 2;
>     print $1;
>   }
>
>   "0" =~ /(\d+)/;
>   match();
>
> Now, this code prints "1222".  Odd.  Baffling, even.  Why doesn't it print
> "1221"?  Because the $DIGIT variables are not just magically scoped,
> they're magic themselves.  They are connected to the last successful
> pattern match, yes, but more importantly, they are DIRECTLY connected to
> the last PMOP (an internal structure representing the pattern match).

You're explaining this much more clearly than I had done, but let me jump
in again:

The magic regex variables *themselves* live forever and don't obey any
scoping rules.  They don't have to worry about scope, since as you said,
they don't contain any data.  Their values are fetched dynamically by
looking at the "last match" (which is what I've been calling PL_curpm,
the dynamically scoped PMOP pointer).

PL_curpm behaves consistently, although the way it's dynamically scoped is
slightly unusual, as you said.

But the PMOP doesn't contain any data *either*.  It has a pointer to a
REGEXP structure, which contains, among other things, the compiled pattern
and what I'll call the "match data".  The match data might include a copy
of the target string and offsets for each pair of capturing parens, and it
can be used to calculate the value of $1 or @- (or a host of other
variables) dynamically.

The problem with this set-up is that PL_curpm is dynamically scoped, but
the REGEXP, which contains the data we're interested in, isn't.  Tying the
match data to the compiled pattern (and thence to the PMOP, for pity's
sake) is, arguably, bad design...

You can also see it misbehaving here:

    my $rx = qr/(...)/;       # REGEXP 1
    "foo" =~ /$rx/;           # PMOP 1 / REGEXP 1
    { "bar" =~ /$rx/; }       # PMOP 2 / REGEXP 1
    print $1;                 # "bar"

In this one there are two distinct PMOPs (the m// operations) but only one
REGEXP, which is what we've stored in $rx.  When we print $1, the chain of
references looks something like

    $1 -> PL_curpm -> <PMOP #1> -> $rx -> <match data>

    [ Really the PMOP points directly to the REGEXP inside $rx. ]

Leaving the block restores PL_curpm to PMOP #1, but the match data inside
$rx still comes from the time it matched "bar", since there's no mechanism
for saving and restoring that kind of thing.

Anyway, apologies for the blood and perlguts.

--
Steve
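
For anyone who wants to poke at the pointer chain without reading the perl source, here is a toy model of it in plain Perl.  It is only a sketch: the hashes and the names $curpm, model_match and model_dollar1 are made up for illustration and are not real perl internals or API.  The one assumption carried over from the description above is that the "last match" pointer is saved and restored on block exit while the match data hanging off the shared REGEXP is not.

```perl
use strict;
use warnings;

# Toy model only.  These hashes are hypothetical stand-ins for perl's
# internal structures; none of the names below are real perl API.
our $curpm;    # plays the role of PL_curpm; 'our' so that local() works

# One shared "REGEXP": the compiled pattern plus its match data.
my $regexp = { pattern => qr/(...)/, capture => undef };

# Two "PMOPs" (two m// operations) that both point at the same REGEXP.
my $pmop1 = { regexp => $regexp };
my $pmop2 = { regexp => $regexp };

# A successful match stores its data in the REGEXP and makes this PMOP
# the "last match".
sub model_match {
    my ($pmop, $target) = @_;
    my ($cap) = $target =~ $pmop->{regexp}{pattern};
    return 0 unless defined $cap;
    $pmop->{regexp}{capture} = $cap;
    $curpm = $pmop;
    return 1;
}

# Reading "$1" follows the chain: $curpm -> PMOP -> REGEXP -> match data.
sub model_dollar1 {
    return $curpm->{regexp}{capture};
}

model_match($pmop1, "foo");
{
    local $curpm = $curpm;          # entering the block saves the pointer
    model_match($pmop2, "bar");     # overwrites the shared match data
}                                   # leaving the block restores $curpm only
print model_dollar1(), "\n";        # prints "bar"
```

After the block, $curpm is back to the first PMOP, yet the value read through it is "bar", because both PMOPs share one REGEXP and nothing restored its match data.  Giving each PMOP its own copy of the match data in this model would make the last line print "foo" instead.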