Please respond to the list and:

  "Because it's up-side down".
  "Why is that?"
  "It makes replies harder to read."
  "Why not?"
  "Please don't top-post." - Sherm Pendley, Mac OS X Perl list

On 11/8/05, Jeff Pang <[EMAIL PROTECTED]> wrote:
>  ~/perl> perl -e'$abc="[abc]"; print "matched!\n" if "ab" =~ /($abc)\1/'
>
> but:
>
>   ~/perl> perl -e'$abc="[abc]"; print "matched!\n" if "ab" =~ /$abc$abc/'
>   matched!
>
> Why this happen?what is the difference between "/($abc)\1/" and "
> /$abc$abc/"?Thanks.
>
> 2005/11/9, Jay Savage <[EMAIL PROTECTED]>:
> > On 11/8/05, John W. Krahn <[EMAIL PROTECTED]> wrote:
> > > Tom Allison wrote:
> > > >     if ($text =~ /(.*?($crlf))\2(.*)/sm) {
> > > >
> > > > Do I read this right?
> > > >
> > > > the '\2' is a repeat character of the second match
> > > > where match \1 is (.*?$crlf) and
> > > > match \2 is $crlf ?
> > >
> > > Yes, but you don't really need the capturing parentheses there:
> > >
> > >     if ( $text =~ /(.*?$crlf)$crlf(.*)/s ) {
> > >
> > >
> > >
> >
> > That depends; we don't know the contents of $crlf (although we can
> > probably guess). But if $crlf has classes and/or logic, interpolating
> > the variable again will match any of the possiblilities, where the
> > backreference will only match the literal string previously matched.
> > Consider the following:
> >
> >   ~/perl> perl -e'$abc="[abc]"; print "matched!\n" if "aa" =~ /($abc)\1/'
> >   matched!
> >   ~/perl> perl -e'$abc="[abc]"; print "matched!\n" if "ab" =~ /($abc)\1/'
> >
> > but:
> >
> >   ~/perl> perl -e'$abc="[abc]"; print "matched!\n" if "ab" =~ /$abc$abc/'
> >   matched!

When part of a regex is stored in a variable, the contents of the
variable are interpolated before the regex is evaluated, so when the
match is performed,

    "ab" =~ /$abc/

becomes

   "ab" =~ /[abc]/

By the same token

   "ab" =~ /$abc$abc/

becomes

   "ab" =~ /[abc][abc]/

Each class can match a, b, or c, so the regex as a whole will match
any of aa, bb, cc, ab, ac, ba, bc, ca, or cb.
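
To check that claim, here is a minimal sketch. The anchors \A and \z and
the variable names are my own additions, used only so that each
two-letter string is tested as a whole; they are not part of the
original one-liners.

```perl
use strict;
use warnings;

my $abc = '[abc]';

# Interpolating $abc twice should behave like writing the class twice.
my $interpolated = qr/\A$abc$abc\z/;
my $literal      = qr/\A[abc][abc]\z/;

my @matched;
for my $first (qw(a b c)) {
    for my $second (qw(a b c)) {
        my $s = "$first$second";
        # Keep the string only if both forms of the pattern accept it.
        push @matched, $s if $s =~ $interpolated && $s =~ $literal;
    }
}

print "@matched\n";   # all nine two-letter combinations of a, b, c
```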

With

   /($abc)\1/

however, \1 isn't evaluated until *after* the capturing parentheses do
their work; it then matches whatever was captured, essentially the
value of $1 at the moment the engine reaches \1 in the expression.

First the variable is interpolated, so the expression becomes

   /([abc])\1/

Then the engine begins evaluating the expression. As soon as the
parentheses capture something, the engine in effect treats \1 as
the literal string captured.

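In our example, then, ([abc]) matches "a" and stores the value "a" in
$1. From that point on, \1 behaves like a literal "a". The next
character in "ab" is "b", so the match fails there; retrying from "b"
fails as well, since nothing follows it.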

   /([abc])[abc]/

says "find me an a, b, or c, save it to $1, and find me another a, b, or c"

   /([abc])\1/

says "find me an a, b, or c, save it to $1, and then find me another
of whatever it is that was just found"
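
A small sketch that puts the two forms side by side. The test strings,
the \A and \z anchors, and the %result hash are my own choices for
illustration.

```perl
use strict;
use warnings;

my $abc = '[abc]';
my %result;

for my $s (qw(aa ab bb ca)) {
    # Class written twice: any two characters from the class will do.
    my $twice   = $s =~ /\A$abc$abc\z/ ? 'yes' : 'no';
    # Backreference: the second character must equal the one captured.
    my $backref = $s =~ /\A($abc)\1\z/ ? 'yes' : 'no';

    $result{$s} = [ $twice, $backref ];
    printf "%-2s  class twice: %-3s  backreference: %s\n",
        $s, $twice, $backref;
}
```

Only the doubled strings ("aa", "bb") should get a "yes" in the
backreference column, while all four pass the class-twice test.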

Of course, this only matters if the captured value is the result of
some logic or class operation.

   /(ab)ab/

and

   /(ab)\1/

and

   /(ab){2}/

are functionally equivalent, although the first one is likely a little
more efficient since it doesn't have to resolve a backreference.
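
If you want to convince yourself that the three agree, something like
the following should do. The sample strings and the hash names are
hypothetical, picked only for the demonstration.

```perl
use strict;
use warnings;

my %patterns = (
    literal       => qr/(ab)ab/,
    backreference => qr/(ab)\1/,
    quantifier    => qr/(ab){2}/,
);

my %agree;
for my $s (qw(abab ab abba xababx)) {
    # 1 if the pattern matches, 0 otherwise, in sorted key order:
    # backreference, literal, quantifier.
    my @hits = map { $s =~ $patterns{$_} ? 1 : 0 } sort keys %patterns;
    $agree{$s} = "@hits";
    print "$s: @hits\n";
}
```

Each line should show the same value three times, since a fixed string
like "ab" leaves the backreference nothing to vary on.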

HTH,

-- jay
--------------------------------------------------
This email and attachment(s): [  ] blogable; [ x ] ask first; [  ]
private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com  http://www.dpguru.com  http://www.engatiki.org

values of β will give rise to dom!
