Simply great .. thanks

On 6/1/07, Paul Lalli <[EMAIL PROTECTED]> wrote:
On Jun 1, 4:54 am, [EMAIL PROTECTED] (Sharan Basappa) wrote:
> I have a script as follows  :
>
> $str = "once upon a time
>         once upon a time";
> @store = $str =~ m/(once)/g;
> print @store ;
>
> This outputs "onceonce"
> How come regex is searching beyond newline. I thought the search will
> stop after first once.

What led you to believe that?  There is nothing in that regex that
says "stop after the first newline".

> When I replace /g with /m, the output I get is "once", but I thought /m will
> tell regex at multiple lines for match.

That is the mnemonic device, yes, but what it actually does is allow
the ^ anchor to match after a newline and the $ anchor to match
before a newline, rather than just at the beginning and end of the
string.  So effectively, ^ and $ match the beginning/ending of lines,
rather than strings.

Your regexp does not involve ^ or $, so /m is completely irrelevant.
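
A minimal sketch of where /m does make a difference.  The string here
is a simplified variant of the one in the question (no leading spaces
on the second line), and the variable names are only illustrative:

```perl
use strict;
use warnings;

# Two lines, each starting with "once" (assumed variant of the original string).
my $str = "once upon a time\nonce upon a time";

# Without /m, ^ anchors only at the very start of the string.
my @plain = $str =~ /^(once)/g;

# With /m, ^ also matches immediately after each embedded newline.
my @multi = $str =~ /^(once)/mg;

print scalar(@plain), "\n";   # prints 1
print scalar(@multi), "\n";   # prints 2
```

Both patterns carry /g, so the only thing that changes the count is
what ^ is allowed to match.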

If you remove the /g modifier, your pattern matches only once.
Regardless of any other modifiers, if you want to search for more than
one occurrence of the pattern, you need the /g modifier.
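
As a quick illustration, using the string from the question, the
following sketch compares the match in list context with and without
/g (the array names are made up for the example):

```perl
use strict;
use warnings;

my $str = "once upon a time
        once upon a time";

# List context, no /g: the match stops after the first success.
my @first  = $str =~ m/(once)/;

# /m alone changes nothing here, because the pattern has no ^ or $.
my @m_only = $str =~ m/(once)/m;

# /g asks for every match in the string, newlines or not.
my @all    = $str =~ m/(once)/g;

print scalar(@first),  "\n";   # prints 1
print scalar(@m_only), "\n";   # prints 1
print scalar(@all),    "\n";   # prints 2
```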

> Also when I replace /g with /s, I still get output "once"

Again, without the /g modifier, the pattern matches only once.  /s is
also irrelevant.  While the mnemonic for this one is "single line",
what it actually does is allow the . wildcard to match any character
including the newline.  Normally it matches any character except the
newline.  Again, you have no . in your pattern, so /s is irrelevant.
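
A small sketch of the one thing /s changes, using a made-up
three-character string in which a newline sits between "a" and "b":

```perl
use strict;
use warnings;

# Hypothetical test string: 'a', newline, 'b'.
my $str = "a\nb";

# By default . matches any character except a newline, so this fails.
my $without_s = ($str =~ /a.b/)  ? 1 : 0;

# With /s, . matches the newline too, so this succeeds.
my $with_s    = ($str =~ /a.b/s) ? 1 : 0;

print "$without_s\n";   # prints 0
print "$with_s\n";      # prints 1
```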

> Can someone demystify this for me ?
> Is my assumption that regex will stop after encountering first newline is
> applicable only when dot* type of regex is used ?

Ah.  Now I understand your confusion.  It is not the regexp that stops
matching.  It is the . wildcard.  The . does not match a newline
character, unless you provide the /s modifier.  Therefore, the string
"onex\ntwox" will match /o(.*)x/ by setting $1 to 'ne'.  This is what
you've interpreted as "stopping after the first newline".  The regexp
engine didn't stop.  It's just that the . ran out of sequential
characters that it could match.  If you add the /s modifier, then $1
will become "nex\ntwo", because now the . wildcard will match the
newline.
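
The "onex\ntwox" case can be checked directly.  A minimal sketch; the
variable names are only for illustration, and the newline is escaped
before printing so that the capture is visible on one line:

```perl
use strict;
use warnings;

my $str = "onex\ntwox";

# Default: .* cannot cross the newline, so it backtracks to the x on line one.
my ($default) = $str =~ /o(.*)x/;

# With /s: .* runs to the end of the string, then backs off to the last x.
my ($dotall)  = $str =~ /o(.*)x/s;

# Make the embedded newline visible when printing.
(my $shown = $dotall) =~ s/\n/\\n/g;

print "$default\n";   # prints ne
print "$shown\n";     # prints nex\ntwo
```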

For more info:
perldoc perlretut
perldoc perlre
perldoc perlreref

Hope this helps,
Paul Lalli


--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/



