On Jul 23, Tom Allison said:

> ($lineExtract, $_) = /(^\w+\.dat\|)(.+)$/o;
>
> (I like to add the regex option 'o' at the end to improve performance.)

This is a common misconception. The purpose and effects of the /o modifier (as well as the /s and /m modifiers) are unclear to a lot of Perl programmers out there. I'll make a PSA now and try to clear things up a bit.

The /o modifier tells the internal regex compiler that, after this regex has been compiled 'o'nce, it is never to be compiled again. "Well, what good is that?" you ask. For your average regex, there is absolutely no difference and no change in performance: /foo/ and /foo/o are identical. That includes the regex quoted above, which contains no variables (the trailing $ is an anchor). The place where the /o modifier matters is when there are variables inside the regex, e.g. /^$field: (.*)/. The /o modifier says that, after the regex has been compiled for the first time -- which means that all the variables in it have been interpolated -- *that* compiled regex will take the place of the regex with the variables in it. Any later changes to those variables will be ignored for the rest of the program. There's no way to reverse the effect of /o.
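Here is a minimal sketch of that freezing effect. The helper names grab_with_o and grab_without_o and the sample header lines are made up for illustration; only the /^$field: (.*)/ pattern comes from the text above.

```perl
use strict;
use warnings;

# Hypothetical helpers: pull the value of a named header out of a line.
sub grab_with_o {
    my ($field, $line) = @_;
    # /o: the pattern is compiled on the first call only;
    # $field is never interpolated again after that.
    return $line =~ /^$field: (.*)/o ? $1 : undef;
}

sub grab_without_o {
    my ($field, $line) = @_;
    # No /o: the pattern follows the current value of $field.
    return $line =~ /^$field: (.*)/ ? $1 : undef;
}

my $first  = grab_with_o('From', 'From: alice');    # pattern is now frozen as /^From: (.*)/
my $second = grab_with_o('To',   'To: bob');        # still looking for "From"
my $third  = grab_without_o('To', 'To: bob');       # looks for "To", as intended

print "with /o, From:    ", $first  // '(no match)', "\n";
print "with /o, To:      ", $second // '(no match)', "\n";
print "without /o, To:   ", $third  // '(no match)', "\n";
```

The second call fails to match even though the line plainly starts with "To: ", because the regex was locked in while $field still held "From".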

The /s modifier often carries the mnemonic "so you can match your string as though it were a single line", and that all of a sudden makes people think all sorts of crazy things. The /m modifier is often said to "make your regex match over multiple lines", and this too makes people do weird things. I've seen code like this:

  while (<FILE>) {
    print $1 if /Start(.*)End/ms;
  }

The person has a file with "Start" on one line and "End" on another, and they're not sure why their regex doesn't match the stuff in between. The reason is that the /m and /s modifiers change how the *regex* matches and nothing else. In the code above, you're still only reading one physical line of text from FILE at a time, so no single string ever contains both "Start" and "End".
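One way around this, sketched below, is to read the whole file into a single string before matching. The sample data and variable names are assumptions for illustration, and an in-memory filehandle stands in for the real FILE so the snippet runs on its own.

```perl
use strict;
use warnings;

# Made-up sample data standing in for the contents of the file.
my $data = "header\nStart alpha\nbeta\nEnd trailer\n";

# Line at a time: "Start" and "End" never appear in the same string,
# so the modifiers cannot help.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $line_hits = 0;
while (my $line = <$fh>) {
    $line_hits++ if $line =~ /Start(.*)End/ms;
}
close $fh;

# Whole file at once: undefine $/ locally so <$fh> returns everything.
open $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $content = do { local $/; <$fh> };
close $fh;

# Now /s matters, because . has to cross the newlines in between.
my ($between) = $content =~ /Start(.*)End/s;

print "line-by-line matches: $line_hits\n";
print "slurped match: [$between]\n" if defined $between;
```

Slurping is fine for small files; for large ones you would probably want to accumulate lines between the two markers instead.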

So what do /m and /s do? It's very simple. The only thing the /s modifier does is make the . metacharacter match newlines. That's all. If there's no . in your regex, there's no need for you to add an /s to your regex. The /m modifier makes ^ and $ match the beginning and end of "lines" -- that is, it makes ^ match after any newline in your string (as well as at the absolute beginning of your string), and it makes $ match before any newline in your string (as well as at the absolute end of the string).
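A small sketch of both modifiers on the same three-line string may help. The string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree";

# /s only changes what . can match.
my $dot_plain = $text =~ /one.two/  ? 1 : 0;   # . refuses the newline
my $dot_s     = $text =~ /one.two/s ? 1 : 0;   # . accepts the newline

# /m only changes where ^ and $ can match.
my $anchor_plain = $text =~ /^two$/  ? 1 : 0;  # ^ and $ only at the string's edges
my $anchor_m     = $text =~ /^two$/m ? 1 : 0;  # ^ and $ at every line's edges

# With /g, the difference shows up as how many "line starts" are found.
my @starts_plain = $text =~ /^(\w+)/g;
my @starts_m     = $text =~ /^(\w+)/mg;

print "dot: plain=$dot_plain, with /s=$dot_s\n";
print "anchors: plain=$anchor_plain, with /m=$anchor_m\n";
print "starts without /m: @starts_plain\n";
print "starts with /m:    @starts_m\n";
```

Without the modifiers the first two tests fail and only "one" is found as a line start; with them, both tests succeed and all three words are found.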

So there you have it. I can go into more detail about /o and regex compilation (and have before, probably on this list), but for now, what I've told you is all you need to know.

--
Jeff "japhy" Pinyan         %  How can we ever be the sold short or
RPI Acacia Brother #734     %  the cheated, we who for every service
http://japhy.perlmonk.org/  %  have long ago been overpaid?
http://www.perlmonks.org/   %    -- Meister Eckhart
