On Fri, Mar 22, 2002 at 11:53:47AM -0800, Matthew Cline wrote:
> I recall reading that non-greedy regexps are more compute intensive than
> greedy regexps, so this might cause a performance hit.
Yeah. Basically, the difference is that greedy goes "let me match as much as I
can, then start backtracking to match the stuff after the ".+"". Non-greedy
goes "let me match as little as possible and try to escape out as early as
possible."
So if you have long strings where ".*?" will match a lot of stuff before
hitting the next part of the regexp, the engine has to retry the rest of the
pattern after every character it takes, so it will be trying to escape a lot.
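
To make the difference concrete, here is a small sketch (the sub names and
the sample string are made up for illustration, they are not from
SpamAssassin) that runs the greedy and the non-greedy form against the same
input:

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.

# Greedy: .+ runs to the end of the string first, then backtracks
# until it finds the LAST "-->".
sub first_comment_greedy {
    my ($s) = @_;
    my ($m) = $s =~ /<!--(.+)-->/s;
    return $m;
}

# Non-greedy: .+? takes one character at a time and tries "-->" after
# each one, so it stops at the FIRST "-->".
sub first_comment_lazy {
    my ($s) = @_;
    my ($m) = $s =~ /<!--(.+?)-->/s;
    return $m;
}

my $html = 'a<!-- one -->b<!-- two -->c';

print "greedy: [", first_comment_greedy($html), "]\n";  # [ one -->b<!-- two ]
print "lazy:   [", first_comment_lazy($html),   "]\n";  # [ one ]
```

The greedy form swallows everything between the first opener and the last
closer, which is why the non-greedy form is the one you want for stripping
comments, even if it costs more on long matches.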
<!--.+?-->
will actually catch a good amount of comments. Since the -- parts are
actually not used everywhere, the regexp I've used (still not 100%,
but it gets most things) is:
$it =~ s{
    <!                      # Comments start with <!
    ([^<>]|<[^<>]+>)*       # Remove anything in between, including
                            # the non-spec'ed included tags ...
    >                       # End of the comment.
}{}gsx;                     # Replace with nothing
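
As a quick sanity check, the substitution can be wrapped in a sub and run
against a few strings. This is only a sketch: strip_comments and the sample
inputs are made up for the example, and the pattern is the one above,
unchanged.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution above, for illustration.
sub strip_comments {
    my ($it) = @_;
    $it =~ s{
        <!                      # Comments start with <!
        ([^<>]|<[^<>]+>)*       # Anything in between, including
                                # complete tags nested in the comment
        >                       # End of the comment.
    }{}gsx;
    return $it;
}

# A comment with tags inside it is removed as one unit.
print strip_comments('before<!-- a <b>bold</b> comment -->after'), "\n";  # beforeafter

# No "--" is required, so a bare <! ... > goes too.
print strip_comments('x<! no dashes >y'), "\n";                           # xy

# Anything else starting with <! is also removed, e.g. a DOCTYPE.
print strip_comments('x<!DOCTYPE html>y'), "\n";                          # xy
```

Note the last case: since the pattern only looks for <! and a closing >, it
removes declarations like DOCTYPE as well as comments, which is part of why
it is "not 100%".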