On Fri, Mar 22, 2002 at 11:53:47AM -0800, Matthew Cline wrote:
> I recall reading that non-greedy regexps are more compute intensive than 
> greedy regexps, so this might cause a performance hit.

Yeah.  Basically, the difference is that greedy goes "let me match as much as I
can, then start backtracking to match the stuff after the ".+"".  Non-greedy
goes "let me match as little as possible and try to escape out as early as
possible".
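
To make the greedy vs. non-greedy difference concrete, here is a minimal
sketch.  The sample string and variable names are made up for illustration
and are not from the SpamAssassin rules:

```perl
use strict;
use warnings;

# Hypothetical sample input, invented for this illustration.
my $text = 'a <b>one</b> and <b>two</b> z';

# Greedy: .+ runs to the end of the string, then backtracks
# until the trailing </b> can match, so it stops at the LAST </b>.
my ($greedy) = $text =~ m{<b>(.+)</b>};

# Non-greedy: .+? grows one character at a time and tries </b>
# after each step, so it stops at the FIRST </b>.
my ($lazy) = $text =~ m{<b>(.+?)</b>};

print "greedy: $greedy\n";   # greedy: one</b> and <b>two
print "lazy:   $lazy\n";     # lazy:   one
```

Both forms find a match here; they differ in where the match ends and in how
many positions the engine has to try along the way.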

So if you have long strings where ".*?" will match a lot of stuff before
hitting the next part of the regexp, the process will be trying to escape a


will actually catch a good amount of comments.  Since the -- parts are
actually not used everywhere, the regexp I've used (still not 100%,
but it gets most things) is:

    s{
        <!                  # Comments start with <!
        ([^<>]|<[^<>]+>)*   # Remove anything in between, including
                            # the non-spec'ed included tags ...
        >                   # End of the comment.
    }{}gsx;                 # Replace with Nothing
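
As a rough check of what that substitution does, here is a small
self-contained sketch.  The input string and the $html / $stripped names are
assumptions for illustration only; the pattern itself is the one above:

```perl
use strict;
use warnings;

# Hypothetical sample input; the markup is made up for illustration.
my $html = 'before<!-- hidden <b>tag</b> text -->after<! odd >end';

# Copy $html into $stripped, then strip the comments from the copy.
(my $stripped = $html) =~ s{
    <!                  # Comments start with <!
    ([^<>]|<[^<>]+>)*   # Anything in between, including nested tags
    >                   # End of the comment
}{}gsx;

print "$stripped\n";    # beforeafterend
```

Note that the second comment has no "--" at all and still gets removed, which
is the point of not requiring the -- parts.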

Randomly Generated Tagline:
To Perl, or not to Perl, that is the kvetching.
              -- Larry Wall in <[EMAIL PROTECTED]>

Spamassassin-talk mailing list
