On 2011-10-24 01:12, Dave Funk wrote:

> Karsten's example is a clear win (efficiency) wise over Jakub's but
> it's also more restrictive. Because of the \b bounding on the
> outside, Karsten's rule will match "From: enlarge now <b...@ha.ya>"
> but not "From: enlargement now <b...@ha.ya>".
>
> That can be achieved by adding trailing character matches on those
> words that you want to be 'extendable'. EG:
>
>    header FOO  From:name =~ /\b(sex|free|trial|enlarge\w{0,5})\b/i
>
> This will match on "enlarge" "enlargement" "enlarged" etc and still
> keep the efficiency.
> Note that by using the 'word' match meta-character ('\w') rather than
> the generic wild-card match character ('.') you avoid back-tracking
> of the pattern-match engine (as well as putting a fixed size bounding
> on it).
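
To make the difference Dave describes concrete, here is a rough sketch in Python, whose "re" module handles \b, \w and {0,5} the same way Perl does for these patterns. STRICT is only my assumption of what the more restrictive rule looks like (the same word list without the \w{0,5} suffix), and the sample names are made up:

```python
import re

# Assumed stand-in for the stricter rule: same word list, no suffix allowed.
STRICT = re.compile(r"\b(sex|free|trial|enlarge)\b", re.IGNORECASE)
# The quoted rule: up to five extra word characters may follow "enlarge".
EXTENDABLE = re.compile(r"\b(sex|free|trial|enlarge\w{0,5})\b", re.IGNORECASE)


def hits(pattern, name):
    """Return True if the pattern matches anywhere in the display name."""
    return pattern.search(name) is not None


# Hypothetical From: display names, for illustration only.
names = ["enlarge now", "enlargement now", "enlarged", "enlargementsXYZ now"]

# Map each name to (matched by STRICT, matched by EXTENDABLE).
results = {n: (hits(STRICT, n), hits(EXTENDABLE, n)) for n in names}

for name, (strict_hit, extendable_hit) in results.items():
    print(f"{name!r}: strict={strict_hit} extendable={extendable_hit}")
```

The last name has seven word characters after "enlarge", so the {0,5} bound together with the trailing \b keeps even the extendable rule from matching it.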

As far as I know, an alternation that is only needed for grouping 
should start with "?:", which makes the group non-capturing and avoids 
(superfluous) memory usage:

header FOO  From:name =~ /\b(?:sex|free|trial|enlarge\w{0,5})\b/i

That prevents Perl from capturing ("remembering") the text that the 
group matched, which a rule like this never uses. With many rules 
scanning many mails using many alternations, that can make a 
significant difference in memory usage / performance.
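
A small sketch of what "?:" changes, again in Python rather than Perl (the "(?:...)" syntax is the same in both). It only shows that the group is no longer stored; it does not measure memory or speed, and the sample name is made up:

```python
import re

# Same alternation, once as a capturing group and once as a non-capturing one.
CAPTURING = re.compile(r"\b(sex|free|trial|enlarge\w{0,5})\b", re.IGNORECASE)
NON_CAPTURING = re.compile(r"\b(?:sex|free|trial|enlarge\w{0,5})\b", re.IGNORECASE)

name = "Enlargement now"  # hypothetical From: display name
m_cap = CAPTURING.search(name)
m_non = NON_CAPTURING.search(name)

# The capturing pattern defines one group and stores the word it matched.
print(CAPTURING.groups, m_cap.groups())
# The non-capturing pattern defines no groups, so nothing extra is stored.
print(NON_CAPTURING.groups, m_non.groups())
# Both patterns find exactly the same overall match.
print(m_cap.group(0) == m_non.group(0))
```

Since a header rule only cares whether the pattern matched at all, the captured word is never used, so dropping the capture does not change which mails the rule hits.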

Hope this helps.

Regards,

wolfgang
