> -----Original Message-----
> Michael Hutchinson wrote:
> > > body NICE_GIRL_01 /Hello! I am (?:bored|tired) (?:today|this (?:afternoon|evening)|tonight)\./
> >
> > Forgive my ignorance, but what does the question mark and colon do at
> > the start of the brackets? I have (bored|tired) in my own rules, so how
> > does (?:bored|tired) affect the outcome?
>
> Using (?: avoids creating backreferences. It should be slightly
> faster if the backreference is not used.
>
>   (?:bored|tired)
>
> Is the same as:
>
>   (bored|tired)
>
> But without creating \1 or $1 reference to it.
>
> SpamAssassin is written in Perl and uses PCRE (Perl Compatible Regular
> Expressions). Those are not quite the same as standard Extended
> Regular Expressions. For a full description see the 'perlre' man page.
>
>   man perlre
>
>   "(?:pattern)"
>   "(?imsx-imsx:pattern)"
>       This is for clustering, not capturing; it groups
>       subexpressions like "()", but doesn't make
>       backreferences as "()" does. So
>
>           @fields = split(/\b(?:a|b|c)\b/)
>
>       is like
>
>           @fields = split(/\b(a|b|c)\b/)
>
>       but doesn't spit out extra fields. It's also cheaper
>       not to capture characters if you don't need to.
>
>       Any letters between "?" and ":" act as flags
>       modifiers as with "(?imsx-imsx)". For example,
>
>           /(?s-i:more.*than).*million/i
>
>       is equivalent to the more verbose
>
>           /(?:(?s-i)more.*than).*million/i
>
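For anyone following along, here is a rough sketch of the difference in plain Perl. SpamAssassin rules are Perl regular expressions, so a standalone script should behave the same way. The sample strings and variable names are made up for illustration and are not taken from any real rule set.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen to match the rule quoted above.
my $text = "Hello! I am bored tonight.";

# Capturing group: the alternative that matched is saved in $1.
my ($captured) = $text =~ /I am (bored|tired)/;
print "capturing group saved: $captured\n";        # bored

# Non-capturing group: it still groups the alternatives, but saves
# nothing, so the next capturing group is the one that becomes $1.
my ($first) = $text =~ /I am (?:bored|tired) (today|tonight)/;
print "first capture is now: $first\n";            # tonight

# split returns captured separators as extra fields;
# with (?: ... ) the separators are simply dropped.
my @with    = split /\s*(,|;)\s*/,   "a, b; c";
my @without = split /\s*(?:,|;)\s*/, "a, b; c";
print scalar(@with),    " fields with capture\n";    # 5
print scalar(@without), " fields without capture\n"; # 3
```

For a body rule that only needs a yes/no match, nothing ever reads $1, so switching (bored|tired) to (?:bored|tired) should not change what the rule hits. It only skips the bookkeeping for the capture.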
Yay, less overhead... <runs off to tweak rules up>.

Thanks for the pointers Bob, you've been a big help :)

Cheers,
Michael Hutchinson