Michael Hutchinson wrote:
> > body NICE_GIRL_01 /Hello! I am (?:bored|tired) (?:today|this
> > (?:afternoon|evening)|tonight)\./
>
> Forgive my ignorance, but what does the question mark and colon do at
> the start of the brackets? I have (bored|tired) in my own rules, so how
> does (?:bored|tired) affect the outcome?

Using (?: makes the group non-capturing, so no backreference is created
for it. It should be slightly faster when the backreference is not used.

    (?:bored|tired)

matches the same text as:

    (bored|tired)

but without creating a \1 or $1 reference to it. For a rule that only
tests whether the body matches, the outcome is the same either way.

SpamAssassin is written in Perl, so its rules use Perl regular
expressions. Those are not quite the same as standard POSIX Extended
Regular Expressions. For a full description see the 'perlre' man page.

man perlre

    "(?:pattern)"
    "(?imsx-imsx:pattern)"
        This is for clustering, not capturing; it groups subexpressions
        like "()", but doesn't make backreferences as "()" does. So

            @fields = split(/\b(?:a|b|c)\b/)

        is like

            @fields = split(/\b(a|b|c)\b/)

        but doesn't spit out extra fields. It's also cheaper not to
        capture characters if you don't need to.

        Any letters between "?" and ":" act as flags modifiers as with
        "(?imsx-imsx)". For example,

            /(?s-i:more.*than).*million/i

        is equivalent to the more verbose

            /(?:(?s-i)more.*than).*million/i

HTH,
Bob
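
To see the difference in practice, here is a minimal sketch. It is written in Python rather than Perl, purely so it is easy to run; Python's re module handles (?:...), split with capturing groups, and scoped flags such as (?s-i:...) the same way for these cases (scoped flags need Python 3.6 or later). The sample strings and variable names are made up for illustration and do not come from any real rule set.

```python
import re

# The rule's pattern, once with non-capturing groups and once with
# ordinary capturing groups.
NON_CAPTURING = r"Hello! I am (?:bored|tired) (?:today|this (?:afternoon|evening)|tonight)\."
CAPTURING = r"Hello! I am (bored|tired) (today|this (afternoon|evening)|tonight)\."

text = "Hello! I am bored this evening."  # hypothetical message body

m_nc = re.search(NON_CAPTURING, text)
m_c = re.search(CAPTURING, text)

# Both versions match exactly the same text...
same_match = m_nc.group(0) == m_c.group(0)
# ...but only the capturing version stores groups (\1, \2, \3).
nc_groups = m_nc.groups()
c_groups = m_c.groups()

# The split example from perlre: capturing groups add the delimiters
# to the result, non-capturing groups do not.
sample = "x a y b z"
plain_fields = re.split(r"\b(?:a|b|c)\b", sample)
extra_fields = re.split(r"\b(a|b|c)\b", sample)

# Scoped flags: inside the group "." matches newlines and matching is
# case-sensitive; the rest of the pattern stays case-insensitive.
flagged = re.compile(r"(?s-i:more.*than).*million", re.IGNORECASE)
hit = flagged.search("more\nthan a MILLION") is not None
miss = flagged.search("MORE than a million") is None

print(same_match)    # True
print(nc_groups)     # ()
print(c_groups)      # ('bored', 'this evening', 'evening')
print(plain_fields)  # ['x ', ' y ', ' z']
print(extra_fields)  # ['x ', 'a', ' y ', 'b', ' z']
print(hit, miss)     # True True
```

For a rule that only asks "does the body match?", both versions of the pattern behave identically; the non-capturing form just skips the bookkeeping.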