> -----Original Message-----
> Michael Hutchinson wrote:
> > > body NICE_GIRL_01       /Hello! I am (?:bored|tired) (?:today|this (?:afternoon|evening)|tonight)\./
> >
> > Forgive my ignorance, but what do the question mark and colon do at
> > the start of the brackets? I have (bored|tired) in my own rules, so
> > how does (?:bored|tired) affect the outcome?
> 
> Using (?: avoids creating backreferences.  It should be slightly
> faster if the backreference is not used.
> 
>   (?:bored|tired)
> 
> Is the same as:
> 
>   (bored|tired)
> 
> But without creating a \1 or $1 reference to it.
> 
> SpamAssassin is written in Perl and uses Perl's own regular
> expressions (the syntax that PCRE, Perl Compatible Regular
> Expressions, imitates).  Those are not quite the same as standard
> Extended Regular Expressions.  For a full description see the
> 'perlre' man page.
> 
>   man perlre
> 
>        "(?:pattern)"
>        "(?imsx-imsx:pattern)"
>                  This is for clustering, not capturing; it groups
>                  subexpressions like "()", but doesn't make
>                  backreferences as "()" does.  So
> 
>                      @fields = split(/\b(?:a|b|c)\b/)
> 
>                  is like
> 
>                      @fields = split(/\b(a|b|c)\b/)
> 
>                  but doesn't spit out extra fields.  It's also cheaper
>                  not to capture characters if you don't need to.
> 
>                  Any letters between "?" and ":" act as flags
>                  modifiers as with "(?imsx-imsx)".  For example,
> 
>                      /(?s-i:more.*than).*million/i
> 
>                  is equivalent to the more verbose
> 
>                      /(?:(?s-i)more.*than).*million/i
> 

Yay, less overhead... <runs off to tweak rules up>. 

Thanks for the pointers Bob, you've been a big help :)
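
For anyone who wants to see the difference rather than take it on
faith, here is a minimal sketch. It is in Python rather than Perl, on
the assumption that Python's re module treats (?:...) and split() with
captures the same way for these simple cases. The helper names and the
sample strings are made up for illustration; only the rule pattern
comes from the thread.

```python
import re

# The pattern from the NICE_GIRL_01 rule, with non-capturing groups.
NON_CAPTURING = re.compile(
    r"Hello! I am (?:bored|tired) (?:today|this (?:afternoon|evening)|tonight)\."
)

# The same pattern with plain brackets, which do capture.
CAPTURING = re.compile(
    r"Hello! I am (bored|tired) (today|this (afternoon|evening)|tonight)\."
)

# Hypothetical message body used only for this demonstration.
SAMPLE = "Hello! I am bored this evening."


def captured_groups(pattern, text):
    """Return the tuple of captured groups, or None if there is no match."""
    match = pattern.search(text)
    return None if match is None else match.groups()


def split_fields(pattern, text):
    """Split text on pattern; captured delimiters show up as extra fields."""
    return re.split(pattern, text)


if __name__ == "__main__":
    # Both patterns match the same text...
    print(bool(NON_CAPTURING.search(SAMPLE)), bool(CAPTURING.search(SAMPLE)))
    # ...but only the second one stores what each group matched.
    print(captured_groups(NON_CAPTURING, SAMPLE))
    print(captured_groups(CAPTURING, SAMPLE))
    # The split() example from perlre: captures add the delimiters as fields.
    print(split_fields(r"\b(?:a|b|c)\b", "x a y b z"))
    print(split_fields(r"\b(a|b|c)\b", "x a y b z"))
```

Both patterns hit exactly the same messages, so a rule swapped from
(bored|tired) to (?:bored|tired) should fire on the same mail; the
only thing that changes is whether the engine keeps the matched
pieces around.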

Cheers,
Michael Hutchinson
