On 6/16/2011 11:05 AM, raiden031 wrote:
> So I'm trying to understand the spamAssassin rules, and I found a couple of
> things that don't make sense about the rules I downloaded.

SpamAssassin rules use Perl regular expressions.  The perlre and perlop
man pages (or 'perldoc perlre') cover the syntax in detail.

> 1) Some of the header, body, and uri rules have regular expressions that are
> not enclosed in '/' (ie. /pattern/i ).  Instead they are enclosed with 'm'
> followed by another character.  
>
> I have seen the following different ways to enclose a pattern:
>
> /pattern/i   # As documented
> m{pattern}i
> m'pattern'i
> m%pattern%i
> m!pattern!i
>
> Example below:
>
> uri FU_END_ET m'/et/$'i
>
> Is this valid and if so, why is it being done?

This is done to make a pattern more readable (mainly to avoid the
necessity of escaping the '/' character).

For example, both of these patterns will match a series of three slashes:

    /\/\/\//
    m'///'

When you use the 'm' form, you can use almost any separator you want. 
The examples you gave above are some of the most common.  You generally
want to pick a separator character that does not appear in the expression.

FYI:  The 'i' at the end of the patterns above is a pattern modifier
that makes the expression case-insensitive.
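
If you want to convince yourself that the delimiters are interchangeable,
here is a small Perl sketch.  The sample URI is made up for illustration;
the pattern is the one from the FU_END_ET rule quoted above.  It only
shows how Perl itself treats the delimiters, not how SpamAssassin runs
its rules.

```perl
use strict;
use warnings;

# Hypothetical sample URI, chosen only for illustration.
my $uri = 'http://example.com/ET/';

# The same pattern written with each delimiter style.
my @results = (
    scalar($uri =~ /\/et\/$/i),   # '/' delimiter: inner slashes must be escaped
    scalar($uri =~ m{/et/$}i),
    scalar($uri =~ m'/et/$'i),
    scalar($uri =~ m%/et/$%i),
    scalar($uri =~ m!/et/$!i),
);

# Count how many of the variants matched.
my $matched = grep { $_ } @results;
print "variants that matched: $matched of ", scalar(@results), "\n";

# Without the 'i' modifier the uppercase "ET" no longer matches.
my $without_i = ($uri =~ m'/et/$') ? 1 : 0;
print "matches without 'i': ", ($without_i ? "yes" : "no"), "\n";
```

All five variants behave the same way, so the choice of delimiter comes
down to readability.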

> 2) I don't understand the use of the '!~' for header rules.  The
> documentation says '=~' means 'contains a regular expression' but '!~' means
> 'does not contain a regular expression'.  Yet in both cases, there is a
> regular expression associated with the rule.  
>
> header FH_FROMEML_NOTLD From:addr !~ /\./ [if-unset: f...@bar.com]
> describe FH_FROM_EML_NOTLD E-mail address doesn't have TLD (.com, etc.)
>
> For instance, could someone explain how the above rule works?  It looks like
> to me it should hit whenever there is any From address populated regardless
> of whether there's a .com or not.

The '!~' form means that the rule matches when the regular expression
does NOT match.

For example:

    From:addr !~ /\./

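The above rule will hit on any From address that does NOT contain a
period.  The [if-unset: ...] part supplies a value to test when the
header is missing, so the rule does not hit just because there is no
From address at all.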
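
To see the '!~' behaviour outside of SpamAssassin, here is a short Perl
sketch.  The two addresses are invented for illustration, and the
"rule hits" wording is just a label in this script, not SpamAssassin
output.

```perl
use strict;
use warnings;

# Hypothetical addresses, used only for illustration.
my @addrs = ('user@example.com', 'user@localhost');

my %hits;
for my $addr (@addrs) {
    # '!~' is true when the pattern does NOT match the string.
    $hits{$addr} = ($addr !~ /\./) ? 1 : 0;
    printf "%-20s %s\n", $addr, $hits{$addr} ? "rule hits" : "no hit";
}
```

Only the address without a period satisfies the '!~' test.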

-- 
Bowie
