OK, thanks all. I didn't realize the 'm' form was part of Perl regex syntax. I'm coming from Java regular expressions, where patterns aren't wrapped in '/' or 'm' delimiters.
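
For other readers coming from Java, here is a rough sketch of how the patterns quoted below might read in java.util.regex terms. It is only an illustration, not how SpamAssassin evaluates rules: the class name, method names, and sample strings are made up for this example, and it assumes that '=~' behaves like Matcher.find() (a match anywhere in the text) while '!~' is simply its negation.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; none of these names come from SpamAssassin.
final class PerlStyleRules {

    // Perl: m'/et/$'i
    // Java has no pattern delimiters, so '/' never needs escaping.
    // The trailing 'i' modifier corresponds to Pattern.CASE_INSENSITIVE.
    static final Pattern ENDS_WITH_ET =
            Pattern.compile("/et/$", Pattern.CASE_INSENSITIVE);

    // Perl: /\/\/\// and m'///' describe the same pattern: three literal slashes.
    static final Pattern THREE_SLASHES = Pattern.compile("///");

    // Perl: /\./ is a literal period. The backslash is doubled because it
    // sits inside a Java string literal.
    static final Pattern HAS_DOT = Pattern.compile("\\.");

    // Assumed equivalent of '=~': true when the pattern is found anywhere.
    static boolean hits(Pattern pattern, String text) {
        return pattern.matcher(text).find();
    }

    // Assumed equivalent of '!~': true when the pattern is NOT found.
    static boolean hitsNegated(Pattern pattern, String text) {
        return !hits(pattern, text);
    }

    public static void main(String[] args) {
        // Sample inputs are invented for illustration.
        System.out.println(hits(ENDS_WITH_ET, "http://example.com/ET/"));  // true
        System.out.println(hits(THREE_SLASHES, "file:///tmp"));            // true
        System.out.println(hitsNegated(HAS_DOT, "user@localhost"));        // true
        System.out.println(hitsNegated(HAS_DOT, "user@example.com"));      // false
    }
}
```

The last two calls mirror the `From:addr !~ /\./` rule: the negated form is true only for an address with no period in it, which is why the rule does not fire on every populated From address.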

Bowie Bailey wrote:
>
> On 6/16/2011 11:05 AM, raiden031 wrote:
>> So I'm trying to understand the spamAssassin rules, and I found a couple of
>> things that don't make sense about the rules I downloaded.
>
> SpamAssassin uses Perl regular expressions. For more info, look that up
> on Google and you should be able to find plenty of info.
>
>> 1) Some of the header, body, and uri rules have regular expressions that are
>> not enclosed in '/' (ie. /pattern/i ). Instead they are enclosed with 'm'
>> followed by another character.
>>
>> I have seen the following different ways to enclose a pattern:
>>
>> /pattern/i # As documented
>> m{pattern}i
>> m'pattern'i
>> m%pattern%i
>> m!pattern!i
>>
>> Example below:
>>
>> uri FU_END_ET m'/et/$'i
>>
>> Is this valid and if so, why is it being done?
>
> This is done to make a pattern more readable (mainly to avoid the
> necessity of escaping the '/' character).
>
> For example, both of these patterns will match a series of three slashes:
>
> /\/\/\//
> m'///'
>
> When you use the 'm' form, you can use almost any separator you want.
> The examples you gave above are some of the most common. You generally
> want to pick a separator character that does not appear in the expression.
>
> FYI: The 'i' at the end of the patterns above is a pattern modifier
> that makes the expression case-insensitive.
>
>> 2) I don't understand the use of the '!~' for header rules. The
>> documentation says '=~' means 'contains a regular expression' but '!~' means
>> 'does not contain a regular expression'. Yet in both cases, there is a
>> regular expression associated with the rule.
>>
>> header FH_FROMEML_NOTLD From:addr !~ /\./ [if-unset: f...@bar.com]
>> describe FH_FROM_EML_NOTLD E-mail address doesn't have TLD (.com, etc.)
>>
>> For instance, could someone explain how the above rule works? It looks like
>> to me it should hit whenever there is any From address populated regardless
>> of whether there's a .com or not.
>
> The '!~' form means that the rule matches when the regular expression
> does NOT match.
>
> For example:
>
> From:addr !~ /\./
>
> The above pattern will match on any from address that does NOT contain a
> period.
>
> --
> Bowie
>

--
View this message in context: http://old.nabble.com/Bizarre-rule-definitions-tp31861533p31862044.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.