On Fri, 2012-10-05 at 10:17 -0700, Cathryn Mataga wrote:
> Thanks for the comments. I'll see if I can cook something up here.
> Someone asked to see the
> actual messages.
>
> I collected 4 of these messages and put them at this link.
>
> http://www.mataga.net/mataga/spam.txt
>
Here's another version. This successfully recognises all four of your
examples and doesn't fire on any of my other spam test messages:

describe MG_TWOLETTER_OBFUSCATION Two letter obfuscation (X:X X :X))
header   MG_TWOLETTER_OBFUSCATION
         Subject =~ /[A-Z][:%~;^][A-Z]\s{0,1}[:%~;^][A-Z0-9]/
score    MG_TWOLETTER_OBFUSCATION 5.0

This rather longer regexp was wrapping when pasted into this reply, so I
split the line at 'Subject' for clarity.

Martin

PS: this may be a well-known trick, but I haven't seen it mentioned
here: current versions of grep will execute Perl regexes if you use the
-P option. So rather than writing a rule around a new /regex/ straight
away, you can rapidly debug the regex first, using grep to execute it
and command line editing to modify it:

    grep -P 'regex' corpus/testmessages*

Note that grep doesn't use the Perl delimiters ('/') round the regex.
Then, when the regex is more or less working, you can write the rule and
hammer it some more.
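
As a rough sketch of that workflow, here is the regex from the rule above run through grep. It assumes GNU grep built with -P support. The two Subject lines and the file names are made up for illustration and are not taken from the spam samples.

```shell
#!/bin/sh
# Hypothetical mini-corpus: both Subject lines are invented examples.
corpus=$(mktemp -d)
printf 'Subject: A:B :C special offer\n' > "$corpus/spam1.txt"
printf 'Subject: Re: Meeting moved to 10:30\n' > "$corpus/ham1.txt"

# Same pattern as in the rule, without the surrounding '/' delimiters.
regex='[A-Z][:%~;^][A-Z]\s{0,1}[:%~;^][A-Z0-9]'

# -l prints only the names of files containing at least one matching line.
hits=$(grep -lP "$regex" "$corpus"/*.txt)
echo "$hits"

# Remove the temporary corpus again.
rm -r "$corpus"
```

Only the made-up spam file should be listed. Bear in mind that grep tests every line of each file, whereas the header rule only looks at Subject, so a hit in a message body here would not necessarily mean the rule fires. Prefixing the pattern with '^Subject:.*' narrows the grep test to roughly the same text.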