Yo! I just thought some might be interested - here are my custom rules, and my custom scores. I get about 1 false positive per week, out of 2000 mails or so (mostly mailing lists).
> ok_languages bs ca cz da de en es fr gd it sv tr
>
> # I assume OE to always add a Message-Id: header
> meta vbi_FAKE_OE MSG_ID_ADDED_BY_MTA_3 && USER_AGENT_OE
> describe vbi_FAKE_OE mail apparently coming from OE but without Message-Id

Admittedly, this one almost never fires (11 out of >5000). But no false positives either.

> header vbi_BOGOFILTER X-Bogosity =~ /^Yes/
> describe vbi_BOGOFILTER mail was tagged as spam by bogofilter
>
> header vbi_SUBJECT_EXCL Subject =~ /!$/
> describe vbi_SUBJECT_EXCL Subject header ends in !

This one obviously gives the occasional false positive. It depends highly on the style of communication you normally have.

> body vbi_DEBIAN /Debian/
> describe vbi_DEBIAN Occurrence of the word 'Debian'

The Debian mailing list footers don't say Debian with the capital D, so this is a very good one for me.

> header vbi_FOXMAIL X-mailer =~ /FoxMail/
> describe vbi_FOXMAIL Compensate for various foxmail oddities

Mostly, the problem is that FoxMail Base64-encodes the mail body.

> header vbi_X_ROT exists:X-Rot
> describe vbi_X_ROT X-Rot header used to uniquely identify spam mails

As discussed recently - I have yet to see if this catches anything for me.

> # scores for my own tests
> score vbi_BOGOFILTER 3.0

With a 5M ham and 1M spam database, I find bogofilter is good enough that I may want to raise that to 4.0, and award a score of -0.5 or so to the mails bogofilter tags as non-spam.

> score vbi_DEBIAN -2.0
> score vbi_FAKE_OE 1.5
> score vbi_FOXMAIL -3.0
> score vbi_SUBJECT_EXCL 0.9

This one is probably too high.

> score vbi_X_ROT 4.0
>
> # modified scores for spamassassin tests

I arrive at these by looking at the false negatives and false positives only. No sophisticated analysis at all. I might add that I do not often receive HTML mail; also, I know only a few people at AOL/hotmail/yahoo.
> score BALANCE_FOR_LONG_20K -0.308 # -0.708
> score BALANCE_FOR_LONG_40K -0.023 # -0.123
> score CTYPE_JUST_HTML 1.207 # 0.407
> score DATE_IN_PAST_06_12 0.348 # 0.448
> score FROM_ENDS_IN_NUMS 0.693 # 0.893
> score FORGED_HOTMAIL_RCVD 2.479 # 1.479
> score FORGED_YAHOO_RCVD 2.252 # 1.352
> score HTML_50_70 1.105 # 0.305
> score HTML_FONT_COLOR_BLUE 0.605 # 0.205
> score HTML_FONT_COLOR_RED 0.715 # 0.315
> score HTML_FONT_FACE_ODD 1.025 # 0.325
> score HTML_WITH_BGCOLOR 1.417 # 0.317
> score LINES_OF_YELLING 0.412 # 0.212
> score MAILTO_WITH_SUBJ 0.919 # 0.419
> score MAILTO_WITH_SUBJ_REMOVE 1.101 # 0.601
> score MIME_EXCESSIVE_QP 1.477 # 0.977
> score MIME_LONG_LINE_QP 1.024 # 0.324
> score MISSING_MIMEOLE 0.801 # 0.501
> score NO_REAL_NAME 1.185 # 1.285
> score NOSPAM_INC -0.111 # -0.211
> score PGP_SIGNATURE_2 -2.708 # -0.708
> score PGP_SIGNATURE -2.506 # -0.506
> score PRIORITY_NO_NAME 1.123 # 1.023
> score RAZOR_CHECK 2.040 # 2.640
> score RCVD_IN_DSBL 3.050 # 3.250
> score RCVD_IN_OSIRUSOFT_COM 0.180 # 0.380
> score RCVD_IN_RELAYS_ORDB_ORG 0.410 # 0.610
> score RCVD_IN_RFCI 1.780 # 2.280
> score REMOVE_IN_QUOTES 0.603 # 0.403
> score REMOVE_PAGE 1.206 # 0.706
> score SUBJ_ALL_CAPS 0.983 # 0.483
> score SUBJECT_IS_LIST -0.317 # -0.217
> score SUPERLONG_LINE 0.059 # 0.009
> score X_LOOP -0.033 # -0.233
> score X_MAILING_LIST -0.102 # -0.302
> score X_OSIRU_DUL 0.320 # 0.620
> score X_OSIRU_DUL_FH 0.260 # 0.360
> score X_OSIRU_OPEN_RELAY 2.220 # 2.720
> score X_OSIRU_SPAM_SRC 2.030 # 2.730

One of the main problems was all the DNS blacklists - I really had more than just the occasional false positive there.

Oh, yes: I use the Debian-packaged spamassassin, version 2.43.

cheers
-- vbi

--
this email is protected by a digital signature: http://fortytwo.ch/gpg
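
For anyone wanting to see what the rescoring in the list above does in practice, here is a rough Python sketch. It is not SpamAssassin code: it only assumes that the scores of all tests that fire are added up and compared against a threshold, that the threshold is 5.0 (the usual `required_hits` default, which the post does not state), and that the number after each `#` is the score that was in effect before the change. The names `SCORES`, `THRESHOLD`, `total` and `is_spam` are made up for the illustration, and only a handful of the tests are included.

```python
# Illustrative only: a toy model of additive scoring, not SpamAssassin's API.
# test name -> (modified score, score from the trailing "#" comment)
SCORES = {
    "CTYPE_JUST_HTML":   (1.207, 0.407),
    "HTML_WITH_BGCOLOR": (1.417, 0.317),
    "SUBJ_ALL_CAPS":     (0.983, 0.483),
    "RCVD_IN_DSBL":      (3.050, 3.250),
    "X_OSIRU_SPAM_SRC":  (2.030, 2.730),
    "PGP_SIGNATURE":     (-2.506, -0.506),
}

THRESHOLD = 5.0  # assumed value of required_hits; not given in the post


def total(hits, modified=True):
    """Sum the scores of the tests that fired for one message."""
    column = 0 if modified else 1
    return round(sum(SCORES[name][column] for name in hits), 3)


def is_spam(hits, modified=True):
    """A message counts as spam once its total reaches the threshold."""
    return total(hits, modified) >= THRESHOLD


if __name__ == "__main__":
    # HTML-only mail with a shouting subject, relayed via a DSBL-listed host.
    html_mail = ["CTYPE_JUST_HTML", "HTML_WITH_BGCOLOR",
                 "SUBJ_ALL_CAPS", "RCVD_IN_DSBL"]
    # PGP-signed mail that happens to come through two blacklisted relays.
    signed_mail = ["RCVD_IN_DSBL", "X_OSIRU_SPAM_SRC", "PGP_SIGNATURE"]

    for label, hits in (("html", html_mail), ("signed", signed_mail)):
        print(label,
              "modified:", total(hits), is_spam(hits),
              "before:", total(hits, modified=False),
              is_spam(hits, modified=False))
```

Under these assumptions the HTML mail goes from 4.457 (not tagged) to 6.657 (tagged), while the signed mail from blacklisted relays drops from 5.474 (tagged) to 2.574 (not tagged), which matches the direction of the changes described: heavier HTML penalties, lighter DNS blacklist penalties, and a stronger bonus for PGP signatures.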