Yo!

I just thought some might be interested - here are my custom rules, and
my custom scores. I get about 1 false positive per week, out of 2000
mails or so (mostly mailing lists).

> ok_languages                  bs ca cz da de en es fr gd it sv tr
> 
> # I assume OE to always add a Message-Id: header
> meta vbi_FAKE_OE              MSG_ID_ADDED_BY_MTA_3 && USER_AGENT_OE
> describe vbi_FAKE_OE  mail apparently coming from OE but without Message-Id

Admittedly, this one almost never fires (11 out of >5000), but it has
produced no false positives either.

> header vbi_BOGOFILTER         X-Bogosity =~ /^Yes/
> describe vbi_BOGOFILTER               mail was tagged as spam by bogofilter
>
> header vbi_SUBJECT_EXCL               Subject =~ /!$/
> describe vbi_SUBJECT_EXCL     Subject header ends in !

This one obviously gives the occasional false positive. How often depends
heavily on the style of communication you normally have.

> body vbi_DEBIAN                       /Debian/
> describe vbi_DEBIAN           Occurrence of the word 'Debian'

The Debian mailing list footers don't say Debian with the capital D, so
this is a very good one for me.
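
To make the case-sensitivity point concrete, here is a small Python sketch.
For this simple pattern Python's re module behaves like the Perl regex in the
rule. The two sample strings and the helper name hits_debian are my own
illustration (assumptions), not taken from real mail or from SpamAssassin.

```python
import re

# Same pattern as the vbi_DEBIAN body rule; case-sensitive by default.
DEBIAN = re.compile(r"Debian")

# Illustrative strings (assumptions, not real mail).
list_footer = "To UNSUBSCRIBE, email to debian-user-request@lists.debian.org"
personal_mail = "I upgraded my Debian box yesterday."


def hits_debian(body: str) -> bool:
    """Return True if the body contains 'Debian' with a capital D."""
    return DEBIAN.search(body) is not None


print(hits_debian(list_footer))    # footer has only lowercase 'debian'
print(hits_debian(personal_mail))  # capital D is present
```

So a list mail only picks up the negative score when its actual content
mentions Debian, not merely because of the footer.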

> header vbi_FOXMAIL            X-mailer =~ /FoxMail/
> describe vbi_FOXMAIL          Compensate for various foxmail oddities

Mostly, the problem is that FoxMail Base64-encodes the mail body.

> header vbi_X_ROT              exists:X-Rot
> describe vbi_X_ROT    X-Rot header used to uniquely identify spam mails

As discussed recently - I have yet to see whether this catches anything for me.

> # scores for my own tests
> score vbi_BOGOFILTER          3.0

With a 5M ham and 1M spam database, I find bogofilter is good enough
that I may want to raise that to 4.0, and award a score of -0.5 or so to
mails that bogofilter tags as non-spam.
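
A minimal sketch of that two-sided scoring idea, written in Python rather
than SpamAssassin syntax. It assumes bogofilter marks non-spam with an
X-Bogosity value starting with "No" (check what your version actually
writes). The function name is hypothetical, and 4.0 / -0.5 are just the
numbers floated above, not a tested configuration.

```python
from typing import Optional

# Values floated in the text above; assumptions, not tuned results.
SPAM_SCORE = 4.0
HAM_SCORE = -0.5


def bogofilter_score(x_bogosity: Optional[str]) -> float:
    """Score contribution derived from the X-Bogosity header value."""
    if x_bogosity is None:
        return 0.0  # header missing: bogofilter did not tag this mail
    if x_bogosity.startswith("Yes"):
        return SPAM_SCORE  # mirrors the existing /^Yes/ rule
    if x_bogosity.startswith("No"):
        return HAM_SCORE  # assumed marker for non-spam
    return 0.0  # any other value contributes nothing
```

In SpamAssassin terms the second branch would be one more header rule next
to vbi_BOGOFILTER, with a negative score.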

> score vbi_DEBIAN              -2.0
> score vbi_FAKE_OE             1.5
> score vbi_FOXMAIL             -3.0
> score vbi_SUBJECT_EXCL                0.9

This one is probably too high.

> score vbi_X_ROT                       4.0
> 
> # modified scores for spamassassin tests

I arrive at these by looking at the false negatives and false positives
only. No sophisticated analysis at all.

I might add that I do not often receive HTML mail; also, I know only a few
people at AOL/hotmail/yahoo.

> score BALANCE_FOR_LONG_20K     -0.308  # -0.708
> score BALANCE_FOR_LONG_40K     -0.023  # -0.123
> score CTYPE_JUST_HTML          1.207   # 0.407
> score DATE_IN_PAST_06_12       0.348   # 0.448
> score FROM_ENDS_IN_NUMS        0.693   # 0.893
> score FORGED_HOTMAIL_RCVD      2.479   # 1.479
> score FORGED_YAHOO_RCVD        2.252   # 1.352
> score HTML_50_70               1.105   # 0.305
> score HTML_FONT_COLOR_BLUE     0.605   # 0.205
> score HTML_FONT_COLOR_RED      0.715   # 0.315
> score HTML_FONT_FACE_ODD       1.025   # 0.325
> score HTML_WITH_BGCOLOR        1.417   # 0.317
> score LINES_OF_YELLING         0.412   # 0.212
> score MAILTO_WITH_SUBJ         0.919   # 0.419
> score MAILTO_WITH_SUBJ_REMOVE  1.101   # 0.601
> score MIME_EXCESSIVE_QP        1.477   # 0.977
> score MIME_LONG_LINE_QP        1.024   # 0.324
> score MISSING_MIMEOLE          0.801   # 0.501
> score NO_REAL_NAME             1.185   # 1.285
> score NOSPAM_INC               -0.111  # -0.211
> score PGP_SIGNATURE_2          -2.708  # -0.708
> score PGP_SIGNATURE            -2.506  # -0.506
> score PRIORITY_NO_NAME         1.123   # 1.023
> score RAZOR_CHECK              2.040   # 2.640
> score RCVD_IN_DSBL             3.050   # 3.250
> score RCVD_IN_OSIRUSOFT_COM    0.180   # 0.380
> score RCVD_IN_RELAYS_ORDB_ORG  0.410   # 0.610
> score RCVD_IN_RFCI             1.780   # 2.280
> score REMOVE_IN_QUOTES         0.603   # 0.403
> score REMOVE_PAGE              1.206   # 0.706
> score SUBJ_ALL_CAPS            0.983   # 0.483
> score SUBJECT_IS_LIST          -0.317  # -0.217
> score SUPERLONG_LINE           0.059   # 0.009
> score X_LOOP                   -0.033  # -0.233
> score X_MAILING_LIST           -0.102  # -0.302
> score X_OSIRU_DUL              0.320   # 0.620
> score X_OSIRU_DUL_FH           0.260   # 0.360
> score X_OSIRU_OPEN_RELAY       2.220   # 2.720
> score X_OSIRU_SPAM_SRC         2.030   # 2.730

One of the main problems was the DNS blacklists - I really had more
than just the occasional false positive there.

Oh, yes: I use SpamAssassin 2.43, the version packaged by Debian.

cheers
-- vbi

-- 
this email is protected by a digital signature: http://fortytwo.ch/gpg
