The proper usage of the Bayes filter is very simple: feed it spam as spam and ham as ham. All of your mail. Don't worry about content that you think might be mis-learned: it will not be. Don't try to be smarter than the filter. The only exception is bounce messages: don't feed them at all.

Why not? I've found feeding backscatter as spam to be a wonderful way to discard third-party backscatter without dropping the legitimate notices regarding mail my customers actually sent (to the wrong place, mind you...).

If you feed third-party backscatter as spam, you teach the Bayes filter that standard bounce phrases like "...unable to deliver..." are spam indicators. If some day you get a real bounce for a legitimate mail you wrote, it may end up in the spam folder as a false positive. On the other hand, if you learn bounces as ham and the bounced mail was spam, you mistrain the filter on the spam words quoted in the bounce. Hence the advice not to learn bounces at all.
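
To make the rule concrete, here is a minimal sketch of how a training script might pick the sa-learn invocation for a message. The labels "spam", "ham" and "bounce" and the function name are my own assumptions for illustration; only `sa-learn --spam` and `sa-learn --ham` are the real commands.

```python
from typing import List, Optional


def sa_learn_command(label: str, path: str) -> Optional[List[str]]:
    """Return the sa-learn command for one message, or None to skip it.

    `label` is a hypothetical classification made by whoever sorts the
    mail: "spam", "ham" or "bounce".
    """
    if label == "bounce":
        # Bounces are never learned, neither as spam nor as ham.
        return None
    if label == "spam":
        return ["sa-learn", "--spam", path]
    if label == "ham":
        return ["sa-learn", "--ham", path]
    raise ValueError(f"unknown label: {label!r}")
```

The returned list can be handed to `subprocess.run` as-is; a `None` result means the message is left out of training entirely.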

For better backscatter handling, look into the VBounce plugin of SpamAssassin. This is the short description:

# If you use this, set up procmail or your mail app to spot the
# "ANY_BOUNCE_MESSAGE" rule hits in the X-Spam-Status line, and move
# messages that match that to a 'vbounce' folder.
#
# You should also add 'whitelist_bounce_relays' lines, describing the names of
# your own outgoing mail relays, like so:
#
#   whitelist_bounce_relays       dogma.boxhost.net
#
# This is used to 'rescue' legitimate bounce messages that were generated in
# response to mail you really *did* send.  If you don't do this, the
# "BOUNCE_MESSAGE" rule will not fire.  See 'perldoc VBounce.pm' for more
# details.


The plugin is activated via /etc/mail/spamassassin/v320.pre.
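
If you would rather do the sorting in a script than in procmail, the check described in the plugin comment could look roughly like this. It is only a sketch: the function names and the "INBOX" default are made up for the example, and it assumes the rule names appear in the X-Spam-Status header the way a stock SpamAssassin setup writes them.

```python
import email
import re

# Match the rule name as a whole word, so that a plain "BOUNCE_MESSAGE"
# hit or a longer rule name does not count.
ANY_BOUNCE = re.compile(r"\bANY_BOUNCE_MESSAGE\b")


def is_flagged_bounce(raw_message: str) -> bool:
    """True if X-Spam-Status lists the ANY_BOUNCE_MESSAGE rule."""
    msg = email.message_from_string(raw_message)
    # The header may be folded over several lines; the regex search
    # does not care about the embedded line breaks.
    status = str(msg.get("X-Spam-Status", ""))
    return ANY_BOUNCE.search(status) is not None


def target_folder(raw_message: str, default: str = "INBOX") -> str:
    """Pick the 'vbounce' folder for flagged bounces, else the default."""
    return "vbounce" if is_flagged_bounce(raw_message) else default
```

Messages that end up in 'vbounce' this way should then be kept away from sa-learn, in line with the advice above.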
