The proper usage of the Bayes filter is very simple: feed spam as spam
and ham as ham. All of your mail. Don't worry about content that might be
mis-learned in your eyes: it will not be mis-learned. Don't try to be
smarter than the filter. The only exception is bounce messages: don't
feed them at all.
Why not? I've found that to be a wonderful way to discard third-party
backscatter without dropping the legitimate notices regarding mail my
customers actually sent (to the wrong place, mind you...)
If you feed third-party backscatter as spam, you teach the Bayes filter
that standard bounce texts like "...unable to deliver..." are spam
indicators. If some day you get a real bounce of a legitimate mail
you wrote, it may end up in the spam folder as a false positive (FP). On
the other hand, if you learn bounces as ham, you mistrain the filter on
the spam words whenever the mail that bounced was spam. Hence the advice
not to learn bounces at all.
For better backscatter handling, look into the VBounce plugin of
SpamAssassin. This is the short description:
# If you use this, set up procmail or your mail app to spot the
# "ANY_BOUNCE_MESSAGE" rule hits in the X-Spam-Status line, and move
# messages that match that to a 'vbounce' folder.
#
# You should also add 'whitelist_bounce_relays' lines, describing the names of
# your own outgoing mail relays, like so:
#
# whitelist_bounce_relays dogma.boxhost.net
#
# This is used to 'rescue' legitimate bounce messages that were generated in
# response to mail you really *did* send. If you don't do this, the
# "BOUNCE_MESSAGE" rule will not fire. See 'perldoc VBounce.pm' for more
# details.
It is activated via /etc/mail/spamassassin/v320.pre.
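As a rough sketch of what that setup can look like (the relay hostname is just the one from the description above, and the 'vbounce' folder name and mbox-style delivery are my assumptions; adjust to your site). Depending on your distribution, the loadplugin line may already be in v320.pre, possibly commented out:

```
# /etc/mail/spamassassin/v320.pre
loadplugin Mail::SpamAssassin::Plugin::VBounce

# /etc/mail/spamassassin/local.cf
# One line per outgoing relay of your own (example hostname only)
whitelist_bounce_relays dogma.boxhost.net
```

And a minimal procmail recipe, assuming SpamAssassin has already tagged the message earlier in your .procmailrc, that files anything hitting ANY_BOUNCE_MESSAGE into an mbox named 'vbounce':

```
:0:
* ^X-Spam-Status:.*ANY_BOUNCE_MESSAGE
vbounce
```

Keeping these in a separate folder rather than deleting them lets you check now and then that no real bounce slipped in, and it keeps them out of whatever you feed to sa-learn.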