Look at: http://useast.spamassassin.org/doc/Mail_SpamAssassin_Conf.html#learning%20options

bayes_ignore_header header_name

    If you receive mail filtered by upstream mail systems, like a
    spam-filtering ISP or mailing list, and that service adds new headers
    (as most of them do), these headers may provide inappropriate cues to
    the Bayesian classifier, allowing it to take a ``short cut''. To avoid
    this, list the headers using this setting. Example:

        bayes_ignore_header X-Upstream-Spamfilter
        bayes_ignore_header X-Upstream-SomethingElse

An example: http://www.stearns.org/doc/spamassassin-setup.current.html#autoreporting

--Larry

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:spamassassin-talk-
> [EMAIL PROTECTED] On Behalf Of Ross Vandegrift
> Sent: Monday, January 19, 2004 2:53 PM
> To: [EMAIL PROTECTED]
> Subject: [SAtalk] Bayes mis-learning problem
>
> Hey everyone,
>
> We're currently coping with a false-positive crisis that's
> sweeping our email with 2.60, mostly due to scores of the Bayes filter.
> We run SA site-wide on an incoming MX host, so individual users do not
> have access to train the Bayes database. Moreover, our primary client
> program is Pegasus Mail for DOS, which provides no real way to get raw
> messages out unmodified (it hoses CR/LF, forces line wraps, and cats
> MIME parts together).
>
> So I'm going through some of our Bayes tokens trying to decide
> if I should dump the current database and start over. I've noticed
> really bad things like this:
>
> 0.892 381 112 1069183901 HTo:[EMAIL PROTECTED]
> 0.905 75 19 1069183901 HTo:[EMAIL PROTECTED]
> 0.997 17 0 1069183901 HTo:[EMAIL PROTECTED]
>
> This looks really horrible! Just by virtue of my boss's email having a
> "To: [EMAIL PROTECTED]", it'll almost certainly be tagged as spam. The
> database is trained with nham=13685 and nspam=5652. Autolearning is
> enabled and has default thresholds.
>
> This is alarming at first. But when I think about it, and I realize
> that most of us get more spam than ham - Bayes is right.
> Unfortunately, that's really, really the wrong thing to do. Is there a
> way to exempt some headers from processing?
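
If it helps to see how widespread the problem is before deciding whether to dump the database, here is a minimal Python sketch that scans token lines like the three quoted above and lists the header tokens that lean heavily toward spam. It assumes the columns are probability, spam count, ham count, last-access time, and token, which is how the quoted lines read. The function names, the "HTo:" prefix default, and the 0.85 cutoff are illustrative choices only, not anything SpamAssassin defines.

```python
from typing import Iterable, List, NamedTuple


class BayesToken(NamedTuple):
    prob: float   # token's spam probability as printed in the dump
    nspam: int    # times seen in spam
    nham: int     # times seen in ham
    atime: int    # last-access time (epoch seconds)
    token: str    # token text, e.g. "HTo:..." for a To: header token


def parse_dump_line(line: str) -> BayesToken:
    """Parse one dump line; the column order is an assumption (see above)."""
    # Split on whitespace at most 4 times so a token containing spaces
    # stays in one piece. Raises ValueError on short or non-numeric lines.
    prob, nspam, nham, atime, token = line.split(None, 4)
    return BayesToken(float(prob), int(nspam), int(nham), int(atime), token.strip())


def spammy_header_tokens(lines: Iterable[str],
                         prefix: str = "HTo:",
                         min_prob: float = 0.85) -> List[BayesToken]:
    """Return tokens that start with `prefix` and score at least `min_prob`."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            tok = parse_dump_line(line)
        except ValueError:
            continue  # skip anything that is not a well-formed token line
        if tok.token.startswith(prefix) and tok.prob >= min_prob:
            hits.append(tok)
    return hits


# The three lines from the original message, addresses redacted as posted.
sample = [
    "0.892 381 112 1069183901 HTo:[EMAIL PROTECTED]",
    "0.905 75 19 1069183901 HTo:[EMAIL PROTECTED]",
    "0.997 17 0 1069183901 HTo:[EMAIL PROTECTED]",
]

for tok in spammy_header_tokens(sample):
    print(f"{tok.token}  prob={tok.prob:.3f}  spam={tok.nspam}  ham={tok.nham}")
```

To run it over a real database, feed it the lines of a saved token dump (for example from `sa-learn --dump`) instead of `sample`. If most of the hits are local recipient addresses, that points at the To: header as the culprit rather than the database as a whole.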