From: Greg Troxel <g...@ir.bbn.com>
Date: Wed, 18 Mar 2009 15:33:31 -0400

Jeff Mincy <j...@delphioutpost.com> writes:

> From: Matt Kettler <mkettler...@verizon.net>
> Date: Tue, 17 Mar 2009 21:30:02 -0400
>
> > shouldn't SpamAssassins bayes mechanism just ignore the complete
> > message header and just look at the body?
> > This seems useful in my opinion.
> It seems like a very misguided idea to me.
>
> Is there any reason to think headers make bad tokens?
> Do you have any test data showing this improves your bayes accuracy?
>
> Yes - I think some headers make extremely bad tokens for bayes, for
> example the X-Mailer/User-Agent headers. 40% of the spam I get

I think I'm having a similar problem, where I get spam via a mailing
list, and bayes gives the spam credit for having similar headers to the
ham which arrives on the list. I'm not so concerned about including the
headers as they arrive at the list server, but all the headers added
from receipt by the list server seem inappropriate. I'll try
bayes_ignore_header.

Scanning mailing list email is more trouble than it's worth. It can be
done, but you have to be very motivated, and it is a lot of work to
maybe catch a few mailing list spam messages.

Bayes needs to ignore any headers and any special footer tokens that
the mailing list adds to postings. You need to extend trusted_networks
to cover the mailing list server so that various tests are done on the
submitter instead of the mailing list. DCC should be whitelisted for
most mailing lists since the email messages are bulk. Any automatic
reporting needs to be turned off. I'm sure there are other things that
I'm forgetting.

If the mailing list has reasonably good spam filtering then just skip
running SpamAssassin.

-jeff
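
For reference, a rough local.cf sketch of the two settings named above
(bayes_ignore_header and trusted_networks). The header names are only
examples of what list software commonly adds, and the address is a
placeholder for the list server; both are assumptions, so check the
headers on your own list traffic before copying anything. DCC
whitelisting and any reporting scripts are set up outside this file and
are not shown.

```
# Keep Bayes from tokenizing headers added by the list software.
# One header name per line; these names are examples, not a complete set.
bayes_ignore_header List-Id
bayes_ignore_header List-Unsubscribe
bayes_ignore_header List-Post
bayes_ignore_header X-Mailing-List

# Treat the list server as a trusted relay so that network tests look
# at the original submitter. 192.0.2.10 is a placeholder address.
trusted_networks 192.0.2.10
```

Footer text that the list appends to the message body is not covered
by bayes_ignore_header, which applies to headers only.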