Re: SpamAssassins bayes mechanism and message headers

Jeff Mincy Wed, 18 Mar 2009 13:26:28 -0700

   From: Greg Troxel <g...@ir.bbn.com>
   Date: Wed, 18 Mar 2009 15:33:31 -0400

   Jeff Mincy <j...@delphioutpost.com> writes:

   >    From: Matt Kettler <mkettler...@verizon.net>
   >    Date: Tue, 17 Mar 2009 21:30:02 -0400
   >
   >    > shouldn't SpamAssassins bayes mechanism just ignore the complete
   >    > message header and just look at the body?
   >    > This seems useful in my opinion.
   >    It seems like a very misguided idea to me.
   >    
   >    Is there any reason to think headers make bad tokens?
   >    Do you have any test data showing this improves your bayes accuracy?
   >
   > Yes - I think some headers make extremely bad tokens for bayes, for
   > example the X-Mailer/User-Agent headers.   40% of the spam I get

   I think I'm having a similar problem, where I get spam via a
   mailinglist, and bayes gives the spam credit for having similar headers
   to the ham which arrives on the list.  I'm not so concerned about
   including the headers as they arrive at the list server, but all the
   headers added from receipt by the list server seem inappropriate.

   I'll try bayes_ignore_header.


Scanning mailing list email is more trouble than it's worth.  It can
be done, but you have to be very motivated, and it is a lot of work to
maybe catch a few mailing list spam messages.

Bayes needs to ignore any headers and any special footer tokens that
the mailing list adds to its postings.  You need to extend
trusted_networks to cover the mailing list server so that various tests
are done on the submitter instead of the mailing list.  DCC should be
whitelisted for most mailing lists since the email messages are bulk.
Any automatic reporting needs to be turned off.  I'm sure there are
other things that I'm forgetting.
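
As a rough sketch of the first two points only, something like the
following could go in local.cf.  bayes_ignore_header and
trusted_networks are standard SpamAssassin options, but the header
names and the address are assumptions for illustration; check what your
list server actually adds and which host it relays from.

```
# local.cf sketch; header names and the address are illustrative
# assumptions, not taken from any particular list.

# Keep headers added by the list server out of the Bayes tokens.
bayes_ignore_header List-Id
bayes_ignore_header List-Unsubscribe
bayes_ignore_header List-Post
bayes_ignore_header X-Mailman-Version

# Trust the list server's relay so that network tests are applied to
# the host that submitted the message to the list.
# 192.0.2.25 is a placeholder from the documentation address range.
trusted_networks 192.0.2.25
```

This does not cover footer tokens, DCC whitelisting, or reporting;
those still have to be handled separately.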

If the mailing list has reasonably good spam filtering then just skip
running SpamAssassin.

-jeff
