Re: negative scores for spam

Jeff Mincy Fri, 20 Mar 2009 14:58:48 -0700

   From: Jesse Stroik <jstr...@ssec.wisc.edu>
   Date: Fri, 20 Mar 2009 16:14:39 -0500

   Hoover Chan wrote:
   > The threshold was set to 6.6 (cf. required=6.6). The message this was
   > attached to was very definitely junk. This kind of situation got me
   > curious about the whole thing where any positive spam score is set as
   > the threshold but seeing junk mail coming in with negative scores.

   You are getting negative scores for auto white list and for bayes_00. 
   It's a matter of taste and what you believe makes sense, but I don't 
   consider bayes to be all that accurate (since there are methods for 
   defeating bayes, poisoning bayes, etc).  As such, I don't allow Bayes to 
   assign negative scores or positive scores within a couple of points of 
   the threshold.  You can do so by assigning scores like this:

   score BAYES_00  0
   score BAYES_05  0
   score BAYES_20  0
   score BAYES_40  0

Yow.  The negative scoring bayes rules are extremely reliable when well
trained.  Ham messages are not trying to evade the filter.  Defeating
bayes with poison is mostly a myth.  The random garbage might work the
first time but not the second time as long as you are training these
messages as spam.  If you are getting lots of BAYES_00 hits on spam
then the problem is almost certainly incorrect training where spam
messages were incorrectly learned as ham.


   I also disable AWL since a lot of spam, especially the stuff most likely 
   to be tested against spamassassin, will likely use known good email 
   addresses from your domain as the "from" address.  This is fairly likely 
   to hit on the AWL.

Yow again.  AWL keys on the email address and the IP address together.
So a forged email address used in spam is not going to have the same
EMAIL+IP pair as legitimate email using the same email address.
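
To make the EMAIL+IP point concrete, here is a rough toy model in Python.
It is not SpamAssassin's code: the class name ToyAWL, the choice to key on
the first two octets of the IP, and the 0.5 pull toward the historical mean
are all assumptions made for illustration. It only shows that history
stored under one address+network pair has no effect on mail that forges
the same address from a different network.

```python
from collections import defaultdict


class ToyAWL:
    """Toy auto-whitelist keyed on (address, IP prefix). Hypothetical sketch."""

    def __init__(self, factor=0.5):
        # Assumed: how strongly a score is pulled toward the stored mean.
        self.factor = factor
        # key -> [sum of past scores, number of messages seen]
        self.history = defaultdict(lambda: [0.0, 0])

    @staticmethod
    def key(addr, ip):
        # Assumed: only the first two octets of an IPv4 address are used.
        prefix = ".".join(ip.split(".")[:2])
        return (addr.lower(), prefix)

    def adjust(self, addr, ip, score):
        """Return the adjusted score, then record the raw score."""
        entry = self.history[self.key(addr, ip)]
        total, count = entry
        adjusted = score
        if count:
            mean = total / count
            adjusted = score + (mean - score) * self.factor
        entry[0] += score
        entry[1] += 1
        return adjusted


awl = ToyAWL()

# Three legitimate messages from alice, all from the same network.
for _ in range(3):
    awl.adjust("alice@example.org", "192.0.2.10", -1.0)

# A borderline message from the same address and network is pulled down.
same_pair = awl.adjust("alice@example.org", "192.0.2.99", 4.0)

# Spam forging alice's address from another network has no history to hit.
forged = awl.adjust("alice@example.org", "203.0.113.7", 4.0)

print(same_pair, forged)
```

In this sketch the forged message keeps its raw score because its key has
never been seen before, which is the behaviour described above.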
   
   Again, it's just a matter of taste and it all depends on how you've set 
   up your scoring.  I'm pretty cautious to ensure there aren't false 
   positives as that would decrease the value of spamassassin greatly for 
   us, but I otherwise avoid AWL and Bayes negative scores.
   
   If you sent us a copy of the spam, we could test it and show you what 
   should be hitting.

Use pastebin instead.

-jeff
