Re: Bayes Autolearn Threshold - different scoring?

Kris Deugau 11 Mar 2005 20:08:32 -0000

[EMAIL PROTECTED] wrote:
> I'm sure that's the problem. Here's a different sample spam, minus
> the bayes score (which isn't counted on the autolearn body tests,
> correct?)


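Correct.  But keep in mind that the autolearn process actually uses a
different score set from the one shown in the report, so the per-rule
scores it counts are not the ones you see quoted here.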

>  2.2 RCVD_HELO_IP_MISMATCH  Received: HELO and IP do not match, but
> should

From scoreset 3 (2.178); autolearn will use set 1 (score: 0.618).

>  3.0 DATE_IN_FUTURE_12_24   Date: is 12 to 24 hours after Received:
> date

Set 1 score is 2.329.

>  1.2 RCVD_NUMERIC_HELO      Received: contains an IP address used for
> HELO

Set 1 score is 1.531.

>  2.7 FORGED_YAHOO_RCVD      'From' yahoo.com does not match
> 'Received' headers

Set 1 score is 2.174.

All together, that's well over the minimum 3 points from headers...  but
no body score.
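
To make the arithmetic concrete, here is a rough sketch in Python.  It
is not SpamAssassin code, only an illustration: the variable names are
made up, the scores are the set 1 values quoted above, and the two
3-point minimums are taken from the "3+3" figure mentioned in this
thread.  It also ignores the overall autolearn threshold.

```python
# Set 1 scores quoted in this thread for the header rules that hit.
set1_header_scores = {
    "RCVD_HELO_IP_MISMATCH": 0.618,
    "DATE_IN_FUTURE_12_24": 2.329,
    "RCVD_NUMERIC_HELO": 1.531,
    "FORGED_YAHOO_RCVD": 2.174,
}

# Assumption: both minimums are 3 points, per the "3+3" figure in this thread.
MIN_HEADER_POINTS = 3.0
MIN_BODY_POINTS = 3.0

header_points = sum(set1_header_scores.values())
body_points = 0.0  # no body rules hit on this message

# Both minimums have to be met; a large header total cannot make up
# for a missing body total.
passes_minimums = (
    header_points >= MIN_HEADER_POINTS and body_points >= MIN_BODY_POINTS
)

print(f"header points: {header_points:.3f}")
print(f"body points:   {body_points:.3f}")
print(f"passes 3+3 minimums: {passes_minimums}")
```

The header side clears its minimum with room to spare, but the check
still fails because the body side contributes nothing.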

> No body hits there... So basically, I'm getting what I want from the
> headers, and from what bayes already knows. How do I tweak the
> thresholds that the autolearner uses, for example, either setting the
> body threshold to 0 or eliminating that check entirely?

Hack the code.  There's no option I've heard of, and nothing noted in
the man page IIRC to allow that.

> I realize this might produce
> unwanted results, so I'd probably give it a week or so initial
> experiment.

I don't know how the current setup was decided on, but I'd imagine that
other methods have been tried - for general use, the 3+3 minimum in the
distributed SA is probably ideal.  For some specific mail streams
(yours, perhaps?)  this may not be optimal and may need to be tweaked.

-kgd
-- 
Get your mouse off of there!  You don't know where that email has been!
