Re: Those "Re: good obfupills" spams (bayes scores)

jdow Sat, 29 Apr 2006 20:55:41 -0700

From: "Bart Schaefer" <[EMAIL PROTECTED]>


On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:

 In SA 3.1.0 they did force-fix the scores of the bayes rules,
particularly the high-end. The perceptron assigned BAYES_99 a score of
1.89 in the 3.1.0 mass-check run. The devs jacked it up to 3.50.

That does make me wonder if:
    1) When BAYES_9x FPs, it FPs in conjunction with lots of other rules
due to the ham corpus being polluted with spam.


My recollection is that there was speculation that the BAYES_9x rules
were scored "too low" not because they FP'd in conjunction with other
rules, but because against the corpus they TRUE P'd in conjunction
with lots of other rules, and that it therefore wasn't necessary for
the perceptron to assign a high score to BAYES_9x in order to push the
total over the 5.0 threshold.

The trouble with that is that users expect training on their personal
spam flow to have a more significant effect on the scoring.  I want to
train bayes to compensate for the LACK of other rules matching, not
just to give a final nudge when a bunch of others already hit.

I filed a bugzilla some while ago suggesting that the bayes percentage
ought to be used to select a rule set, not to adjust the score as a
component of a rule set.

<< jdow >> There is one other gotcha. I bet vastly different scores
are warranted for Bayes when run with per user training and rules
as compared to global training and rules.

{^_^}

Re: Those "Re: good obfupills" spams (bayes scores)

Reply via email to