From: "Bart Schaefer" <[EMAIL PROTECTED]>
On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:
In SA 3.1.0 they did force-fix the scores of the bayes rules, particularly the high-end. The perceptron assigned BAYES_99 a score of 1.89 in the 3.1.0 mass-check run. The devs jacked it up to 3.50. That does make me wonder if: 1) When BAYES_9x FPs, it FPs in conjunction with lots of other rules due to the ham corpus being polluted with spam.
My recollection is that there was speculation that the BAYES_9x rules were scored "too low" not because they FP'd in conjunction with other rules, but because against the corpus they TRUE P'd in conjunction with lots of other rules, and that it therefore wasn't necessary for the perceptron to assign a high score to BAYES_9x in order to push the total over the 5.0 threshold.

The trouble with that is that users expect training on their personal spam flow to have a more significant effect on the scoring. I want to train bayes to compensate for the LACK of other rules matching, not just to give a final nudge when a bunch of others already hit.

I filed a bugzilla some while ago suggesting that the bayes percentage ought to be used to select a rule set, not to adjust the score as a component of a rule set.

<< jdow >>
There is one other gotcha. I bet vastly different scores are warranted for Bayes when run with per user training and rules as compared to global training and rules.
{^_^}
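
For anyone who wants to see the threshold arithmetic from this thread spelled out, here is a rough Python sketch. The two BAYES_99 scores (1.89 and 3.50) and the 5.0 threshold are the figures quoted above. Everything else is made up for illustration: the rule names HYPOTHETICAL_RULE_A and HYPOTHETICAL_RULE_B, their scores, and the helper functions are not real SpamAssassin rules or APIs, and treating "total >= threshold" as the spam test is an assumption of the sketch.

```python
# BAYES_99 scores and the threshold as quoted in the thread.
PERCEPTRON_BAYES_99 = 1.89  # score the perceptron assigned in the 3.1.0 run
RELEASED_BAYES_99 = 3.50    # score the devs set by hand
THRESHOLD = 5.0             # spam threshold mentioned in the thread

# Hypothetical rules and scores, invented purely for illustration.
OTHER_RULES = {
    "HYPOTHETICAL_RULE_A": 2.0,
    "HYPOTHETICAL_RULE_B": 1.6,
}


def make_scores(bayes_99_score):
    """Build a rule-name -> score table using the given BAYES_99 score."""
    scores = dict(OTHER_RULES)
    scores["BAYES_99"] = bayes_99_score
    return scores


def total_score(hits, scores):
    """Sum the scores of every rule that hit on a message."""
    return sum(scores[name] for name in hits)


def is_spam(hits, scores, threshold=THRESHOLD):
    """Assumed decision rule: spam when the total reaches the threshold."""
    return total_score(hits, scores) >= threshold


# A message where plenty of other rules hit alongside BAYES_99.
many_hits = ["HYPOTHETICAL_RULE_A", "HYPOTHETICAL_RULE_B", "BAYES_99"]
# A message where Bayes is nearly alone (the "LACK of other rules" case).
few_hits = ["HYPOTHETICAL_RULE_B", "BAYES_99"]

for label, bayes_score in (("perceptron", PERCEPTRON_BAYES_99),
                           ("released", RELEASED_BAYES_99)):
    scores = make_scores(bayes_score)
    print(label,
          "many hits:", is_spam(many_hits, scores),
          "few hits:", is_spam(few_hits, scores))
```

With these invented numbers, the message that hits several rules crosses the threshold under either BAYES_99 score, which is the "TRUE P'd in conjunction with lots of other rules" situation. The message where Bayes is nearly alone only crosses it with the higher, hand-set score.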