jdow wrote:
> From: "Bart Schaefer" <[EMAIL PROTECTED]>
> >
> > On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:
> > > In SA 3.1.0 they did force-fix the scores of the bayes rules,
> > > particularly the high-end. The perceptron assigned BAYES_99 a
> > > score of 1.89 in the 3.1.0 mass-check run. The devs jacked it up
> > > to 3.50.
> > >
> > > That does make me wonder if:
> > > 1) When BAYES_9x FPs, it FPs in conjunction with lots of
> > > other rules due to the ham corpus being polluted with spam.
> >
> > My recollection is that there was speculation that the BAYES_9x
> > rules were scored "too low" not because they FP'd in conjunction
> > with other rules, but because against the corpus they TRUE P'd in
> > conjunction with lots of other rules, and that it therefore wasn't
> > necessary for the perceptron to assign a high score to BAYES_9x in
> > order to push the total over the 5.0 threshold.
> >
> > The trouble with that is that users expect training on their
> > personal spam flow to have a more significant effect on the
> > scoring. I want to train bayes to compensate for the LACK of
> > other rules matching, not just to give a final nudge when a bunch
> > of others already hit.
> >
> > I filed a bugzilla some while ago suggesting that the bayes
> > percentage ought to be used to select a rule set, not to adjust
> > the score as a component of a rule set.
>
> There is one other gotcha. I bet vastly different scores are
> warranted for Bayes when run with per user training and rules as
> compared to global training and rules.

Ack! I missed the subject change on this thread prior to my last reply.
Sorry about the duplication.

I think it is also a matter of manual training vs autolearning. A Bayes
database that is consistently trained manually will be more accurate and
can support higher scores.

-- 
Bowie
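
To make the scoring argument in this thread concrete, here is a minimal Python sketch of how rule scores add up against the 5.0 threshold. It is not SpamAssassin code. The two BAYES_99 values (1.89 and 3.50) and the threshold come from the messages above; the other rule names and their scores are hypothetical, invented only for illustration.

```python
# Spam threshold mentioned in the thread.
THRESHOLD = 5.0

# BAYES_99 scores quoted in the thread for the 3.1.0 release.
BAYES_99_PERCEPTRON = 1.89  # value the perceptron assigned
BAYES_99_ADJUSTED = 3.50    # value the developers set by hand

# Hypothetical non-Bayes rules and scores (assumptions, not real SA rules).
OTHER_RULES = {
    "HYPOTHETICAL_RULE_A": 1.2,
    "HYPOTHETICAL_RULE_B": 1.0,
    "HYPOTHETICAL_RULE_C": 1.5,
}


def build_scores(bayes_99_score):
    """Return a rule -> score table using the given BAYES_99 score."""
    scores = dict(OTHER_RULES)
    scores["BAYES_99"] = bayes_99_score
    return scores


def total_score(hits, scores):
    """Sum the scores of every rule that hit on a message."""
    return sum(scores[name] for name in hits)


def is_spam(hits, scores, threshold=THRESHOLD):
    """A message counts as spam when its total reaches the threshold."""
    return total_score(hits, scores) >= threshold


# A message where BAYES_99 fires alongside many other rules.
MANY_HITS = ["BAYES_99", "HYPOTHETICAL_RULE_A",
             "HYPOTHETICAL_RULE_B", "HYPOTHETICAL_RULE_C"]

# A message where BAYES_99 fires but few other rules match.
FEW_HITS = ["BAYES_99", "HYPOTHETICAL_RULE_A", "HYPOTHETICAL_RULE_B"]

if __name__ == "__main__":
    for label, bayes in (("perceptron", BAYES_99_PERCEPTRON),
                         ("adjusted", BAYES_99_ADJUSTED)):
        scores = build_scores(bayes)
        for name, hits in (("many hits", MANY_HITS), ("few hits", FEW_HITS)):
            total = total_score(hits, scores)
            print(f"{label:10s} {name:9s} total={total:.2f} "
                  f"spam={is_spam(hits, scores)}")
```

With these made-up numbers, the message that hits many rules crosses 5.0 under either BAYES_99 score, which is the "final nudge" case Bart describes. The message with few other hits totals about 4.09 with the perceptron score and about 5.70 with the adjusted score, so only the higher Bayes score lets training compensate for the lack of other matching rules.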