From: "Jim Maul" <[EMAIL PROTECTED]>

Péntek Imre wrote:
Jim Maul wrote:
I've upped the scores on almost all bayes rules here because history has
shown them to be incredibly accurate.
Yes. BTW so far I've had no FPs, but I still get false negatives with the 3.5 score for BAYES_99, using this database:
[5816] dbg: bayes: corpus size: nspam = 2757, nham = 1403
Built from scratch by myself, and still growing.
With a database this large there's very little chance of a mistaken Bayesian score, but since I built this database from scratch I can also say the same holds for small Bayesian databases. So I will use a score of 5.1 for BAYES_99, and I still suggest using that in the SA distribution too. Thanks for helping me anyway.
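Raising a rule's score locally is a one-line change. A minimal sketch of what that might look like in a site's `local.cf` (the 5.1 value is the one proposed above; check your own FP tolerance before copying it):

```
# Hypothetical local.cf fragment: override the stock BAYES_99 score.
# 5.1 pushes a BAYES_99 hit past the default 5.0 threshold on its own.
score BAYES_99 5.1
```

Run `spamassassin --lint` after editing to confirm the config still parses.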


If you are getting false negatives with 3.5 then you need to find a way
to get more rules to hit.  My average spam score here is 16.1 which is
way over my 5.0 threshold.  The trick is to increase the distance
between your average spam and ham scores as much as possible and then
you can run with a higher spam threshold.  If you have spam not getting
tagged, you should work on getting more rules to trigger, not lower your
threshold.
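The "distance" idea above is easy to measure. A small sketch (with hypothetical score lists standing in for your own mail log) that computes the average spam score, average ham score, and the gap between them:

```python
# Sketch: quantify the spam/ham score separation the post describes.
# The score lists below are made-up examples; in practice you would
# pull them from your SpamAssassin headers or logs.
spam_scores = [16.1, 12.4, 21.0, 9.8]   # hypothetical scores of tagged spam
ham_scores = [-1.2, 0.3, 1.9, -4.0]     # hypothetical scores of clean mail

avg_spam = sum(spam_scores) / len(spam_scores)
avg_ham = sum(ham_scores) / len(ham_scores)
gap = avg_spam - avg_ham  # the bigger this is, the safer a higher threshold

print(f"avg spam {avg_spam:.2f}, avg ham {avg_ham:.2f}, gap {gap:.2f}")
```

A large gap means you can raise `required_score` with little FP risk; a small gap means you need more rules hitting, exactly as argued above.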

Are you using network tests, razor, surbl, add on rules from sare, etc?
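For reference, the network tests mentioned are enabled through plugins and config directives. A hedged sketch of the relevant `init.pre`/`local.cf` lines (exact plugin availability depends on your SpamAssassin version and whether razor-agents is installed):

```
# Hypothetical init.pre / local.cf fragment -- verify against your SA version
loadplugin Mail::SpamAssassin::Plugin::Razor2    # Vipul's Razor checksums
loadplugin Mail::SpamAssassin::Plugin::URIDNSBL  # SURBL-style URI blocklists
skip_rbl_checks 0                                # allow DNS blocklist lookups
use_razor2 1
```

SARE rulesets are separate add-on `.cf` files dropped into the site rules directory rather than config directives.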


<< Jim, if a rule has a history of hitting wrong once in 1000 or 10000
times the score should be moved up from what the perceptron shows modulo
your mail flow. At 1000 messages a day finding one or even two hams in
the spam folder because of a rule scored too high is not severely annoying.
You can discover it. You can fix it. This goes for a low volume email
system with per user rules and Bayes. For a largish ISP different rules
of thumb must apply. Still, a really REALLY good rule can score pretty
high before it reveals itself as a problem with false positives, and then
you have to lower the score a bit. BAYES_99 on a well trained system is one
such rule. Tweak scores gently until your tolerance for false positives
is exceeded. Then back off a bit, maybe even two notches.

{^_^}
