From: "Matt Kettler" <[EMAIL PROTECTED]>
Bart Schaefer wrote:
> On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:
>> Besides... if you want to make a mathematics-based argument against
>> me, start by explaining how the perceptron is mathematically flawed.
>> It assigned the original score based on real-world data.
>
> Did it? I thought the BAYES_* scores have been fixed values for a
> while now, to force the perceptron to adapt the other scores to fit.
Actually, you're right... I'm shocked and floored, but you're right.
In SA 3.1.0 they did force-fix the scores of the Bayes rules,
particularly at the high end. The perceptron assigned BAYES_99 a score
of 1.89 in the 3.1.0 mass-check run; the devs jacked it up to 3.50.
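For reference, those pinned values are just ordinary score lines in the
shipped 50_scores.cf, and a local.cf override uses the same syntax. A
minimal sketch (the stock file's exact per-scoreset values may differ
from what I show here):

    # A "score" line takes one value, or four (one per score set:
    # 0 = no net/no Bayes, 1 = net only, 2 = Bayes only, 3 = Bayes+net).
    # BAYES_* rules only fire when Bayes is enabled, hence the two
    # leading zeros:
    score BAYES_99 0 0 3.5 3.5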
That does make me wonder if:
1) When BAYES_9x FPs, it FPs in conjunction with lots of other rules,
due to the ham corpus being polluted with spam. This forces the
perceptron to attempt to compensate. (Pollution is always a problem,
since nobody is perfect, but it occurs to differing degrees.)
-or-
2) The perceptron is out of whack. (I highly doubt this, because the
perceptron generated the scores for 3.0.x and those were fine.)
-or-
3) The real-world FPs of BAYES_99 really do tend to cascade with other
rules in the 3.1.x ruleset, and the perceptron is correctly capping the
score. This could differ from 3.0.x due to changes in the rules, or
changes in ham patterns over time. (A tally over the mass-check logs,
sketched after this list, would show whether the cascading actually
happens.)
-or-
4) One of the corpus submitters has a poorly trained Bayes DB.
(Possible, but I doubt it.)
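Checking whether the BAYES_99 ham hits cascade with other rules (as #1
and #3 both predict) or stand alone is mostly counting. A rough Python
sketch; the parsing is an assumption on my part (I'm treating the hit
list as a comma-separated tests=... field, so adjust the regex to
whatever your mass-check logs actually contain):

    #!/usr/bin/env python3
    # Tally how often BAYES_99 fires alone vs. alongside other rules in
    # a mass-check ham log. Log format is assumed, not gospel: each line
    # is taken to carry its hits in a comma-separated "tests=..." field.
    import re
    import sys
    from collections import Counter

    HITS = re.compile(r"tests=([A-Z0-9_,]+)")

    def parse_hits(line):
        m = HITS.search(line)
        return m.group(1).split(",") if m else []

    co_hits = Counter()
    with open(sys.argv[1]) as log:
        for line in log:
            hits = parse_hits(line)
            if "BAYES_99" in hits:
                co_hits[len(hits) - 1] += 1  # rules besides BAYES_99

    for others, msgs in sorted(co_hits.items()):
        print(f"{msgs:6d} messages hit BAYES_99 plus {others} other rule(s)")

If the BAYES_99 ham hits mostly come with many other rules firing, the
capped score is at least self-consistent; telling pollution (#1) apart
from genuine cascading FPs (#3) still means eyeballing the messages.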
Looking at statistics-set3 for 3.0.x and 3.1.x, there was a slight
increase in ham hits for BAYES_99 and a slight decrease in spam hits:
3.0.x:
  OVERALL%     SPAM%    HAM%    S/O   RANK  SCORE  NAME
    43.515   89.3888  0.0335  1.000   0.83   1.89  BAYES_99

3.1.x:
  OVERALL%     SPAM%    HAM%    S/O   RANK  SCORE  NAME
    60.712   86.7351  0.0396  1.000   0.90   3.50  BAYES_99
Also worth considering: set3 of 3.0.x was much closer to a 50/50 mix of
spam/nonspam (48.7/51.3) than 3.1.0 was (nearly 70/30).
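As an aside, for anyone puzzling over the columns: as I read the
hit-frequencies output, S/O is the spam share of all hits. Assuming the
two percentages get equal weight (as with same-size corpora), it can be
approximated from the 3.1.x row:

    # Hedged sketch: approximating S/O from SPAM% and HAM%, assuming
    # equal weighting of the two corpora. The exact hit-frequencies
    # computation may differ.
    spam_pct, ham_pct = 86.7351, 0.0396   # 3.1.x BAYES_99 row
    print(round(spam_pct / (spam_pct + ham_pct), 4))  # ~0.9995 -> "1.000"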
What happens comes from the basic reality that Bayes and the other
rules are not orthogonal sets. So many other rules hit alongside
BAYES_95 and BAYES_99 that the perceptron artificially reduced the
goodness rating for those rules. The score needs some serious skewing
to catch the situations where 95 or 99 hits and very few other rules
do; those are the times the accuracy of Bayes is needed most. I've
found, here, that 5.0 is a suitable score; I suspect that if I were
more realistic, 4.9 would be closer. But I still remember learning of
the score bias and being floored by it when I noticed BAYES_99 on some
spams that leaked through with ONLY the 99 hit. I am speaking of
dozens of spams hit that way.
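The arithmetic of that leak is worth spelling out. With the default
required_score of 5.0, a spam that trips ONLY BAYES_99 (ignoring AWL
and friends) totals exactly whatever BAYES_99 scores; a toy sketch:

    # A message hitting only BAYES_99 scores exactly the BAYES_99
    # value, so it leaks whenever that value sits below the default
    # 5.0 threshold (required_score).
    REQUIRED = 5.0
    for bayes_99 in (1.89, 3.50, 4.90, 5.00):
        verdict = "tagged as spam" if bayes_99 >= REQUIRED else "leaks as ham"
        print(f"BAYES_99={bayes_99:.2f} -> total={bayes_99:.2f}, {verdict}")

Only at 5.0 does the lone hit cross the default threshold, which is
exactly the case described above.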
So far, over several years, I've found a few special cases that warrant
negative rules. That seems to be pulling the 99 rule's false-alarm rate
down to "I can't see it." (I have, however, been tempted to generate a
BAYES_99p5 rule and a BAYES_99p9 rule to fine-tune the scores up around
4.9 and 5.0; a sketch follows below.)
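For the curious, both ideas are plain config. A negative rule is just a
meta rule with a negative score, and the finer Bayes buckets would
reuse the same check_bayes eval the stock 23_bayes.cf definitions are
built on. A sketch only; the rule names, the List-Id pattern, and the
scores are all made up for illustration:

    # Hypothetical negative rule: rescue a known-good mailing list that
    # Bayes keeps flagging. Name and pattern are illustrative only.
    header __GOOD_LIST    List-Id =~ /good-list\.example\.com/
    meta   GOOD_LIST_HAM  (__GOOD_LIST && BAYES_99)
    score  GOOD_LIST_HAM  -3.0

    # Hypothetical finer buckets stacked on top of BAYES_99, assuming
    # the check_bayes eval from 23_bayes.cf. With BAYES_99 at 3.5, the
    # add-ons put the combined totals near 4.9 and 5.0.
    full   BAYES_99P5     eval:check_bayes('0.995', '0.999')
    score  BAYES_99P5     1.4
    full   BAYES_99P9     eval:check_bayes('0.999', '1.00')
    score  BAYES_99P9     1.5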
{^_^}