Re: Those "Re: good obfupills" spams

jdow Sat, 29 Apr 2006 20:46:47 -0700

From: "Matt Kettler" <[EMAIL PROTECTED]>

List Mail User wrote:

Matt Kettler replied:

John Tice wrote:

Greetings,
This is my first post after having lurked some. So, I'm getting these
same "RE: good" spams but they're hitting eight rules and typically
scoring between 30 and 40. I'm really unsophisticated compared to you
guys, and it begs the question: what am I doing wrong? All I use is a
tweaked user_prefs wherein I have gradually raised the scores on
standard rules found in spam that slips through over a period of time.
These particular spams are over the top on bayesian (1.0), have
multiple database hits, forged rcvd_helo and so forth. Bayesian alone
flags them for me. I'm trying to understand the reason you would not
want to have these types of rules set high enough? I must be way
over-optimized. What am I not getting?

BAYES_99, by definition, has a 1% false positive rate.


If we were to presume a uniform distribution between an estimate of
99% and 100%, then the FP rate would be .5%, not 1%.
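
The arithmetic behind that figure can be sketched as follows. This is a minimal illustration, assuming the Bayes estimate p is a calibrated probability of spam (so the chance of ham at a given p is 1 - p) and that p is uniform over the band; the function name is made up for this example and is not part of SpamAssassin.

```python
def expected_fp_rate(low, high=1.0):
    """Expected FP rate if the estimate p is uniform on [low, high].

    Assumption: p is a calibrated spam probability, so the ham
    probability at p is 1 - p. Averaging 1 - p over a uniform band
    gives 1 minus the band's midpoint.
    """
    midpoint = (low + high) / 2.0
    return 1.0 - midpoint


# BAYES_99 covers estimates from 0.99 to 1.00.
rate = expected_fp_rate(0.99)
print(f"{rate:.2%}")  # prints 0.50%
```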

You're right Paul, my bad..

But again, I don't care if it's 0.01%. The question here is "is jacking
up the score of BAYES_99 to be greater than required_hits a good idea".
The answer is "No, because BAYES_99 is NOT a 100% accurate test. By
definition it does have a non-zero FP rate."


I run AT 5.0. When I see my first false alarm solely from BAYES_99
I will reduce it slightly. I know what theory says. I also know that
BAYES_99 alone captures more spam than it has ever captured ham for
false imprisonment.

And for large sites
(i.e. tens of thousands of messages a day or more), this may be what occurs.
But what I see, and what I assume many other small sites see, is a very much
non-uniform distribution. Over the last 30 hours, the average estimate (i.e.
the value reported in the "bayes=xxx" clause) for spam hitting the BAYES_99
rule is .999941898013269, with about two thirds of them reporting bayes=1 and
a lowest value of bayes=0.998721756590216.

Yes, that's to be expected with Chi-Squared combining.

While SA is quite robust largely because of the design feature that
no single reason/cause/rule should by itself mark a message as spam, I have
to guess that the FP rate that the majority of users see for BAYES_99 is far
below 1%.  From the estimators reported above, I would expect that I would
have seen a .003% FP rate for the last day plus a little, if only I received
100,000 or so spam messages to have been able to see it:).

True, but it's still not nearly zero. Even in the corpus testing, which
is run by "the best of the best" in SA administration and maintenance,
BAYES_99 matched 0.0396% of ham, or 21 out of 53,091 hams. (Based on
set-3 of SA 3.1.0)
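
As a quick check of that corpus figure, the percentage follows directly from the two counts quoted above. The variable names here are only illustrative.

```python
# Counts quoted above for BAYES_99 in the SA 3.1.0 set-3 corpus results.
ham_hits = 21        # ham messages that matched BAYES_99
ham_total = 53091    # total ham messages in the corpus

ham_hit_rate = ham_hits / ham_total
print(f"{ham_hit_rate:.4%}")  # prints 0.0396%

# The same rate expressed as "one ham in N".
print(round(ham_total / ham_hits))  # prints 2528
```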


And it is scored LESS than BAYES_95 by default. That's a clear signal
that the theory behind the scoring system is a little skewed and needs
some rethinking.

Given we are dealing with a user who doesn't even understand why you might
not want this set "high enough", I would expect the level of
sophistication in bayes maintenance

Besides.. If you want to make a mathematics based argument against me,
start by explaining how the perceptron mathematically is flawed. It
assigned the original score based on real-world data. Not our vast over
simplifications. You should have good reason to question its design
before second-guessing its scoring based on speculation such as this.


When it can give BAYES_99 a score LOWER than BAYES_95 it clearly has
a conceptual problem. (It also indicates that automatic Bayes filter
training has its own conceptual flaws.)

I don't change the scoring from the defaults, but if people were to
want to, maybe they could change the rules (or add a rule) for BAYES_99_99
which would take only scores higher than bayes=.9999 and which (again with
a uniform distribution) have an expected FP rate of .005%, then re-score
that rule closer to (but still below) the spam threshold,
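
A rough sketch of how such a narrower band would behave. BAYES_99_99 is the hypothetical rule proposed above, not a stock SpamAssassin rule; the helper names are invented for illustration, and the FP figure relies on the same uniform-distribution assumption.

```python
BAYES_99_LOW = 0.99        # lower bound of the stock BAYES_99 band
BAYES_99_99_LOW = 0.9999   # lower bound of the hypothetical narrower band


def rules_hit(p):
    """Return which of the two bands a Bayes estimate p falls into."""
    hits = []
    if p >= BAYES_99_LOW:
        hits.append("BAYES_99")
    if p > BAYES_99_99_LOW:
        hits.append("BAYES_99_99")  # hypothetical rule
    return hits


def uniform_fp(low, high=1.0):
    """Expected FP rate for p uniform on [low, high], taking 1 - p as P(ham)."""
    return 1.0 - (low + high) / 2.0


# The lowest and highest estimates reported earlier in the thread.
print(rules_hit(0.998721756590216))  # ['BAYES_99']
print(rules_hit(1.0))                # ['BAYES_99', 'BAYES_99_99']
print(f"{uniform_fp(BAYES_99_99_LOW):.3%}")  # prints 0.005%
```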


I'd agree.. However, the OP has already made BAYES_99 > required_hits.
Bad idea. Period.


5.0 is, admittedly, marginal. 6 or 7 is not a good idea. Not enough rules
exist that will pull it back down. (Thinking on that I suspect there are
some SARE rules that should lower the score slightly when they are not
hit.)

{^_^}
