I run a small ISP and have installed SpamAssassin to stop spam. It catches a lot of spam, and it's especially good at filtering out the worst, most offensive mail, but a good deal of spam still gets through the filter, even after a user's bayes db gets big enough for the bayes tests to kick in.
I've noticed that a lot of the spam that makes it to my inbox scores between 4 and 4.9 -- mail that has hit at least 5-10 rules, and that SA should be able to file as spam without worrying about a false positive, but doesn't.

The flaw, IMO, is the additive scoring. Sure, many of these rules, triggered in isolation, should only add 0.3 or 0.1 to the final score. But the probability that a message is spam should go sky high when, say, five substantially different 0.2 and 0.1 rules all come back positive for a single message. The statistics should bear this out as a useful test.

Without ditching the current scoring altogether in favor of a multiplicative model (a la bayes), what if there were a post-analysis scoring step that just took into account the total number of positive rules (or rule families, if there is such a division)? Instead of looking at each test as though it occurred in isolation, this would put all the tests into sharper context without throwing away a lot of scoring code.

I'm sure the perceptron could come up with a more accurate gradation, but I imagine it would look something like this:

  0 rules    - 0.0
  1 rule     - 0.0
  2 rules    - 0.0
  3 rules    - 0.0
  4 rules    - 1.0
  5 rules    - 2.0
  6 rules    - 3.0
  7-10 rules - 4.0
  10+ rules  - 5.0

Thoughts?

-tom
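
P.S. To make the idea concrete, here's a rough sketch of what I mean, in Python rather than SA's Perl internals -- the function names are made up and the thresholds are just the guessed-at gradation above, not anything the perceptron has actually fitted:

    # Sketch of a post-analysis step: count how many positive-scoring rules
    # fired and add a bonus on top of the usual additive score.

    def rule_count_bonus(positive_hits):
        """Map the number of positive rules hit to an extra score (hypothetical gradation)."""
        if positive_hits <= 3:
            return 0.0
        if positive_hits == 4:
            return 1.0
        if positive_hits == 5:
            return 2.0
        if positive_hits == 6:
            return 3.0
        if positive_hits <= 10:
            return 4.0
        return 5.0

    def adjusted_score(rule_scores):
        """Normal additive score plus a bonus based purely on how many rules fired."""
        base = sum(rule_scores.values())
        positive_hits = sum(1 for s in rule_scores.values() if s > 0)
        return base + rule_count_bonus(positive_hits)

    # Example: a message that trips seven small 0.2-point rules.
    hits = {"RULE_%d" % i: 0.2 for i in range(7)}   # adds up to only 1.4
    print(adjusted_score(hits))                     # about 5.4 -- now over the default threshold

The point of the example: seven weak hits that would normally total 1.4 end up over 5.0 once the count of independent positive rules is taken into account.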