From: "John Rudd" <[EMAIL PROTECTED]>
On Jul 26, 2006, at 9:07 AM, Theo Van Dinter wrote:
On Wed, Jul 26, 2006 at 07:43:51AM -0700, John Rudd wrote:
When that score is developed, how is it decided that the scores have
settled? When "95% of the spam in the corpus got ranked 5 or
higher"? 80%? 100%? That's the comparison I'm looking for.
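
For what it's worth, the comparison being asked about (what share of
the corpus spam scores 5 or higher) is easy to sketch. This is only an
illustration: the (label, score) pairs below are made-up numbers, not
real corpus results, and rates_at_threshold is a hypothetical helper,
not anything shipped with SpamAssassin.

```python
# Hypothetical (label, score) pairs; NOT real corpus results.
corpus = [
    ("spam", 12.3),
    ("spam", 5.0),
    ("spam", 4.1),
    ("spam", 7.8),
    ("ham", 0.2),
    ("ham", 5.6),
    ("ham", -1.0),
]


def rates_at_threshold(messages, threshold=5.0):
    """Return (fraction of spam, fraction of ham) scoring >= threshold."""
    spam = [score for label, score in messages if label == "spam"]
    ham = [score for label, score in messages if label == "ham"]
    spam_caught = sum(score >= threshold for score in spam) / len(spam)
    ham_flagged = sum(score >= threshold for score in ham) / len(ham)
    return spam_caught, ham_flagged


spam_caught, ham_flagged = rates_at_threshold(corpus)
print(f"spam at or above threshold: {spam_caught:.0%}")
print(f"ham at or above threshold: {ham_flagged:.0%}")
```

The second number matters as much as the first: a score set that
catches more spam at 5 is no good if it also pushes more ham over 5.
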
It's a learning system, so it's done when it runs out of results to
learn from. ;)
I have some bits on the perceptron in my presentation from AC 2004:
http://people.apache.org/~felicity/AC2004/ check out page 29.
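
To make the "learning system" idea concrete, here is a toy textbook
perceptron that nudges per-rule scores until every training message
lands on the right side of a fixed threshold. It is a sketch under
loud assumptions: the rule names (RULE_A and so on), the training
data, and the step size are all invented for illustration, and this is
not the actual SpamAssassin score generator.

```python
# Hypothetical rule names and training data, for illustration only.
RULES = ["RULE_A", "RULE_B", "RULE_C"]
TRAINING = [
    ({"RULE_A", "RULE_B"}, 1),  # 1 = spam
    ({"RULE_A"}, 1),
    ({"RULE_C"}, 0),            # 0 = ham
    ({"RULE_B", "RULE_C"}, 0),
    (set(), 0),
]
THRESHOLD = 5.0


def total(scores, hits):
    """Sum the scores of the rules that hit on one message."""
    return sum(scores[rule] for rule in hits)


def train(training, rules, threshold, step=1.0, max_epochs=100):
    """Textbook perceptron updates against a fixed threshold."""
    scores = {rule: 0.0 for rule in rules}
    for _ in range(max_epochs):
        mistakes = 0
        for hits, is_spam in training:
            predicted = 1 if total(scores, hits) >= threshold else 0
            error = is_spam - predicted  # +1 missed spam, -1 flagged ham
            if error:
                mistakes += 1
                for rule in hits:
                    scores[rule] += step * error
        if mistakes == 0:  # a full pass with nothing left to correct
            break
    return scores


scores = train(TRAINING, RULES, THRESHOLD)
print(scores)
```

The loop stops once a full pass over the data produces no corrections,
or when it hits max_epochs if the data cannot be separated cleanly.
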
Looking at the STATISTICS* files in the rules directory may be useful
too, btw.
Hm. I have no such files in my rules directory. (I'm running 3.1.1)
Some 'ix distros have the SpamAssassin Tools in a separate package
file. (I'm not even sure FC4 has an SA Tools rpm. I don't see one
with a yum list. FC5 doesn't have them either.)
They are in the tar file that can be downloaded from the web site,
though. You want them. You also want the alternative sa-stats.pl
by Dallas Engelken. It currently seems to be hiding out at
http://www.rulesemporium.com/programs/sa-stats.txt. It's worth
the download. It breaks down individual rule scores nicely.
{^_^}