I just set up the Bayesian component of SpamAssassin (version 2.55). The sa-learn man page states that "Autolearning is enabled by default", but it doesn't seem to learn from most of the e-mail I receive.
My 'spam' mail folder holds incoming mail that SA has given a score of 10 or higher. Mail can only end up in that folder after being processed by spamc:

    % grep '^From ' -c Mail/spam
    5
    % bin/sa255/bin/sa-learn --spam --mbox Mail/spam
    Learned from 5 messages.

As you can see, it had no record of any of the spams (that SpamAssassin itself identified) that were received in the last few hours. Often when I run sa-learn on the spam folder, it will learn "n-1" or so instead of the total message count, so it does seem to auto-learn a few, but I would estimate merely one out of ten. I've observed similar behavior by watching my bayes database timestamps when mail comes in.

There's no further information on auto-learning in the sa-learn or spamassassin man pages, or on the web page. What gives?

Also: are you aware that 80%-probable spam is assigned a significantly higher default score (5.3) than 99%-probable spam (4.0)? Genetic algorithm or no, that doesn't seem statistically healthy. If giving 90-99%-probable spam an EQUAL or higher score than 80-90%-probable spam receives is causing false positives, wouldn't that point to a flaw in the Bayes filtering theory or implementation?

Non sequitur test ideas: Presently the uppercase and HTML-tag percentages of a message are checked, but has anyone tried a rule to detect the HTML comment percentage? And how about an HTML_FONT_COLOR_WHITE? I've seen a lot of spams hiding non-spammy words in a <font color=ffffff> block.

Please be so kind as to Cc me on any replies.

/Jeremy

-- 
Jeremy M. Dolan <mailto:[EMAIL PROTECTED]> <http://jmd.us/>
PGP: 1024D/3C68A1BA 9470 210C A476 FFBB 6D11 0223 0D1C ABFC 3C68 A1BA
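
P.S. To make the two test ideas above concrete, here is a rough sketch in Python rather than as actual SpamAssassin rules. The function names and regexes are made up for illustration, and it assumes the message body is already available as decoded HTML text; none of this is taken from SpamAssassin itself.

```python
import re

# Hypothetical helpers for illustration only; these are NOT SpamAssassin rules.

# Matches a complete HTML comment, including ones that span several lines.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

# Matches a <font> tag whose color attribute is white (ffffff, #ffffff, or "white").
WHITE_FONT_RE = re.compile(
    r"<font\b[^>]*\bcolor\s*=\s*[\"']?\s*(?:#?ffffff|white)\b",
    re.IGNORECASE,
)


def html_comment_percentage(body: str) -> float:
    """Return the share of the body (0-100) taken up by HTML comments."""
    if not body:
        return 0.0
    comment_chars = sum(len(m.group(0)) for m in COMMENT_RE.finditer(body))
    return 100.0 * comment_chars / len(body)


def has_white_font(body: str) -> bool:
    """Return True if the body contains a <font> tag with a white color."""
    return WHITE_FONT_RE.search(body) is not None
```

A real rule would still need a score and a cutoff for the comment percentage, which would have to come from testing against a corpus; white text is also only suspicious when the background is white, which this sketch does not check.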