Hello Matt,

Wednesday, October 15, 2003, 12:23:21 PM, you wrote:

MVG> Does the Bayesian filtering stop working if the
MVG> database becomes too lopsided?

It stops working if the database becomes corrupted, which can happen in various ways. The first thing you should do is look in your .spamassassin directory. You should have the following files related to Bayes:

  bayes_journal
  bayes_msgcount
  bayes_seen
  bayes_toks

If you have any other files beginning with bayes_ but with a different suffix (such as a lock file), that is a sign of trouble. I have tried various approaches to repairing that sort of problem, but the only thing that really worked for me was to delete all the bayes files and start over from scratch.

MVG> Is there a way to check how much ham and spam the
MVG> database has

Use this shell command:

  spamassassin -D --lint

It will run a number of tests for Bayes and print a debug line that tells you the ham/spam counts. On my server, this currently reads:

  debug: bayes corpus size: nspam = 7333, nham = 1281

If there is insufficient spam or ham, you will get a specific error message telling you so. Bayes periodically expires old tokens on its own; I am not sure, but I believe that it will NOT expire either ham or spam if that would leave an insufficient corpus. While the accuracy of the database might suffer if there is an imbalance, Bayes should continue to run as long as it has the minimum requisite of each.

MVG> and what can I do to insure that the
MVG> Bayesian filtering continues to function.

Just monitor it. I've had problems in the past myself, and nothing has really explained either the source of my problem or how to fix it. I think it's possible (at least in versions 2.54/2.55) for the sa-learn program to continue to run even after encountering an error (such as a lock file problem), so that good data can get overwritten with bad or incomplete data.
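
If you want to automate that monitoring, here is a rough Python sketch of the two checks described above: looking for stray bayes_ files and reading the corpus size line. The file names and the debug line format are the ones quoted in this message and may differ in other versions. The function names are made up for illustration, and the `minimum` default of 200 is an assumption about the required corpus size, so check it against your own configuration.

```python
import re
from pathlib import Path

# File names as listed in this message; other versions may use different ones.
EXPECTED = {"bayes_journal", "bayes_msgcount", "bayes_seen", "bayes_toks"}

# Matches the debug line quoted above, e.g.
# "debug: bayes corpus size: nspam = 7333, nham = 1281"
CORPUS_RE = re.compile(r"bayes corpus size: nspam = (\d+), nham = (\d+)")


def unexpected_bayes_files(directory):
    """Return sorted names of bayes_* files that are not in EXPECTED."""
    names = {p.name for p in Path(directory).glob("bayes_*")}
    return sorted(names - EXPECTED)


def corpus_size(lint_output):
    """Return (nspam, nham) parsed from debug output, or None if not found."""
    match = CORPUS_RE.search(lint_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def corpus_ok(lint_output, minimum=200):
    """True if both counts reach `minimum` (200 is an assumed threshold)."""
    size = corpus_size(lint_output)
    if size is None:
        return False
    nspam, nham = size
    return nspam >= minimum and nham >= minimum
```

To use it, save the output of the command above to a file (for example with `spamassassin -D --lint > lint.txt 2>&1`), pass the file's contents to `corpus_size`, and point `unexpected_bayes_files` at your .spamassassin directory. Anything it returns, such as a leftover lock file, is worth a closer look.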

I don't think this kind of overwriting happens often, but it may happen from time to time under a heavy load of incoming mail, with more than one process trying to write to the database at the same time. That's just a guess in any case, based mostly on the fact that on my system I have seen lock file issues at the same times I have had problems with the Bayes database. Those in turn could result from memory limitations on the system: I haven't had the problem since a server upgrade that gave me more memory.

-Abigail

_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk