Hello Matt,

Wednesday, October 15, 2003, 12:23:21 PM, you wrote:

MVG> Does the Bayesian filtering stop working if the
MVG> database becomes too lopsided?

It stops working if the database becomes corrupted, which
can happen in various ways. The first thing you should do is
look in your .spamassassin directory. You should have the
following files related to Bayes:

bayes_journal
bayes_msgcount
bayes_seen
bayes_toks


If you have any other files beginning with bayes_ but with
another extension (such as a lock file), that is a sign of
trouble.
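
If you want to check for stray files without eyeballing the
directory, something like the following shell function would
do it. This is only a sketch: the function name
unexpected_bayes_files is made up for illustration, and it
assumes the four file names listed above are the complete
expected set and that the directory defaults to
~/.spamassassin.

```shell
#!/bin/sh
# Print the name of every file in a Bayes directory that starts with
# "bayes_" but is not one of the four expected files.
# Assumption: the directory defaults to ~/.spamassassin; pass another
# path as the first argument to check a different location.
unexpected_bayes_files() {
    dir="${1:-$HOME/.spamassassin}"
    for f in "$dir"/bayes_*; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        name=$(basename "$f")
        case "$name" in
            bayes_journal|bayes_msgcount|bayes_seen|bayes_toks) ;;
            *) echo "$name" ;;
        esac
    done
}
```

Calling unexpected_bayes_files with no arguments checks the
default directory; no output means nothing unexpected was
found.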

I have tried various approaches to repairing that sort of
problem, but the only thing that really worked for me was to
delete all the bayes files and start over from scratch.

MVG> Is there a way to check how much ham and spam the
MVG> database has

Use this shell command:

spamassassin -D --lint

It will run a number of tests for Bayes and print a line
that tells you the ham/spam counts. On my server, this
currently reads:

debug: bayes corpus size: nspam = 7333, nham = 1281


If there is not enough spam or not enough ham, you will get
a specific error message telling you that.
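
If you only want the two numbers (for a cron job or a log,
say), you can filter the debug output. A rough sketch,
assuming the line keeps the same format as the one shown on
my server and that the debug output goes to stderr; the
function name corpus_counts is just something I made up:

```shell
#!/bin/sh
# Read debug output on stdin and print "NSPAM NHAM" for each
# "bayes corpus size" line. Prints nothing if no such line is present.
# Assumption: the line format matches
#   debug: bayes corpus size: nspam = 7333, nham = 1281
corpus_counts() {
    sed -n 's/.*bayes corpus size: nspam = \([0-9][0-9]*\), nham = \([0-9][0-9]*\).*/\1 \2/p'
}

# Usage (assumes debug output is written to stderr, hence 2>&1):
#   spamassassin -D --lint 2>&1 | corpus_counts
```

Empty output means the line was not found, which is itself
worth looking into.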

Bayes periodically expires old tokens on its own; I am not
sure, but I believe that it will NOT expire either ham or
spam if that would leave an insufficient corpus.  While the
accuracy of the database might suffer if there is an
imbalance, Bayes should continue to run as long as it has
the minimum required amount of each.

MVG> and what can I do to insure that the
MVG> Bayesian filtering continues to function.

Just monitor it. I've had problems in the past myself, and
nothing I found really explained either the source of the
problem or how to fix it.  I think the problem is that it's
possible (at least in versions 2.54/2.55) for the sa-learn
program to continue to run even after encountering an error
(such as a lock file problem), so that good data gets
overwritten with bad or incomplete data. I
don't think this is a frequent occurrence, but it may happen
from time to time under a heavy load of incoming mail, with
more than one process trying to write to the database at the
same time.  That's just a guess in any case; it comes mostly
from the fact that on my system I seem to see lock file
issues at the same time as problems with the Bayes database.
Those in turn could result from memory limitations on the
system -- I haven't had the problem since a server upgrade
that gave me more memory.

-Abigail


