>> Can anyone explain why deleting the database seemed to fix its
>> learning? Is it just a corrupt database or something I am liable to hit
>> again further down the track?

JM> It must have learned those Message-IDs before. It will not learn the same
JM> message ID twice.

OK, here's the deal. There is a file called bayes_toks where the learned tokens are stored. There is a file called bayes_seen, which sa-learn checks before learning to make sure it is learning something new.

Here's where things can go wrong: if something happens to corrupt the bayes_toks file, it could get lost. At least in some circumstances, sa-learn will overwrite an old (big) bayes_toks file with just the new stuff being learned in a given session. (This happened to me once when there was an out-of-memory problem, and once when for some reason sa-learn could not access one of the files -- I run a nightly cron job to feed new spam to sa-learn, and it is at this stage that the problems occurred.)

I was able to fix my problems by recovering the old bayes_toks file from my mirrored backup drive, REMOVING the other files from the directory (bayes_seen, etc.), and running

    sa-learn --build

Removing bayes_seen was absolutely essential to this process; otherwise Bayes would refuse to learn the new data.

I think there is a problem, because I now have an incomplete bayes_seen file, and so my bayes_toks file will end up relearning spam it already has. On the other hand, Bayes is working beautifully this way, perhaps because the corpus is simply large enough to absorb the possible duplication of tokens. (Also, if the same spam is being sent repeatedly over time, that in itself is an indication that it is particularly likely to be spam -- so it is possible that I have inadvertently created a system weighted more heavily by spam frequency and volume.)

Anyway -- if you aren't experiencing problems with the Bayes function, I wouldn't worry about the learning process. If you are, then you may be able to rectify things by removing the bayes_seen file and running a rebuild.
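
To illustrate the kind of safeguard a nightly job could add against the overwrite described above, here is a minimal Python sketch. It is not part of SpamAssassin. The function name guarded_learn, the learn callable, the ".bak" suffix, and the min_ratio threshold are all hypothetical choices for this example. The idea is only to copy bayes_toks aside before learning and to put the copy back if the learning step fails or leaves a much smaller file.

```python
import os
import shutil


def guarded_learn(toks_path, learn, min_ratio=0.5):
    """Run learn(), restoring toks_path if the run fails or shrinks the file.

    learn is any callable that performs the learning step (hypothetically, a
    wrapper around the nightly sa-learn run). min_ratio is an arbitrary
    threshold: the run is rejected if the file ends up smaller than
    min_ratio times its previous size.
    Returns True if the run's result was kept, False if it was rejected.
    """
    backup_path = toks_path + ".bak"  # hypothetical backup location
    had_db = os.path.exists(toks_path)
    old_size = os.path.getsize(toks_path) if had_db else 0
    if had_db:
        # Snapshot the token database before anything can overwrite it.
        shutil.copy2(toks_path, backup_path)

    try:
        learn()
        failed = False
    except Exception:
        # Covers errors such as running out of memory or an unreadable file.
        failed = True

    new_size = os.path.getsize(toks_path) if os.path.exists(toks_path) else 0
    shrank = had_db and new_size < old_size * min_ratio
    if had_db and (failed or shrank):
        # Put the pre-run snapshot back instead of keeping a truncated file.
        shutil.copy2(backup_path, toks_path)
        return False
    return not failed
```

A cron wrapper could call guarded_learn with the path to bayes_toks and a function that runs the learning step, and send a warning when it returns False. The size check is only a rough heuristic and assumes nothing else is writing to the file during the run.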

(I reported my issue as a bug, because I think that sa-learn should abort when it runs into problems rather than overwrite and destroy essential files.)

-Abigail

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk