On Tue, Sep 02, 2003 at 08:53:50AM -0700, Ron Gilbert wrote:
> >>I read here that bayes is only turned on after
> >>it learned from at least 200 spams AND 200 hams.
> >>That number could be more. It only starts to be
> >>efficient after you got say 1000 of both.
>
> Can someone explain to me why SA won't start using bayes until it's seen
> several hundred messages? I found that to be very annoying and
> confusing when I first installed SA.
>
> My first experience with bayes was with POPFile and it used it from the
> first message. There was a day or so of getting a lot of FP's and FN's,
> but then it settled down and work perfectly.
>
> SA was just frustrating, and maybe that had more to do with the lack of
> feedback about things. I remember looking at the output of sa-learn and
> it would tell me how many spam/ham it had seen, but that number would
> often not increase with more learning. I assume that was because it had
> not seen anything new. Once again, more feedback would have been nice.
> Or just let it start filtering from the get go. Is there a downside to
> that?

Yep -- it's not reliable to do that unattended. Basically, most Bayesian
filters assume you'll be sitting down with them for the first few days,
dutifully relearning FPs and FNs. SA can't make that assumption, because
it may be installed site-wide, creating bayes dbs for each user
individually from auto-learn data. Given that the user may be
nontechnical, the admin may not have set up a feedback-to-learner
mechanism, etc., it's better to be conservative by default.

If you are running it for personal use and are happy to retrain on errors
during the initial phase, go ahead and change the "bayes_min_ham_num" and
"bayes_min_spam_num" settings as mentioned in the FAQ.

--j.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
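
For the personal-use case described in the reply, lowering the two
thresholds might look roughly like the fragment below. This is a sketch and
not part of the original message: the value 50 is an arbitrary illustration,
and putting the lines in a site-wide local.cf is an assumption about the
setup. The 200/200 figure quoted in the thread is the default being
overridden.

```
# Sketch only: example values, not a recommendation from the thread.
# Assumed location: the site-wide local.cf (adjust for your install).

# Start using Bayes scores after fewer learned messages than the
# default of 200 ham and 200 spam. 50 is a hypothetical value.
bayes_min_ham_num   50
bayes_min_spam_num  50
```

With thresholds this low, expect more FPs and FNs at first, so this only
makes sense if you are prepared to correct mistakes with sa-learn during the
initial phase, as the reply says.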