Dragoncrest wrote:
>
> >Though his friends and family getting scored sounds very possibly like
> >some Bayes corruption going on because of the false negative
> >autolearn(ing) -- not a good thing.
> >Granted though, as the scoring from friends and family was not posted,
> >Bayes may not have had anything to do with it.
>
> Hmm, would it be possible to set my filtering rules to default
> levels and let SA learn all over again? I've got no problems resetting
> things back to zero, redoing configs etc rather than trying to retrain

By "setting things back to zero," I assume you mean going back to the defaults. Probably not a bad idea. Setting scores too high, and so getting false positives, could have caused ham to be auto-learned as spam. But I wouldn't think this would lead to the false negatives, with spam being auto-learned as ham -- that is, I don't think that, in Bayes analysis, 'is spam' or 'is not spam' implies anything about 'is ham' or 'is not ham'. Though I suppose there could be some problems just because there would be more erroneous tokens going into Bayes' statistical analysis, and fewer non-erroneous tokens going into it -- every false negative means one less true positive going into the analysis.

Again though, in short: if Bayes learned stuff incorrectly, then that needs to be undone. If keeping it from happening again means going back to defaults, then you should do it. All my scores are the defaults, and my installation is working quite well, including Bayes. I've gotten so I only feed Bayes the spam that slips through with less than BAYES_99.

I assume you are auto-learning, and that you don't have a corpus saved up from which you can simply retrain. But if you do have one, you'll need to go through it and make sure that what you think is ham is ham, and likewise for spam. Then unlearn whatever false positives/negatives you find by manually feeding them to sa-learn with the correct ham/spam parameter.

Otherwise -- if you don't have the mail to relearn -- I suppose you'll have to delete your Bayes database and get on with auto-learning again, after resetting your scores. It's not that scores can't be tweaked, but you have to be careful.

Of course, tweaked scores being the problem, or part of the problem, is only an assumption. There's also been a report about an auto-learn bug -- just a couple of mails earlier -- and this could be part of the problem, with tweaked scores making it worse.
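
The manual correction step described above might be sketched roughly as follows. This is only an illustration, not a tested procedure: the function name `relearn_commands` and the file names are made up for the example, and the only sa-learn options assumed are `--ham` and `--spam`. It builds the command lines without running them, so they can be reviewed first.

```python
def relearn_commands(learned_as_spam_but_ham, learned_as_ham_but_spam):
    """Build sa-learn command lines that correct mislearned messages.

    learned_as_spam_but_ham: paths of real ham that Bayes learned as spam.
    learned_as_ham_but_spam: paths of real spam that Bayes learned as ham.
    Both arguments and this helper are hypothetical, for illustration only.
    """
    commands = []
    for path in learned_as_spam_but_ham:
        # Feed the false positive back with the correct (ham) parameter.
        commands.append(["sa-learn", "--ham", str(path)])
    for path in learned_as_ham_but_spam:
        # Feed the false negative back with the correct (spam) parameter.
        commands.append(["sa-learn", "--spam", str(path)])
    return commands


if __name__ == "__main__":
    # Example file names are placeholders, not real messages.
    for cmd in relearn_commands(["friend-note.eml"], ["missed-spam.eml"]):
        print(" ".join(cmd))
```

Each printed line could then be run by hand, or passed to `subprocess.run(cmd, check=True)`, once the sorted ham and spam have been double-checked.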

I don't auto-learn, so I can't report on any experience with it, but perhaps there's an argument for manually feeding Bayes there. If you do manually train, don't use someone else's corpus -- Bayes works best with the training population coming from what it will be seeing under normal operation -- someone else's ham may be what you would call spam, etcetera.

Bryan

> it. I'd rather start with a clean slate and then debug it from there as it
> tends to be much easier to break a habit when it's new than when it's
> fairly well entrenched.
>
> -------------------------------------------------------
> This SF.net email is sponsored by: IBM Linux Tutorials.
> Become an expert in LINUX or just sharpen your skills. Sign up for IBM's
> Free Linux Tutorials. Learn everything from the bash shell to sys admin.
> Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click

--
What is a poet? An unhappy man who hides deep anguish in his heart, but
whose lips are so formed that when the sigh and cry pass through them,
it sounds like lovely music.
- (Soren Kierkegaard - Either/Or)
http://www.wecs.com/content.htm

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk