Hello Pedro,

Friday, January 9, 2004, 11:58:55 AM, you wrote:

>> Probably some stupid questions, but I'm having trouble finding
>> documentation to explain proper Bayes Feeding Techniques:
>>
>> Do I have to keep feeding Bayes ham as I feed it spam?

PS> For best results, feed all human-identified HAM and SPAM.

Agreed, with the exception of this list and any personal email that
discusses the technicalities of spam (e.g., if you discuss spam topics
such as the Banned CD, that email should NOT be learned as ham even
though to you it is).

>> If I have to keep
>> feeding it ham, what ratio of ham/spam should I be feeding it? Does the
>> ratio matter beyond the initial feeding to kick Bayes into action?

PS> In general, you don't have to worry too much about the ratio, unless your
PS> HAM/SPAM ratio is extremely weird. Just feed as much as you can.

Agreed. ham/spam = 1/20 is not too bad. I'm not so sure about 1/1000.

>> When picking ham to feed it, what kinds of things should I consider/avoid
>> when trying to find enough ham? Do the messages have to come from
>> off-site, or can they mostly be internal mail between the same domain or
>> between domains hosted on the same server? What are the potential
>> problems/benefits of using mailing list messages as ham?

PS> If you use SA to filter both internal and external mail, then yes, learn from
PS> both. But if you only filter external mail through SA, then learning from
PS> internal mail MAY be counterproductive.

The only thing I exclude is SA-Talk and similar emails. Everything else
goes through sa-learn, including email I receive through my email
client from accounts that don't have SA capabilities.

>> Finally, if I am writing my own custom rules, how do I determine what score
>> to give them? I see mentions of "running against the corpus" like the one
>> above, but how do you DO that, and once you do what exactly is it TELLING
>> you?

PS> dunno, something about mass-check?

My ideas, at least as of a month ago, are documented at
http://www.exit0.us/index.php/RM_RuleScoring

Bob Menschel

_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
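
The feeding advice in this thread can be sketched in Python. This is only a rough illustration under stated assumptions, not a tested workflow: the helper names (is_excluded_from_ham, ratio_looks_lopsided, sa_learn_command), the "spamassassin-talk" marker, and the 20:1 cutoff are all made up for the example, and the cutoff merely echoes the "1/20 is not too bad" remark above. The only SpamAssassin interface it relies on is sa-learn with its --ham and --spam options.

```python
import email

# Hypothetical markers for lists that discuss spam; adjust to your own mail.
EXCLUDED_LIST_MARKERS = ("spamassassin-talk",)


def is_excluded_from_ham(raw_message, markers=EXCLUDED_LIST_MARKERS):
    """Return True if a message looks like spam-discussion list traffic.

    Such mail is legitimate to the reader but full of spam vocabulary,
    so the thread suggests keeping it out of the ham corpus.
    """
    msg = email.message_from_bytes(raw_message)
    # Only a few headers are inspected; this choice is an assumption.
    header_names = ("List-Id", "Sender", "To", "Cc")
    haystack = " ".join(str(msg.get(name, "")) for name in header_names)
    haystack = haystack.lower()
    return any(marker in haystack for marker in markers)


def ratio_looks_lopsided(ham_count, spam_count, limit=20):
    """Return True if there are more than `limit` spams per ham.

    The default of 20 is an assumed cutoff, not a SpamAssassin rule.
    """
    return spam_count > ham_count * limit


def sa_learn_command(kind, path):
    """Build the argument list for one sa-learn run over a file or folder."""
    if kind not in ("ham", "spam"):
        raise ValueError("kind must be 'ham' or 'spam'")
    return ["sa-learn", "--" + kind, path]
```

A command built this way could be handed to subprocess.run(), for example subprocess.run(sa_learn_command("ham", "ham-folder/"), check=True), once the excluded messages have been moved out of the (hypothetical) ham-folder/ directory.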