On January 09, 2004 01:48 pm, JR wrote:
> At 10:37 PM (-0800) 1/8/2004 (Thursday), Robert Menschel wrote:
> >Running these against my corpus, I find
> >WORDWORD -- 4212s/14h of 87289 corpus (70035s/17254h)
> >WORDWORD2 -- 4205s/12h of 87289 corpus (70035s/17254h)
>
> Probably some stupid questions, but I'm having trouble finding
> documentation to explain proper Bayes Feeding Techniques:
>
> Do I have to keep feeding Bayes ham as I feed it spam?

For best results, feed it all human-identified HAM and SPAM.

> If I have to keep
> feeding it ham, what ratio of ham/spam should I be feeding it? Does the
> ratio matter beyond the initial feeding to kick Bayes into action?

In general, you don't have to worry too much about the ratio, unless your
HAM/SPAM ratio is extremely lopsided. Just feed it as much as you can.

> When picking ham to feed it, what kinds of things should I consider/avoid
> when trying to find enough ham? Do the messages have to come from
> off-site, or can they mostly be internal mail between the same domain or
> between domains hosted on the same server? What are the potential
> problems/benefits of using mailing list messages as ham?

If you use SA to filter both internal and external mail, then yes, learn
from both. But if you only filter external mail through SA, then learning
from internal mail MAY be counterproductive.

> Finally, if I am writing my own custom rules, how do I determine what score
> to give them? I see mentions of "running against the corpus" like the one
> above, but how do you DO that, and once you do what exactly is it TELLING
> you?

I don't know for sure, but I think it has something to do with mass-check.

Pedro

-- 
Fresco's Discovery:
If you knew what you were doing you'd probably be bored.
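
As a rough illustration of what the corpus figures quoted at the top of this message seem to be saying: I read "4212s/14h of 87289 corpus (70035s/17254h)" as "the rule hit 4212 spam and 14 ham messages, out of a corpus of 70035 spam and 17254 ham". That reading is my assumption, and the hit_rates helper below is made up for this example; it is not how SpamAssassin itself assigns scores.

```python
def hit_rates(spam_hits, ham_hits, spam_total, ham_total):
    """Summarize how often a rule fired on spam versus ham.

    Hypothetical helper for illustration only; not part of SpamAssassin.
    """
    spam_rate = spam_hits / spam_total   # fraction of all spam the rule caught
    ham_rate = ham_hits / ham_total      # fraction of all ham it hit by mistake
    # Of every message the rule fired on, what share was spam?
    spam_share = spam_hits / (spam_hits + ham_hits)
    return spam_rate, ham_rate, spam_share


# Figures copied from the quoted corpus results.
results = {
    "WORDWORD": hit_rates(4212, 14, 70035, 17254),
    "WORDWORD2": hit_rates(4205, 12, 70035, 17254),
}

for name, (spam_rate, ham_rate, spam_share) in results.items():
    print(f"{name}: hits {spam_rate:.2%} of spam, {ham_rate:.3%} of ham, "
          f"{spam_share:.2%} of its hits are spam")
```

A rule that fires on a useful slice of spam while almost never touching ham is the kind that can carry a higher score. A rule with a noticeable ham hit rate probably deserves a low score, or a rewrite.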