Running these against my corpus, I find:
WORDWORD -- 4212s/14h of 87289 corpus (70035s/17254h)
WORDWORD2 -- 4205s/12h of 87289 corpus (70035s/17254h)
Probably some stupid questions, but I'm having trouble finding documentation to explain proper Bayes Feeding Techniques:
Do I have to keep feeding Bayes ham as I feed it spam? If I have to keep feeding it ham, what ratio of ham/spam should I be feeding it? Does the ratio matter beyond the initial feeding to kick Bayes into action?
When picking ham to feed it, what kinds of things should I consider/avoid when trying to find enough ham? Do the messages have to come from off-site, or can they mostly be internal mail within the same domain or between domains hosted on the same server? What are the potential problems/benefits of using mailing list messages as ham?
Finally, if I am writing my own custom rules, how do I determine what score to give them? I see mentions of "running against the corpus" like the one above, but how do you DO that, and once you do, what exactly is it TELLING you?
TIA
--JR
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk