On 20.02.2015 at 18:35, Kevin Miller wrote:
When a fresh spam flood comes in, sometimes 50 or more of my users will get hit with the same message, just with a different user in the To: line. When one trains the Bayes database, is there a significant difference between training on all 50+ or just grabbing a few of the messages and training on them? Will Bayes be more convinced of the spamminess of a particular message if it sees dozens rather than a couple?
Surely, that's how Bayes works:

* split the message into tokens
* look in the database for how often each token occurs in "spam" or "ham"

http://en.wikipedia.org/wiki/Naive_Bayes_classifier
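
As a rough illustration of the two steps above, here is a minimal Python sketch of a token-counting classifier. It is not SpamAssassin's actual implementation: the names TinyBayes, learn, token_spam_prob and spamminess_after are hypothetical, and the tokenizer and the add-one smoothing are my own assumptions. It only shows how the per-token estimate moves when the same spam is learned 2 times versus 50 times.

```python
import re
from collections import Counter


class TinyBayes:
    """Toy token counter; hypothetical, not SpamAssassin's real database."""

    def __init__(self):
        self.spam_tokens = Counter()  # number of spam messages containing each token
        self.ham_tokens = Counter()   # number of ham messages containing each token
        self.n_spam = 0               # spam messages learned so far
        self.n_ham = 0                # ham messages learned so far

    @staticmethod
    def tokenize(text):
        # Step 1: split the message into tokens (assumed: lowercase word-like runs).
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def learn(self, text, is_spam):
        tokens = self.tokenize(text)
        if is_spam:
            self.spam_tokens.update(tokens)
            self.n_spam += 1
        else:
            self.ham_tokens.update(tokens)
            self.n_ham += 1

    def token_spam_prob(self, token):
        # Step 2: look up how often the token was seen in spam and in ham.
        # Add-one smoothing (an assumption) keeps unseen tokens away from 0 and 1.
        p_in_spam = (self.spam_tokens[token] + 1) / (self.n_spam + 2)
        p_in_ham = (self.ham_tokens[token] + 1) / (self.n_ham + 2)
        return p_in_spam / (p_in_spam + p_in_ham)


def spamminess_after(copies):
    """Learn 10 ham messages plus `copies` near-identical spam, then score one token."""
    bayes = TinyBayes()
    for _ in range(10):
        bayes.learn("lunch meeting moved to friday, agenda attached", is_spam=False)
    for i in range(copies):
        # Same body each time, only the recipient differs, as in the flood described.
        bayes.learn(f"To: user{i}@example.com cheap pills special offer", is_spam=True)
    return bayes.token_spam_prob("pills")


few = spamminess_after(2)
many = spamminess_after(50)
print(f"trained on 2 copies:  {few:.3f}")
print(f"trained on 50 copies: {many:.3f}")
```

In this sketch the estimate for a flood token rises as more copies are learned, because the smoothing term matters less as the counts grow. How much difference it makes in a real installation depends on what is already in the database.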