On 20.02.2015 at 18:35, Kevin Miller wrote:
> When a fresh spam flood comes in, sometimes 50 or more of my users will get hit
> with the same message - just a different user in the To: line.  When one trains
> the bayes database, is there a significant difference between training on all
> 50+ or just grabbing a few of the messages and training on them?  Will bayes be
> more convinced of the spamminess of a particular message if it sees dozens
> rather than a couple?

Certainly, that's how Bayes works:

* split the message into tokens
* look in the database how often each token exists in "spam" or "ham"

http://en.wikipedia.org/wiki/Naive_Bayes_classifier
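
To make the two steps above concrete, here is a minimal Python sketch of a token-count database. It is only an illustration of the naive Bayes idea, not the code of any real spam filter: the names tokenize, TokenCounts, learn and token_spam_probability are made up for this example, the tokenizer is deliberately crude, and the add-one smoothing is an assumption of mine.

```python
import re
from collections import Counter


def tokenize(message):
    """Split a message into a set of lowercase tokens (crude, for illustration)."""
    return set(re.findall(r"[a-z0-9$']+", message.lower()))


class TokenCounts:
    """Hypothetical stand-in for a Bayes database: per-token spam/ham counts."""

    def __init__(self):
        self.spam_tokens = Counter()  # token -> number of spam messages containing it
        self.ham_tokens = Counter()   # token -> number of ham messages containing it
        self.n_spam = 0               # spam messages learned so far
        self.n_ham = 0                # ham messages learned so far

    def learn(self, message, is_spam):
        """Train on one message: count each distinct token once."""
        tokens = tokenize(message)
        if is_spam:
            self.spam_tokens.update(tokens)
            self.n_spam += 1
        else:
            self.ham_tokens.update(tokens)
            self.n_ham += 1

    def token_spam_probability(self, token):
        """Estimate how strongly a single token points to spam (0..1).

        Add-one smoothing keeps unseen tokens away from exactly 0 or 1;
        this smoothing choice is an assumption of the sketch.
        """
        p_in_spam = (self.spam_tokens[token] + 1) / (self.n_spam + 2)
        p_in_ham = (self.ham_tokens[token] + 1) / (self.n_ham + 2)
        return p_in_spam / (p_in_spam + p_in_ham)
```

Every call to learn() raises the counts for the tokens in that message, so training on all 50 copies moves the per-token estimates further than training on two of them. How much that matters in practice depends on the formula and thresholds your filter actually uses.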



