On Mon, 14 Jul 2014 13:24:10 -0600 Bob Proulx <b...@proulx.com> wrote:
> And since this appears to be at the global MTA stage in a milter
> that it will always be less effective globally than an
> individualized Bayes database.

Not necessarily. We have a giant Bayes database based on feedback from
our customers (it has tokens from about 3.6 million each of ham and
spam) and it gave a 99% likelihood of spam when I fed it
http://pastebin.com/Feete78K

The key is to have a rich corpus of hand-trained mail for Bayes. Having
individualized Bayes databases is much less important than most people
think; in our experience, most people agree on what's ham vs. what's
spam.

The real win for individualized Bayes databases comes from people
working in specialized fields where the jargon associated with the
field is a strong ham indicator. In other words, individualized Bayes
databases help quite a bit to detect ham, but don't help that much to
detect spam.

Regards,

David.