Michael Monnerie wrote:
> On Tuesday, 9 May 2006 17:14 Bowie Bailey wrote:
> > I've considered that, but it won't work in our setup. This box
> > scans our internal email as well as all of our customers' email.
> > Since we are in an entirely different line of business from our
> > customers, what we consider to be ham and spam will be quite
> > different from theirs. If I could train it on both sets, it might
> > work, but I don't have access to any of their emails for training.
>
> I believe that's a general mistake. I've got a server with many
> different domains, some people working with China, others with
> Brazil, many different languages, and so on. With site-wide Bayes
> which is only trained _by me_, I've not had a single complaint in
> years where Bayes was incorrect.

Hmm... If you are training Bayes, and all of your ham is in English,
then what does Bayes do with the Chinese ham your customers get?

> Real SPAM is really SPAM. For everybody. Those penis enlargements,
> viagra and drug ads, and false job offers are always SPAM. And if
> somebody wants to get that info about penis enlargement, he should
> just look in his SPAM folder, it's not getting deleted anyway.

True, spam is spam. It's the vast differences in ham that I am more
worried about. Our customers are salesmen for the most part, so they
are constantly sending and receiving marketing-type emails. For us,
marketing stuff is almost always considered spam. I think this would
cause a problem with false positives for our customers if I train
Bayes based on our idea of ham and spam.

> If you are sane and try to not make mistakes with Bayes, it works
> fantastic. I've got about 6,000 spam & ham, and every day I feed the
> new SPAM to Bayes for learning.
>
> Try it: keep some real SPAM, use site-wide Bayes without auto-learn.
> Feed at least 200 spam & ham to Bayes, and train it every day. You
> will be happy.

I might give it a try. But, then again, based on some testing I just
did, I might leave it the way it is. I'll include that info in a
separate thread.

--
Bowie
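
To make the false-positive concern above concrete, here is a rough
sketch in Python. It is a toy naive Bayes scorer with add-one
smoothing, not SpamAssassin's actual Bayes implementation, and every
message and name in it (our_spam, our_ham, customer_ham,
customer_mail) is made up for illustration. It only shows the
direction of the effect: a marketing-style message scores as spam
when the ham corpus contains no marketing mail, and scores lower once
some marketing-style ham is included in training.

```python
import math
from collections import Counter


def train(messages):
    """Count, per token, how many messages of one class contain it."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts, len(messages)


def spam_probability(text, spam, ham):
    """Toy naive Bayes score with add-one smoothing (illustrative only)."""
    spam_counts, n_spam = spam
    ham_counts, n_ham = ham
    log_odds = 0.0
    for token in set(text.lower().split()):
        p_spam = (spam_counts[token] + 1) / (n_spam + 2)
        p_ham = (ham_counts[token] + 1) / (n_ham + 2)
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical corpora: "our" mail is admin-style, customers do sales.
our_spam = [
    "special offer discount pricing today",
    "limited offer buy now discount",
    "cheap pills buy now",
]
our_ham = [
    "server backup finished without errors",
    "meeting moved to monday morning",
    "patch applied to mail server",
]
customer_ham = [
    "updated discount pricing for your account",
    "quarterly offer sheet for new dealers",
]

# A legitimate message one of the sales customers might receive.
customer_mail = "new discount pricing offer for your account"

# Trained only on our own idea of ham and spam.
score_ours = spam_probability(customer_mail, train(our_spam), train(our_ham))

# Trained with some customer ham included as well.
score_mixed = spam_probability(
    customer_mail, train(our_spam), train(our_ham + customer_ham)
)

print(f"trained on our mail only:  {score_ours:.2f}")
print(f"with customer ham as well: {score_mixed:.2f}")
```

With these made-up corpora the first score is well above the second.
Whether a real site-wide database behaves this way depends on how
much the customers' ham actually overlaps with what gets trained as
spam, which is the open question in this thread.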