Michael Monnerie wrote:
> On Tuesday, 9 May 2006 17:14 Bowie Bailey wrote:
> > I've considered that, but it won't work in our setup.  This box
> > scans our internal email as well as all of our customers' email.
> > Since we are in an entirely different line of business from our
> > customers, what we consider to be ham and spam will be quite
> > different from theirs. If I could train it on both sets, it might
> > work, but I don't have access to any of their emails for training.
> 
> I believe that's a general mistake. I've got a server with many
> different domains, some people working with China, others with Brazil,
> many different languages, and so on. With site-wide Bayes, which is
> trained only _by me_, I've not had a single complaint in years about
> Bayes being incorrect.

Hmm... If you are training Bayes, and all of your ham is in English,
then what does Bayes do with the Chinese ham your customers get?

> Real SPAM is really SPAM. For everybody. Those penis enlargement,
> Viagra and drug ads, and fake job offers are always SPAM. And if
> somebody wants that info about penis enlargement, he should
> just look in his SPAM folder; it's not getting deleted anyway.

True, spam is spam.  It's the vast differences in ham that I am more
worried about.  Our customers are salesmen for the most part, so they
are constantly sending and receiving marketing type emails.  For us,
marketing stuff is almost always considered spam.  I think this would
cause a problem with false positives for our customers if I train
Bayes based on our idea of ham and spam.

> If you are sane and try not to make mistakes with Bayes, it works
> fantastically. I've got about 6,000 spam & ham, and every day I feed
> the new SPAM to Bayes for learning.
> 
> Try it: keep some real SPAM, use site-wide bayes without auto-learn.
> Feed at least 200 spam & ham to bayes, and train it every day. You
> will be happy.
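
For reference, the setup you describe would look roughly like this in
local.cf. This is only a sketch: the database path and file mode are my
own placeholder assumptions, not anything from your message.

```
# Sketch of a site-wide Bayes setup (SpamAssassin local.cf).
# Path and file mode are assumed values; adjust for the local system.
use_bayes 1

# Manual training only, no auto-learn.
bayes_auto_learn 0

# One shared database for all users (assumed location).
bayes_path /var/spamassassin/bayes/bayes

# Assumed mode; every user that runs the scanner needs read/write access.
bayes_file_mode 0770

# Bayes is not used for scoring until it has learned this many messages
# of each type. 200 is the SpamAssassin default.
bayes_min_spam_num 200
bayes_min_ham_num 200
```

The daily training would then be sa-learn --spam and sa-learn --ham run
against hand-sorted folders, and sa-learn --dump magic shows how many
messages of each type have been learned so far.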

I might give it a try.  But, then again, based on some testing I just
did, I might leave it the way it is.  I'll include that info in a
separate thread.

-- 
Bowie
