On Tuesday, 9 May 2006 23:01 Bowie Bailey wrote:
> Hmm... If you are training Bayes, and all of your ham is in English,
> then what does Bayes do with the Chinese ham your customers get?

Nothing. But Bayes won't flag an e-mail as spam if it is in Chinese and 
you have never fed it any Chinese-language e-mail. So no false 
positives (FPs).
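
A toy sketch of why tokens Bayes has never seen produce no verdict. This 
is not SpamAssassin's actual Bayes code: the class `ToyBayes`, its 
methods, the plain averaging of token ratios, and the neutral value 0.5 
are all assumptions made for illustration.

```python
from collections import Counter


class ToyBayes:
    """Toy token-count classifier; NOT SpamAssassin's real Bayes code."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()

    def learn(self, text, is_spam):
        # Count each distinct token once per message.
        target = self.spam_tokens if is_spam else self.ham_tokens
        target.update(set(text.lower().split()))

    def score(self, text):
        """Return a rough spam estimate; 0.5 means 'no opinion'."""
        ratios = []
        for token in set(text.lower().split()):
            spam_count = self.spam_tokens[token]
            ham_count = self.ham_tokens[token]
            if spam_count + ham_count == 0:
                continue  # never-seen token: carries no evidence
            ratios.append(spam_count / (spam_count + ham_count))
        if not ratios:
            return 0.5  # nothing known about this message
        return sum(ratios) / len(ratios)


bayes = ToyBayes()
bayes.learn("win a million dollars now", is_spam=True)
bayes.learn("meeting notes attached for tomorrow", is_spam=False)

# Trained only on English, so every Chinese token is unknown.
print(bayes.score("你好 请 查收 附件"))
```

With only English training data, the Chinese message stays at the 
neutral value, so this toy filter has nothing to report either way.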

> True, spam is spam.  It's the vast differences in ham that I am more
> worried about.  Our customers are salesmen for the most part, so they
> are constantly sending and receiving marketing type emails.  For us,
> marketing stuff is almost always considered spam.  I think this would
> cause a problem with false positives for our customers if I train
> Bayes based on our idea of ham and spam.

The important thing is that you should *never* feed Bayes as spam 
anything that *could* be legitimate e-mail. Most people seem to make 
that mistake. I do NOT train on any message, as spam or as ham, whose 
classification is in doubt.

I feed only the obvious cases: the Nigerians who want to give you a few 
million dollars because you are so nice, the lotteries you have 
supposedly won but must pay a fee to collect, the great jobs so many 
people seem to offer where you can earn $5000 for only 3 hours of work, 
and so on.

There is no chance this could be ham for anybody (anybody with at least 
some brain, that is, and the rest you have to protect from themselves 
anyway *g*). The same goes for feeding ham: give it only mail that *is* 
legitimate e-mail, not mail that merely could be.

Remember: 10 correctly classified spam and ham messages are better than 
200 of which 5% (that is, 10 messages) are wrong.

Another good thing: since I started helping with mass-checks, I have 
found that about 4 or 5 of my 6000 spam messages were mistakes, which I 
had to delete from the corpus (after unlearning them first). That's the 
benefit you get back from running mass-checks.
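
The "unlearn first, then delete" order can be sketched like this. It 
relies on `sa-learn --forget`, which makes Bayes forget a previously 
learned message. The helper names `forget_command` and 
`unlearn_then_delete` and the file paths are hypothetical, and it 
assumes each message is stored as its own file.

```python
import os
import subprocess


def forget_command(message_path):
    """Build the sa-learn call that forgets one learned message."""
    return ["sa-learn", "--forget", message_path]


def unlearn_then_delete(message_paths, run=subprocess.run, delete=os.remove):
    """Unlearn each misclassified message, then remove it from the corpus."""
    for path in message_paths:
        run(forget_command(path), check=True)  # raises if sa-learn fails
        delete(path)  # only reached when unlearning succeeded


# Example call (hypothetical paths; needs sa-learn installed):
# unlearn_then_delete(["corpus/spam/msg-0042", "corpus/spam/msg-1337"])
```

Unlearning before deleting matters because `sa-learn --forget` needs 
the message itself to know which tokens to take back out.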

Regards, zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660/4156531                          .network.your.ideas.
// PGP Key:   "lynx -source http://zmi.at/zmi3.asc | gpg --import"
// Fingerprint: 44A3 C1EC B71E C71A B4C2  9AA6 C818 847C 55CB A4EE
// Keyserver: www.keyserver.net                 Key-ID: 0x55CBA4EE
