On 3/21/12 5:06 AM, Matus UHLAR - fantomas wrote:
there are two problems when requiring users to manually learn on everything:
- it's more work to implement
- it's more work for users to do the training.

On 21.03.12 08:38, Michael Scheidell wrote:
and, if 95% of the users are using Microsoft Exchange, Exchange will horribly mangle the headers and the body, even changing the actual encoding.
so, what would you manually learn?

Mangling of data by Exchange is a big problem when trying to filter spam in front of it. I see two ways to avoid this problem:
- use an anti-spam product made for Exchange. We use one from GFI, with quite good results.
- use a spam filter in front of Exchange, store copies of the messages on it and learn from them. However, you will probably be the only one who can train the spam filter in that case.
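
A minimal sketch of the second option, assuming the front-end filter is SpamAssassin and that it saves untouched copies into two directories. The directory paths and the helper names `build_learn_commands` and `train` are made up for illustration; only `sa-learn --spam` and `sa-learn --ham` are real commands.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical directories where the front-end filter stores unmangled copies.
SPAM_DIR = Path("/var/spool/filter-copies/spam")
HAM_DIR = Path("/var/spool/filter-copies/ham")


def build_learn_commands(spam_dir, ham_dir):
    """Return the sa-learn invocations for the spam and ham copy directories."""
    return [
        ["sa-learn", "--spam", str(spam_dir)],
        ["sa-learn", "--ham", str(ham_dir)],
    ]


def train(spam_dir=SPAM_DIR, ham_dir=HAM_DIR):
    """Run sa-learn on both directories; return False if it is not installed."""
    if shutil.which("sa-learn") is None:
        return False
    for cmd in build_learn_commands(spam_dir, ham_dir):
        # check=True raises CalledProcessError if sa-learn exits non-zero.
        subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    train()
```

Since the copies are taken before Exchange touches them, the Bayes database sees the same headers and encoding that the filter sees at scan time. Someone still has to sort the copies into the two directories, which is why the admin usually ends up being the only trainer.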

you actually _can_ train from messages that went through Exchange, but the mangling will blur the results somewhat and lower Bayes accuracy.
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
BSE = Mad Cow Desease ... BSA = Mad Software Producents Desease
