On 20.03.2015 at 11:40, Matus UHLAR - fantomas wrote:
On 20.03.15 09:30, Reindl Harald wrote:
why would you want poems or cooking recipes trained as spam?
and why not?
i think i have explained it often enough now
they may still contain stuff that helps tell spam from ham, you never know...
if your users only write in one or two languages the danger may not be that high - we have users from more than 60 countries and far fewer ham samples than spam samples in most languages
i removed *10 MEGABYTE* of poison, and much of it is text in spanish, italian and so on - that affects legit mail in a negative way
I once trained a spam report as ham, then extracted the original spam and trained that as spam. It helped me tell spam from spam reports
well, and in the end that's exactly what i do by removing poison - extract the original spam instead of training innocent content
why *did you* extract something instead of training the whole message? likely for the same reason i clean up samples
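
for anyone who wants to try the "extract the original spam" step, here is a minimal sketch in Python. it assumes the report carries the original mail as a `message/rfc822` attachment, which is only one way reports get forwarded. the names `extract_original_messages` and `build_sample_report` are made up for illustration, and the sample report is synthetic, not real data. feeding the result to a trainer is left out on purpose since that depends on your setup.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_original_messages(report_bytes):
    """Return the raw bytes of every message/rfc822 attachment in a report.

    Assumption: the reporter forwarded the spam as an attachment. Inline
    forwards have no message/rfc822 part and yield an empty list.
    """
    report = message_from_bytes(report_bytes, policy=policy.default)
    originals = []
    for part in report.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the embedded message object.
            embedded = part.get_payload(0)
            originals.append(embedded.as_bytes())
    return originals


def build_sample_report():
    """Build a synthetic spam report for demonstration (hypothetical data)."""
    original = EmailMessage()
    original["From"] = "spammer@example.com"
    original["Subject"] = "cheap pills"
    original.set_content("buy now")

    report = EmailMessage()
    report["From"] = "user@example.org"
    report["Subject"] = "spam report"
    report.set_content("The attached message was classified as spam.")
    report.add_attachment(original)  # stored as message/rfc822
    return report.as_bytes()


if __name__ == "__main__":
    for raw in extract_original_messages(build_sample_report()):
        msg = message_from_bytes(raw, policy=policy.default)
        print(msg["Subject"])
```

the extracted bytes are what would be trained as spam. the report wrapper around them is the innocent content that would be left out or trained as ham.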