On Fri, 16 May 2014 11:24:29 -0700
Ian Zimmerman <i...@buug.org> wrote:

> On close inspection, I see that the hash-busting garbage appended is
> (faux) technical computing talk instead of the usual cookbooks or
> classical literature :-p  That is, scrambled Stack Overflow
> discussions and the like.  And of course that is what most of my ham
> is about, so it makes very good sense that Bayes gets confused.

Well, that can happen sometimes... but not that often in my experience.

> 5593          0  non-token data: nspam
> 6190          0  non-token data: nham

Ah, I have a larger corpus: 4,608,013 spams and 4,146,168 hams.  I suspect
that's why Bayes poisoning is not an issue for us.
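
A rough sketch of why corpus size might matter here. It assumes a
simplified per-token estimate (relative spam frequency over total relative
frequency), which is not SpamAssassin's actual Bayes code. The token counts
are hypothetical; only the nspam/nham totals come from this thread.

```python
def spam_probability(spam_hits, ham_hits, nspam, nham):
    """Simplified per-token spam estimate (an assumption, not SA's formula)."""
    spam_rate = spam_hits / nspam   # fraction of trained spam containing the token
    ham_rate = ham_hits / nham      # fraction of trained ham containing the token
    if spam_rate + ham_rate == 0:
        return 0.5                  # token never seen: neutral
    return spam_rate / (spam_rate + ham_rate)

# Hypothetical scenario: a technical token appears in about 5% of ham,
# and 200 poisoned spams containing it are then learned as spam.
POISONED_SPAMS = 200

# Small corpus (nspam/nham from the quoted sa-learn dump).
small = spam_probability(POISONED_SPAMS, 310, nspam=5593, nham=6190)

# Large corpus (totals quoted in this reply).
large = spam_probability(POISONED_SPAMS, 207308, nspam=4608013, nham=4146168)

print(f"small corpus: {small:.4f}")
print(f"large corpus: {large:.4f}")
```

Under these assumptions the same 200 poisoned messages pull the token most
of the way to neutral in the small corpus, but barely move it in the large
one.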

Also, spammers will adjust their attack and put the poisonous stuff first,
obfuscated by a style="display: none" HTML attribute or similar.

Best to let Bayes work it out by itself, I think.

Regards,

David.
