On Nov 8, 2018, at 12:20 PM, RW <rwmailli...@googlemail.com> wrote:
>
> these emails don't contain a valid HTML mime section. They contain a bogus
> html section that doesn't start with the separator defined in the
> top-level Content-Type header.

Sorry, that is totally my fault. In the spample, I was trying to sanitize any possible identifying information and I ended up over-sanitizing. I sanitized the separator string for text/plain and at the end, but I missed the one for text/html.

So, bottom line -- the HTML mime section is actually valid in the original email. The spample is invalid because of my overzealousness/paranoia/idiocy.

If the HTML section is valid, as it appears to be, then the HTML should be decoded. And yet, these emails are hitting BAYES_00 or BAYES_05 despite the spammy HTML text. So, does this mean my Bayes DB is borked? Or does it mean something else?

In looking through my recent spams, almost all of them are hitting BAYES_50 or lower; almost none are hitting BAYES_99 (this includes the ones identified as spam for other scoring reasons). This is despite the training. So I'm thinking maybe my Bayes DB is not working properly, unless somehow the Bayes poison is actually working. I doubt the latter, though, since discussions on here have asserted many times that "poison" doesn't work. But I don't know why the DB would stop scoring properly all of a sudden, after working fine for years.

Thanks.

--- Amir
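
For anyone who wants to catch this kind of boundary mismatch in a sanitized sample before posting it, here is a minimal sketch using Python's standard `email` module. The function name `check_boundaries` and the sample message are made up for illustration, and this is not how SpamAssassin itself parses MIME. It only compares the boundary declared in the top-level Content-Type header against the delimiter lines that actually appear in the body.

```python
from email import message_from_string


def check_boundaries(raw: str) -> dict:
    """Compare the declared top-level MIME boundary with the delimiter
    lines actually present in the body (hypothetical helper)."""
    msg = message_from_string(raw)
    boundary = msg.get_boundary()  # None when no boundary parameter exists
    result = {"boundary": boundary, "openers": 0, "closers": 0, "strays": []}
    if boundary is None:
        return result

    opener = "--" + boundary   # starts a part
    closer = opener + "--"     # ends the multipart body

    # Everything after the first blank line is treated as the body.
    _, _, body = raw.replace("\r\n", "\n").partition("\n\n")
    for line in body.split("\n"):
        stripped = line.rstrip()
        if stripped == opener:
            result["openers"] += 1
        elif stripped == closer:
            result["closers"] += 1
        elif stripped.startswith("--") and stripped != "--":
            # Looks like a delimiter but does not match the declared one.
            result["strays"].append(stripped)
    return result


# Made-up sample: the text/html part starts with a separator that was
# not rewritten to match the sanitized Content-Type header.
SAMPLE = (
    'Content-Type: multipart/alternative; boundary="XXXX"\n'
    "\n"
    "--XXXX\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--real-boundary-123\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hello</p>\n"
    "--XXXX--\n"
)

if __name__ == "__main__":
    print(check_boundaries(SAMPLE))
```

A non-empty `strays` list suggests a separator that was missed while sanitizing. It can also contain harmless lines that merely begin with `--`, so treat it as a hint rather than proof.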