Hi -

On Sat, Nov 24, 2012 at 12:58:33PM -0500, Daniel Berlin wrote:
> [...]
> I'd love to see data on this.  As others have pointed out, almost
> every other open source project accepts html email. [...]
> Do you have reason to believe our existing spam detection solution
> will start to fail massively when presented with html email? [...]

Yes.  I run a similar spamassassin setup at home as sourceware's, and
it routinely lets through spam that is disguised in HTML.  That is
after all trivial to do - font size=1 color=white or somesuch gunk.
Annoyingly, the spam's hidden bayes-countering filler goo shows up in
its full html-to-text glory in a text-based MUA.

> After all, if most of the HTML email is spam, something being HTML
> email is a great signal for it.

Dunno about "most", but "an uncomfortable amount" is right.

> [...]
> Note that *we* are currently rejecting multipart/alternative if it
> contains text/html, even if it contains text/plain.
> This is fairly obnoxious.

See above.  Spam filtering on HTML bodies is not very effective,
unless one's a gmail.  There is no mechanical way to ensure that the
multipart alternative text/plain is equivalent -- and if it were, then
it could just have been sent as is in the first place (were it not for
MUA intransigence).

- FChE
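
For illustration only: the rejection rule quoted above (refuse any
message carrying a text/html part, even inside multipart/alternative
next to text/plain) could be sketched roughly as follows.  This assumes
Python's standard email package; has_html_part and build_sample are
hypothetical names, and this is not sourceware's actual filter.

```python
from email.message import EmailMessage, Message


def has_html_part(msg: Message) -> bool:
    """Return True if the message or any nested MIME part is text/html."""
    # walk() yields the message itself and then every nested part, so a
    # text/html alternative inside multipart/alternative is also found.
    return any(part.get_content_type() == "text/html" for part in msg.walk())


def build_sample(with_html: bool) -> EmailMessage:
    """Build a small test message (hypothetical helper, illustration only)."""
    msg = EmailMessage()
    msg["Subject"] = "test"
    msg.set_content("plain body\n")
    if with_html:
        # Turns the message into multipart/alternative: text/plain + text/html.
        msg.add_alternative("<p>html body</p>", subtype="html")
    return msg


if __name__ == "__main__":
    for with_html in (False, True):
        sample = build_sample(with_html)
        verdict = "reject" if has_html_part(sample) else "accept"
        print(sample.get_content_type(), "->", verdict)
```

Note that the check only looks at the declared content types.  It makes
no attempt to decide whether the text/plain alternative matches the
text/html one, which is the part that cannot be done mechanically.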