Justin,

I expected the HTML parser might deal with the comment block attack. Will it also deal with the "white-on-white text" variant? (You didn't include both scenarios from my original email, so I'm adding the other one back in below.)

I wrote previously:
If you strip HTML prior to bayes, this could also be done by opening a white-on-white, tiny-font tag ahead of the bogus ham, padding it with lots of newlines, and then switching back to a readable color to begin the marketing. Once the MUA renders it as HTML, the final message will appear to have only a couple of blank lines at the top (because the font is small and HTML ignores the newlines), but it will miss bayes entirely.
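
To make the white-on-white case concrete, here is a rough Python sketch of a filter that drops text a reader would not see before the tokens reach bayes. It is only an illustration, not how SpamAssassin's parser works: the names VisibleTextExtractor and visible_text are made up for this example, and it assumes a white background with hiding done solely through <font color=...> (no CSS, no bgcolor tricks).

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text, skipping anything inside a <font> tag colored white.

    Assumption: the background is white and text is hidden only via
    <font color=...>. Real mail would also need CSS and bgcolor handling.
    """

    HIDDEN_COLORS = {"white", "#fff", "#ffffff"}

    def __init__(self):
        super().__init__()
        self.color_stack = []  # one entry per currently open <font> tag
        self.chunks = []       # text that is considered visible

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            color = dict(attrs).get("color")
            if color is None:
                # No color on this tag: inherit from the enclosing <font>.
                color = self.color_stack[-1] if self.color_stack else ""
            self.color_stack.append(color.strip().lower())

    def handle_endtag(self, tag):
        if tag == "font" and self.color_stack:
            self.color_stack.pop()

    def handle_data(self, data):
        hidden = bool(self.color_stack) and self.color_stack[-1] in self.HIDDEN_COLORS
        if not hidden:
            self.chunks.append(data)


def visible_text(html):
    """Return the visible text with runs of whitespace collapsed."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


if __name__ == "__main__":
    sample = (
        '<font color="#FFFFFF" size="1">meeting notes attached\n\n\n</font>'
        '<font color="black">BUY CHEAP PILLS</font>'
    )
    print(visible_text(sample))
```

With a filter along these lines, the hidden ham never becomes a token, so the top of the message as seen by bayes matches what the reader sees.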

At 04:02 PM 11/21/2002 +0000, Justin wrote:

Matt Kettler said:

> As a counter argument of this, what about HTML messages being abused to
> bypass bayes when only looking at the top N lines? (Note: I think this is
> on the right track in principle, but I can see some resulting holes.)
>
> The spammer could now bypass bayes by inserting an HTML comment at the
> beginning consisting of 200 bytes or 20 lines of ham, end the comment, and
> begin his spam message.

BTW we do have some very smart HTML parsing (thx Dan ;) which our Bayes
impl uses, so this will not be a prob for us.
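
For readers following along, a minimal Python sketch of why comment-aware parsing closes this hole. This is not SpamAssassin's actual code; CommentFreeText and top_lines_for_bayes are hypothetical names, and the 20-line limit is just the figure from the quoted example. Python's HTMLParser reports comments through a separate handle_comment callback, so text collected only in handle_data never includes the ham hidden in a comment.

```python
from html.parser import HTMLParser


class CommentFreeText(HTMLParser):
    """Collect character data only.

    handle_comment is deliberately not overridden: the default does
    nothing, so the contents of <!-- ... --> are dropped.
    """

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def top_lines_for_bayes(html, max_lines=20):
    """Return the first max_lines non-empty text lines, comments excluded."""
    parser = CommentFreeText()
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.chunks).splitlines()]
    return [line for line in lines if line][:max_lines]


if __name__ == "__main__":
    sample = (
        "<!-- hi bob\nsee you at lunch\n-->\n"
        "<p>CHEAP PILLS</p>\n<p>click here</p>"
    )
    print(top_lines_for_bayes(sample, max_lines=1))
```

Because the comment is discarded before the line limit is applied, the "top N lines" are the first lines of the real message rather than the planted ham.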

--j.


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk