I expected the HTML parser might deal with the comment block attack. Will it also deal with the "white-on-white text" variant? (You didn't include both scenarios from my original email, so I'm adding the other one back in below.)
I wrote previously:
If you strip HTML prior to Bayes, this could also be done by opening a white-on-white, tiny-font tag before the bogus ham, padding it with lots of newlines, and then switching back to a readable color to begin the marketing. Once the MUA renders the message as HTML, it will appear to have only a couple of blank lines at the top (because the font is small, and HTML ignores the newlines), but the spam will slip past Bayes entirely.
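To make the scenario concrete, here is a minimal sketch in Python of the kind of filtering I have in mind. It is not SpamAssassin's actual HTML parser. The names VisibleTextExtractor and visible_text are hypothetical, and it assumes a white page background and that the spammer uses plain <font color=... size=...> tags (no CSS). It drops HTML comments and any text whose font color matches the background or whose font size is below a threshold, so only the text a reader would plausibly see is left for Bayes.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Hypothetical helper: keep only text a reader would plausibly see."""

    def __init__(self, background="#ffffff", min_size=2):
        super().__init__()
        self.background = background.lower()  # assumed page background
        self.min_size = min_size               # HTML font sizes run 1..7
        # Stack of effective (color, size); the bottom entry is an assumed default.
        self.font_stack = [("#000000", 3)]
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "font":
            return
        attrs = dict(attrs)
        color, size = self.font_stack[-1]      # inherit from the enclosing font
        if attrs.get("color"):
            color = attrs["color"].lower()
        raw_size = attrs.get("size") or ""
        if raw_size.isdecimal():               # relative sizes like "+1" are ignored
            size = int(raw_size)
        self.font_stack.append((color, size))

    def handle_endtag(self, tag):
        if tag == "font" and len(self.font_stack) > 1:
            self.font_stack.pop()

    def handle_data(self, data):
        color, size = self.font_stack[-1]
        if color != self.background and size >= self.min_size:
            self.chunks.append(data)

    # HTMLParser.handle_comment is a no-op by default, so comment
    # blocks never reach self.chunks.


def visible_text(html, background="#ffffff"):
    parser = VisibleTextExtractor(background)
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace, as an HTML renderer would.
    return " ".join(" ".join(parser.chunks).split())


message = (
    "<!-- meeting notes attached, see you Monday -->"
    '<font color="#FFFFFF" size="1">lunch agenda budget\n\n\n</font>'
    '<font color="#000000" size="3">BUY CHEAP PILLS</font>'
)
print(visible_text(message))
```

A real filter would need much more than this (CSS, named colors, near-white shades, background attributes on other tags), so treat it only as an illustration of the idea.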
At 04:02 PM 11/21/2002 +0000, Justin wrote:
Matt Kettler said:
> As a counter argument of this, what about HTML messages being abused to
> bypass bayes when only looking at the top N lines? (note: think this is on
> the right track in principle, but I can see some resulting holes)
>
> The spammer could now bypass bayes by inserting a HTML comment at the
> beginning consisting of 200 bytes or 20 lines of ham, end the comment, and
> begin his spam message.
BTW we do have some very smart HTML parsing (thx Dan ;) which our Bayes
impl uses, so this will not be a prob for us.
--j.
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk