On Wed, Aug 27, 2003 at 04:59:42PM +0200, Morten Kjeldgaard wrote:
> What can be done to plug this hole?

I think that the goal here is to not add invisible words to the
Bayesian database. Step one is to detect which words are invisible.

You could simply look for white words, but the next thing you know,
spam will show up with black text on a black background. Even comparing
the background color with the foreground color might not be sufficient,
because a spammer could use, say, #ffffff as background and #fffeff as
foreground and still get an effectively invisible paragraph.

Other tricks might be to put the words way off the screen, or to use an
extremely small font. Or the words could be part of a construct that is
never displayed at all, and that is not limited to HTML comments.

To detect all of that you'd need a full-fledged HTML decoder... and
that is going to eat a lot of CPU :(

-- 
Carlo Wood <[EMAIL PROTECTED]>
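
P.S. For the near-identical color case (#ffffff against #fffeff), an
exact comparison fails but a distance check does not. Below is a rough
Python sketch of that idea only. It is not taken from SpamAssassin's
code; the function names and the threshold of 16 are made up for
illustration, and it assumes the colors have already been extracted as
six-digit hex strings. It does nothing about off-screen text, tiny
fonts, or constructs that are never rendered.

```python
def parse_hex_color(value):
    """Turn a '#rrggbb' string into an (r, g, b) tuple of ints."""
    digits = value.strip().lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a 6-digit hex color, got %r" % value)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def is_effectively_invisible(foreground, background, threshold=16):
    """Return True if no color channel differs by more than `threshold`.

    `threshold` is a hypothetical cutoff, not a tuned value.
    """
    fg = parse_hex_color(foreground)
    bg = parse_hex_color(background)
    return max(abs(f - b) for f, b in zip(fg, bg)) <= threshold
```

With this, is_effectively_invisible("#fffeff", "#ffffff") is true even
though the two strings differ, while black on white is not flagged. A
real filter would probably want a perceptual contrast measure rather
than a per-channel difference.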