On Wed, Aug 27, 2003 at 04:59:42PM +0200, Morten Kjeldgaard wrote:
> What can be done to plug this hole? 

I think that the goal here is to not add invisible
words to the Bayesian database.

Step one is to detect which words are invisible.
You could simply look for white words, but the next
thing you know there is a spam with a black background.
Even comparing the background with the foreground
color for equality might not be sufficient, because a
spammer could use, say, #ffffff as background and
#fffeff as foreground - still getting an effectively
invisible paragraph.
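
To make the near-match problem concrete, here is a rough
sketch in Python of comparing the two colors by distance
instead of by equality. The function names and the
threshold of 32 are my own illustrative assumptions, not
anything SpamAssassin does, and the threshold is not tuned.

```python
def parse_hex_color(value):
    """Turn '#rrggbb' into an (r, g, b) tuple of ints."""
    value = value.strip().lstrip("#")
    if len(value) != 6:
        raise ValueError("expected #rrggbb, got %r" % value)
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def color_distance(fg, bg):
    """Largest per-channel difference between two colors (0..255)."""
    pairs = zip(parse_hex_color(fg), parse_hex_color(bg))
    return max(abs(a - b) for a, b in pairs)


def is_effectively_invisible(fg, bg, threshold=32):
    # threshold is an arbitrary illustrative value (assumption)
    return color_distance(fg, bg) < threshold
```

With this, #fffeff on #ffffff has a distance of 1 and is
flagged, while black on white is not. It only handles
colors written as #rrggbb; named colors and CSS would
need more work.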

Other tricks might be to put the words far off the
screen, or use an extremely small font.  Or the
words could be part of a construct that is never
displayed at all, not limited to HTML comments.
In order to detect that you'd need a full-fledged
HTML decoder... this is going to eat a lot of CPU :(
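
As a rough idea of what even a partial decoder involves,
here is a sketch using Python's standard html.parser.
It keeps only words that are not inside comments, a few
never-rendered elements, or elements with an obviously
hiding inline style. The class name, the helper function
and the list of checks are assumptions for illustration;
it does not catch off-screen positioning, stylesheets,
or matching colors.

```python
import re
from html.parser import HTMLParser

# Elements without a closing tag; they never contain text.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link", "wbr"}
# Elements whose content is not rendered in the message body.
NEVER_SHOWN = {"script", "style", "head", "title"}
# A few inline styles that hide text (far from complete).
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden"
    r"|font-size\s*:\s*[01]\s*(px|pt)",
    re.IGNORECASE,
)


class VisibleTextParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []  # (tag, hidden) for each open element
        self.words = []  # words judged visible so far

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        hidden = tag in NEVER_SHOWN or bool(HIDDEN_STYLE.search(style))
        # An element is also hidden when its parent is hidden.
        parent_hidden = self.stack[-1][1] if self.stack else False
        self.stack.append((tag, hidden or parent_hidden))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if not (self.stack and self.stack[-1][1]):
            self.words.extend(data.split())

    # Comments are dropped: the default handle_comment does nothing.


def visible_words(html):
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.words
```

Only the words returned by visible_words() would then be
considered for the database. Even this toy version has to
track every open element, which hints at the cost.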

-- 
Carlo Wood <[EMAIL PROTECTED]>


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
