Tomoyuki Sakurai wrote:
>> Hopefully SA2.60 would solve it.

Gordon Cormack <[EMAIL PROTECTED]> writes:

> The version of 2.60 that I have sort of works in detecting obfuscated html.
>
> It *does* detect words split apart by html comments.
>
> It *does not* detect words split apart by bogus tags.

This is non-trivial to do accurately. By accurately, I mean in a way
that does not hit legitimate (if non-standard) HTML.

> It *does not* reconstruct obfuscated html for the benefit of the
> feature rules or the bayesian classifier. (I've been tempted to
> pipe the html through lynx ...)

I'm not sure what you mean here.

> It *does not* remove text with fontcolor == backgroundcolor for
> the benefit of the bayesian classifier.

Yeah, this still needs to be tested a bit. I'm not sure whether it
would make a significant difference.

2.60-cvs does have fairly decent detection of invisible and
low-contrast fonts, and they contribute to the message score.
Unfortunately, there's a fair amount of poorly written legitimate HTML
that does this too, so the score will probably only be about 1 to 2.

-- 
Daniel Quinlan                     anti-spam (SpamAssassin), Linux, and open
http://www.pathname.com/~quinlan/  source consulting (looking for new work)

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
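
For illustration only, the "words split apart by html comments" case
discussed above can be sketched in a few lines of Python. This is a
rough sketch and not SpamAssassin's actual rule (SpamAssassin itself is
written in Perl); the function names and the regular expressions are
assumptions made up for this example.

```python
import re

# Hypothetical pattern: one HTML comment, possibly spanning several lines.
# The tempered dot keeps the match from running past the first "-->".
COMMENT = r"<!--(?:(?!-->).)*-->"
COMMENT_RE = re.compile(COMMENT, re.DOTALL)

# A comment with a word character directly on both sides,
# e.g. "Via<!-- x -->gra". Lookarounds let adjacent hits share letters.
SPLIT_WORD_RE = re.compile(r"(?<=\w)" + COMMENT + r"(?=\w)", re.DOTALL)


def count_comment_split_words(html: str) -> int:
    """Count comments that sit in the middle of a word."""
    return len(SPLIT_WORD_RE.findall(html))


def strip_comments(html: str) -> str:
    """Remove HTML comments so split words are rejoined."""
    return COMMENT_RE.sub("", html)


if __name__ == "__main__":
    sample = "Buy Via<!-- x -->gra now <!-- a normal comment --> please"
    print(count_comment_split_words(sample))
    print(strip_comments(sample))
```

A comment surrounded by whitespace is left uncounted, which is the kind
of distinction needed to avoid hitting legitimate HTML. The bogus-tag
case is deliberately not attempted here, since (as noted above) doing
that accurately is much harder.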