I've noticed a lot of spam lately that embeds invalid HTML tags between the letters of the body text, for example:
A<!alkdf>s s<!alkdf>ee<!alkdf>n on O<!alkdf>p<!alkdf>rah.
I searched around on the list but couldn't find any references to this approach. Anyone else seeing these spams? Is there any rule that covers invalid tags?
There's a bug open on it and it's been discussed many times on the list. Some of the code in 2.60-cvs takes care of some of this, but the bug isn't completely dead AFAIK.
See the 5/23 thread "[SAtalk] How are unknown tags handled ?" in which Stuart Gall suggested this rule:
# Look for invalid HTML comment tags
rawbody __HTML_FALSE_COMMENT_TAG /<![^-]{2}/
meta HTML_FALSE_COMMENT __HTML_FALSE_COMMENT_TAG && MIME_HTML_ONLY
describe HTML_FALSE_COMMENT Message contains invalid HTML tags
score HTML_FALSE_COMMENT 1.5
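To see what that pattern does and does not catch, here is a rough sketch that runs the same regex outside SpamAssassin, in Python. The helper name has_false_comment and the sample strings (other than the spam line quoted above) are made up for illustration; this only exercises the regex, not the MIME_HTML_ONLY half of the meta rule.

```python
import re

# Same pattern as the rawbody rule: "<!" followed by two characters that
# are not "-", i.e. something that opens like a comment but is not "<!--".
FALSE_COMMENT = re.compile(r"<![^-]{2}")


def has_false_comment(raw_html: str) -> bool:
    """Return True if the text contains a '<!' that does not start '<!--'."""
    return FALSE_COMMENT.search(raw_html) is not None


# Hypothetical test inputs; only the first one comes from the thread.
samples = {
    "spam": "A<!alkdf>s s<!alkdf>ee<!alkdf>n on O<!alkdf>p<!alkdf>rah.",
    "real comment": "<p>hello</p><!-- a normal comment -->",
    "doctype": "<!DOCTYPE html><p>hello</p>",
}

for name, text in samples.items():
    print(name, has_false_comment(text))
```

The spam sample matches and the ordinary comment does not. A "<!DOCTYPE ...>" declaration also matches the regex on its own, so the low score and the MIME_HTML_ONLY condition are doing real work in keeping false positives down.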
Here's the bug entry:
http://bugzilla.spamassassin.org/show_bug.cgi?id=1960
The thread "[SAtalk] On filtering out invalid tags, decode %-characters, canoncial input" circa 5/25/03 is also related.