Here's a bunch of rules I've put together, partly pulled from this list. I thought I'd share them and try to get some feedback, especially on the first one.
Mike

#Text outside of the last </HTML> tag
#Example: </HTML> afejg32
#Unfortunately a new line breaks this. If there is a way
# to include newlines in this rule, it would catch a lot more.
rawbody MK_BAD_HTML_1 /\<\/html\>\s{0,50}\S+\d{0,10}/i
describe MK_BAD_HTML_1 Bad HTML form. Content after closing HTML tag
score MK_BAD_HTML_1 1.8

#Using paragraphs and spaces to break lines.
#Example: <p align="center">&nbsp;</p> OR <p>&nbsp;
rawbody MK_BAD_HTML_2 /\<p\s{0,50}?\S{0,50}\>\s{0,50}?\&nbsp\;/i
describe MK_BAD_HTML_2 Bad HTML form. Breaking lines with paragraphs.
score MK_BAD_HTML_2 0.1

#Very uncommon that anyone would do this
rawbody MK_BAD_HTML_3 /\t\<\/html\>/i
describe MK_BAD_HTML_3 Bad HTML form. Tabbed your closing html tag.
score MK_BAD_HTML_3 0.6

#Check for a beginning HTML tag <HTML>
rawbody __MK_HTML_TAG_START /\<html/i
#Check for a closing HTML tag </html>
rawbody __MK_HTML_TAG_END /\<\/html\>/i

#Check to see if the HTML message is made correctly. Seeing a lot of SPAM that isn't
meta MK_BAD_HTML_4 HTML_MESSAGE && !__MK_HTML_TAG_START && !__MK_HTML_TAG_END
describe MK_BAD_HTML_4 Bad HTML form. Doesn't have beginning or closing HTML tags.
score MK_BAD_HTML_4 0.4

#Same as MK_BAD_HTML_4, except we just check for a beginning tag without an end tag
meta MK_BAD_HTML_5 HTML_MESSAGE && __MK_HTML_TAG_START && !__MK_HTML_TAG_END
describe MK_BAD_HTML_5 Bad HTML form. Has a beginning HTML tag and no end tag.
score MK_BAD_HTML_5 0.3

#Same as MK_BAD_HTML_4, except we just check for an end tag without a beginning tag
meta MK_BAD_HTML_6 HTML_MESSAGE && !__MK_HTML_TAG_START && __MK_HTML_TAG_END
describe MK_BAD_HTML_6 Bad HTML form. Has an ending HTML tag and no beginning tag.
score MK_BAD_HTML_6 0.3

#This takes care of <!asde>, but excludes
# <!DOCTYPE HTML ...
rawbody __MK_BAD_HTML_7 /\<![a-zA-CE-Z]/

#This takes care of tags that don't exist such as <zebra>
#The last / is in there so it doesn't freak out about closing tags.
#<KBD> is a valid tag, but I don't believe we'll see it in email so k is not in the list.
#Added in to not pick up <[EMAIL PROTECTED]>
rawbody __MK_BAD_HTML_8 /\<[^abcdefhilmopstuv\/[EMAIL PROTECTED],80}\>/i

#This takes care of closing tags that don't exist such as </zebra>
rawbody __MK_BAD_HTML_9 /\<\/[^abcdefhilmopstuv]/i

#8/4/2003 Added in due to MS Office <?xml:blahblah> tag
rawbody __MK_GOOD_HTML_1 /\<\??xml/i

#The next three are a combo of the above three.
meta MK_BAD_HTML_10 HTML_MESSAGE && __MK_BAD_HTML_7
describe MK_BAD_HTML_10 Bad HTML form. HTML Tag <!blah> that does not exist used.
score MK_BAD_HTML_10 1.8

meta MK_BAD_HTML_11 HTML_MESSAGE && __MK_BAD_HTML_8 && !__MK_GOOD_HTML_1
describe MK_BAD_HTML_11 Bad HTML form. HTML beginning tag that does not exist used.
score MK_BAD_HTML_11 0.3

meta MK_BAD_HTML_12 HTML_MESSAGE && __MK_BAD_HTML_9
describe MK_BAD_HTML_12 Bad HTML form. HTML closing tag that does not exist used.
score MK_BAD_HTML_12 0.7

#Yahoo mail doesn't use beginning or closing html tags
header __MK_FROM_YAHOO_1 Received =~ /mail\.yahoo\.com/i
header __MK_FROM_YAHOO_2 From =~ /[EMAIL PROTECTED]/
meta MK_VALID_LOOKING_YAHOO_MAIL_1 MK_BAD_HTML_4 && __MK_FROM_YAHOO_1 && __MK_FROM_YAHOO_2
describe MK_VALID_LOOKING_YAHOO_MAIL_1 Offsetting Yahoo! mail penalties.
score MK_VALID_LOOKING_YAHOO_MAIL_1 -0.5
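
For anyone who wants to poke at these patterns outside SpamAssassin, here is a minimal Python sketch, not part of the rule set itself. It copies three of the regexes (MK_BAD_HTML_1, __MK_BAD_HTML_7 and __MK_BAD_HTML_9) into Python's `re` module, which accepts the same syntax for these particular patterns. The helper `hits_line_by_line` is a hypothetical name, and it rests on an assumption: that the rule only ever sees one line of the body at a time, which is how the newline problem mentioned for the first rule would arise.

```python
import re

# Patterns transcribed from the rules above. The backslashes before < > /
# are dropped because they are not needed; the meaning is unchanged.
MK_BAD_HTML_1 = re.compile(r"</html>\s{0,50}\S+\d{0,10}", re.IGNORECASE)
MK_BAD_HTML_7 = re.compile(r"<![a-zA-CE-Z]")  # case-sensitive, as in the rule
MK_BAD_HTML_9 = re.compile(r"</[^abcdefhilmopstuv]", re.IGNORECASE)


def hits_line_by_line(pattern, body):
    """Hypothetical helper: apply the pattern to each line separately.

    Assumption: this mimics a rule that is handed the body one line
    at a time, so a match can never span a line break.
    """
    return any(pattern.search(line) for line in body.splitlines())


if __name__ == "__main__":
    same_line = "<html><body>hi</body></html> afejg32"
    next_line = "<html><body>hi</body></html>\nafejg32"

    # Junk on the same line as </html> versus junk on the following line.
    print("same line, per-line match:", hits_line_by_line(MK_BAD_HTML_1, same_line))
    print("next line, per-line match:", hits_line_by_line(MK_BAD_HTML_1, next_line))
    # Matching the whole text at once lets \s consume the newline.
    print("next line, whole-text match:", bool(MK_BAD_HTML_1.search(next_line)))

    # Made-up <!...> and closing tags versus legitimate ones.
    for sample in ("<!asde>", "<!DOCTYPE HTML PUBLIC>", "<!-- comment -->"):
        print(repr(sample), "->", bool(MK_BAD_HTML_7.search(sample)))
    for sample in ("</zebra>", "</body></html>"):
        print(repr(sample), "->", bool(MK_BAD_HTML_9.search(sample)))
```

The per-line helper shows the limitation directly: the same junk text is caught when it shares a line with `</html>` and missed when it sits on the next line, while a match against the whole text finds it in both cases. One thing the sketch also makes easy to check is that __MK_BAD_HTML_7 only excludes an uppercase `D`, so a lowercase `<!doctype` would still hit.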