Hi,

On Thu, Jun 13, 2013 at 9:55 PM, John Hardin <jhar...@impsec.org> wrote:
> On Thu, 13 Jun 2013, Amir 'CG' Caspi wrote:
>
>> Lately, I've been getting hit with a LOT of this type of spam:
>>
>> http://pastebin.com/HD0rNdxU
>>
>> Not all of it is identical in format, but there seems to be one thing in
>> common: they include lots of random garbage inside either CSS or in HTML
>> comments. All of this gets ignored by the HTML parser and doesn't display,
>> but is nevertheless in the raw source. The example above includes both
>> types: non-parsing garbage in the CSS header, and an HTML comment at the
>> end.
>>
>> I wonder, can a rule be created that basically looks for incredibly long
>> HTML comments (like, multi-KB length comments), and/or looks in the CSS for
>> long sequences of garbage?
>
> http://ruleqa.spamassassin.org/20130613-r1492572-n/STYLE_GIBBERISH/detail

John,

I've just tried with your latest, and his sample doesn't hit
STYLE_GIBBERISH. Any suggestions?

Also, can you explain which percentages on the ruleqa page are the most
useful? Is it the aggregate value, which shows this rule hitting about
0.0022 percent of ham and 0.1895 percent of spam?

Thanks,
Alex