Philip Prindeville wrote:
> Theo Van Dinter wrote:
>
>> The malformed content-type header caused this. The MIME part isn't
>> "text/html", it's "text/htmlcontent-transfer-encoding8bitrn". So the best
>> that can happen is URLs are parsed out of the text.
>>
>
> Ok, we'll here's a new rule:
>
> # incompetent spamware programmers...
> header L_INCOMPETENT ALL =~ /\\r\\n\n?/
> describe L_INCOMPETENT Insufficiently convincing headers (malformed)
> score L_INCOMPETENT 6.0
>

Remove the trailing \n?; it's redundant. The ? allows it to be absent, so it would only matter if something else were to follow.

As far as whether a match occurs, it is ALWAYS redundant to end a normal regex with an item qualified by ? or *. Regexes are substring matches, so they implicitly allow anything to follow the matched text. /foo/ always matches the same strings as /foo.*/, /foo.?/ and /foo1?/. All of the above will match "foo", "fool" and "somefoolishperson".

That said, your rule above will match something with "\r\n" anywhere in it, even the middle of the text. Perhaps you want something more like:

header L_INCOMPETENT ALL =~ /\\r\\n\s?$/

The $ forces an end-of-line match, and the \s? allows a single whitespace character before the actual EOL. (ALL is tested as one multi-line string, so you may also need the /m modifier for $ to match at the end of each header line rather than only at the very end of the headers.)
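
If you want to convince yourself of both points, here is a rough sketch using Python's re module as a stand-in for the Perl regexes SpamAssassin uses. The sample strings and variable names are made up for illustration, and Python's regex engine is not identical to Perl's, though these simple patterns should behave the same way in both.

```python
import re

# Point 1: a trailing optional item does not change WHICH strings match.
samples = ["foo", "fool", "somefoolishperson", "bar"]
patterns = [r"foo", r"foo.*", r"foo.?", r"foo1?"]

def matching_set(pattern, strings):
    """Return the subset of strings in which the pattern matches somewhere."""
    return {s for s in strings if re.search(pattern, s)}

results = {p: matching_set(p, samples) for p in patterns}
# True when every pattern selected exactly the same strings.
all_same = len({frozenset(v) for v in results.values()}) == 1

# Point 2: without an anchor the rule fires anywhere in the text.
# Raw strings keep the literal backslash-r backslash-n characters.
unanchored = re.compile(r"\\r\\n")
anchored = re.compile(r"\\r\\n\s?$")

middle = r"text/html\r\n; charset=x"                       # hypothetical header value
at_end = r"text/htmlcontent-transfer-encoding8bit\r\n"     # hypothetical header value

print(all_same)
print(bool(unanchored.search(middle)), bool(anchored.search(middle)))
print(bool(anchored.search(at_end)), bool(anchored.search(at_end + " ")))
```

The unanchored pattern hits the literal "\r\n" in the middle of the first value, while the anchored one only accepts it at the end, optionally followed by one whitespace character.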