Matt Kettler wrote:

>Philip Prindeville wrote:
>
>>Theo Van Dinter wrote:
>>
>>>The malformed content-type header caused this.  The MIME part isn't
>>>"text/html", it's "text/htmlcontent-transfer-encoding8bitrn".  So the best
>>>that can happen is URLs are parsed out of the text.
>>
>>Ok, well, here's a new rule:
>>
>># incompetent spamware programmers...
>>header L_INCOMPETENT ALL =~ /\\r\\n\n?/
>>describe L_INCOMPETENT Insufficiently convincing headers (malformed)
>>score L_INCOMPETENT 6.0
>
>Remove the trailing \n?, it's redundant. The ? allows it to not be present,
>thus it would only matter if something else was to follow.
>
>It is ALWAYS redundant to end a normal regex in an item qualified with ? or *.
>
>Regexes are substring matches, so they implicitly allow anything to follow
>the matched text.
>
>/foo/ always has the same matches as /foo.*/ and /foo.?/ and /foo1?/.
>
>All of the above will match "foo", "fool" and "somefoolishperson".
>
>That said, your rule above will match something with "\r\n" anywhere in it,
>even the middle of the text.
>
>Perhaps you want something more like:
>
>header L_INCOMPETENT ALL =~ /\\r\\n\s?$/
>
>The $ forces end-of-line match, and the \s? allows any single whitespace to be
>inserted before the actual EOL.

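The redundancy point is easy to check outside SpamAssassin. The following is only a rough sketch: it uses Python's re module as a stand-in for the Perl regexes the rules use (the constructs involved behave the same way in both), and the sample strings and the header text are made up for illustration.

```python
import re

# Sample strings from the quoted examples, plus one that should never match.
samples = ["foo", "fool", "somefoolishperson", "bar"]

# Patterns that differ only in an optional tail.
patterns = [r"foo", r"foo.*", r"foo.?", r"foo1?"]

def matching(pattern, strings):
    """Return the strings in which the pattern matches anywhere (substring search)."""
    return [s for s in strings if re.search(pattern, s)]

results = {p: matching(p, samples) for p in patterns}
for p, hits in results.items():
    print(f"/{p}/ -> {hits}")

# Hypothetical header text: a literal backslash-r-backslash-n (four characters)
# followed by a real newline, then another header line.
header = "Content-Type: text/html\\r\\n\nSubject: hi"

with_tail = bool(re.search(r"\\r\\n\n?", header))
without_tail = bool(re.search(r"\\r\\n", header))
print(with_tail, without_tail)
```

Every pattern accepts the same three strings, and the rule's regex matches the header with or without the trailing \n?. The optional tail only changes how much text the match covers, not whether there is a match.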
I know that the \n? is optional, but I didn't know whether the ALL pattern
contains \n only between the header lines or after every one of them,
including the last. The \\r\\n could occur on the very last line, so it would
make a difference whether \n is used as a separator or as a terminator of
lines.

I also thought that it should only match the case of \\r\\n\n, but when I
tried that pattern against the test message, it failed. Not sure why. Still
digging.

Should it be legal to have \\r\\n in the Subject: line, for instance? I don't
see why not...

-Philip
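
To make the separator-versus-terminator question concrete, here is a small sketch. It again uses Python's re module in place of Perl, and the two header lines are hypothetical. It makes no assumption about how SpamAssassin actually assembles the ALL text, and it does not explain why the \\r\\n\n pattern failed on the test message. It only shows why the distinction matters for a mandatory \n.

```python
import re

# Two hypothetical header lines, each ending in a literal
# backslash-r-backslash-n (four characters, not a real CR LF).
lines = ["Content-Type: text/html\\r\\n", "Subject: test\\r\\n"]

# Newline used as a separator: nothing follows the last line.
as_separator = "\n".join(lines)
# Newline used as a terminator: every line, including the last, ends in one.
as_terminator = "".join(line + "\n" for line in lines)

mandatory = re.compile(r"\\r\\n\n")   # the newline must be present
optional = re.compile(r"\\r\\n\n?")   # the newline may be absent

def count(pattern, text):
    """Number of non-overlapping matches of pattern in text."""
    return len(pattern.findall(text))

for name, text in [("separator", as_separator), ("terminator", as_terminator)]:
    print(name, "mandatory:", count(mandatory, text), "optional:", count(optional, text))
```

With a mandatory \n, the last line is missed when newlines are only separators. With \n? (or with no \n in the pattern at all), both layouts give the same count.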