Hi, Martin -

Thank you for your response. The original test used a file arbitrarily named aa.html. It still doesn't work with the rewrite you provided :/
> -----Original Message-----
> From: Martin Gregorie [mailto:mar...@gregorie.org]
> Sent: Friday, May 31, 2013 3:38 PM
> To: users@spamassassin.apache.org
> Subject: Re: Rule to scan for .html attachments?
>
> On Fri, 2013-05-31 at 14:45 -0400, Andrew Talbot wrote:
> > I need it to fire on any HTML attachment. The modules are enabled. I
> > can get it to pick up text/html, remember, but the problem is that it
> > detects messages sent as HTML when it's set up like that. It doesn't
> > detect plain-text messages, but it will flag plain-text messages with
> > HTML files attached.
> >
> Well, that's exactly what your second rule won't do: it will only fire
> on the header of an html attachment for a file that has one of a very
> restricted set of filenames. As you haven't posted any example MIME
> header sets I can only guess, but my guess is that none of the messages
> you've tried it against have attachments with names that match the
> restriction.
>
> As I said before, the rule can't work with the '^' in place, because
> that says that the 'filename=....' string must be at the beginning of a
> line and NOT preceded by any white space. That's a harmful restriction
> because you never see MIME headers like that. With the '^' removed the
> rule becomes:
>
> header HTML_ATTACH_RULE_2 Content-Disposition =~ /filename\=\"[a-z]{2}\.html\"/i
>
> which has a better chance of working. This version will only fire if
> the filename associated with the attachment has precisely two alphabetic
> characters plus a .html extension, i.e. it will fire on
> filename="aa.html" or filename="ZZ.HTML" because the trailing 'i' makes
> it a caseless match, but it won't fire on filename="cat.html" or
> filename="x.html" because these don't have two-character names, and it
> won't fire if the attachment follows the common Windows convention of
> using a .htm extension.
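
The matching behaviour described in the quoted paragraph can be checked outside SpamAssassin. Below is a minimal sketch in Python, on the assumption that its `re` module treats this particular pattern the same way Perl does (the pattern uses no Perl-only features). The sample header values are made up for illustration, and the anchored pattern is only a guess at the form of the earlier rule, which is not shown in this thread.

```python
import re

# Pattern from the quoted rule with the '^' removed; Perl's /i maps to re.IGNORECASE.
TWO_LETTER_HTML = re.compile(r'filename\=\"[a-z]{2}\.html\"', re.IGNORECASE)

# Assumed form of the earlier anchored rule (a guess, not taken from the thread).
ANCHORED = re.compile(r'^filename\=\"[a-z]{2}\.html\"', re.IGNORECASE)

# Hypothetical Content-Disposition values, invented for this example.
SAMPLES = [
    'attachment; filename="aa.html"',
    'attachment; filename="ZZ.HTML"',
    'attachment; filename="cat.html"',
    'attachment; filename="x.html"',
    'attachment; filename="aa.htm"',
]


def fires(pattern, header_value):
    """Return True if the pattern matches anywhere in the header value."""
    return pattern.search(header_value) is not None


# Map each sample value to whether the unanchored pattern matches it.
results = {value: fires(TWO_LETTER_HTML, value) for value in SAMPLES}

for value, hit in results.items():
    print(f"{'fires' if hit else 'no hit':7} {value}")

# The anchored form cannot match when anything precedes 'filename='.
print("anchored:", fires(ANCHORED, 'attachment; filename="aa.html"'))
```

This only exercises the regular expression itself. It says nothing about which headers SpamAssassin actually hands to the rule.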
>
> If you want the rule to fire on *any* HTML attachment it should be:
>
> header HTML_ATTACH_RULE_2 Content-Disposition =~ /filename\=\".{0,30}\.html{0,1}\"/i
>
> which will match any filename with a .html or .htm extension (including
> ".html" and ".htm").
>
> Could I respectfully suggest that you learn about Perl regular
> expressions before you try writing any more SA rules? SA rules are all
> based on using the Perl flavour of regular expressions to match
> character strings in headers and the message body.
>
> You could do a lot worse than getting a copy of "Programming Perl" by
> Larry Wall, Tom Christiansen & Jon Orwant, published by O'Reilly. If
> there isn't one in the firm's technical library, they should be willing
> to buy a copy. It's a brick of a book, but you only need to read
> "Chapter 5: Pattern Matching" to write SA rules, and in any case the
> rest of its contents will come in handy in future if anybody needs to
> write Perl programs or SA extension modules.
>
>
> Martin
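
The pattern in the final quoted rule can be tried on its own, outside SpamAssassin. This is a rough sketch in Python, assuming `re` handles this pattern as Perl would. The header values and the `matches_any_html` helper are hypothetical, invented for illustration. It also shows one consequence of `.{0,30}`: a name with more than 30 characters before the extension is not matched.

```python
import re

# Pattern from the quoted "any HTML attachment" rule; /i maps to re.IGNORECASE.
ANY_HTML = re.compile(r'filename\=\".{0,30}\.html{0,1}\"', re.IGNORECASE)


def matches_any_html(header_value):
    """Return True if the pattern matches anywhere in the header value."""
    return ANY_HTML.search(header_value) is not None


# Hypothetical Content-Disposition values, invented for this example.
long_ok = 'attachment; filename="' + 'a' * 30 + '.html"'
too_long = 'attachment; filename="' + 'a' * 31 + '.html"'

samples = [
    'attachment; filename="report.html"',
    'attachment; filename="INVOICE.HTM"',
    'attachment; filename=".htm"',
    'attachment; filename="notes.txt"',
    long_ok,
    too_long,
]

# Map each sample value to whether the pattern matches it.
outcome = {value: matches_any_html(value) for value in samples}

for value, hit in outcome.items():
    print(f"{'match' if hit else 'no match':9} {value}")
```

If longer names matter, the upper bound in `.{0,30}` is the part to adjust. The sketch tests only the regex, not how SpamAssassin applies it.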