Hi, Martin -

Thank you for your response. The original test used a file arbitrarily
named aa.html. It still doesn't work with the rewrite you provided :/





> -----Original Message-----
> From: Martin Gregorie [mailto:mar...@gregorie.org]
> Sent: Friday, May 31, 2013 3:38 PM
> To: users@spamassassin.apache.org
> Subject: Re: Rule to scan for .html attachments?
> 
> On Fri, 2013-05-31 at 14:45 -0400, Andrew Talbot wrote:
> > I need it to fire on any HTML attachment. The modules are enabled. I
> > can get it to pick up text/html, remember, but the problem is that it
> > detects messages sent as HTML when it's set up like that. It doesn't
> > detect plain-text messages, but it will flag plain-text messages with
> > HTML files attached.
> >
> Well, that's exactly what your second rule won't do: it will only fire on the
> header of an html attachment for a file that has one of a very restricted set
> of filenames. As you haven't posted any example MIME header sets I can
> only guess, but my guess is that none of the messages you've tried it against
> have attachments with names that match the restriction.
> 
> As I said before, the rule can't work with the '^' in place, because that says
> that the 'filename=....' string must be at the beginning of a line and NOT
> preceded by any white space. That's a harmful restriction because you never
> see MIME headers like that. With the '^' removed the rule becomes:
> 
> header HTML_ATTACH_RULE_2 Content-Disposition =~ /filename\=\"[a-z]{2}\.html\"/i
> 
> which has a better chance of working. This version will only fire if the
> filename associated with the attachment has precisely two alphabetic
> characters plus a .html extension, i.e. it will fire on filename="aa.html" or
> filename="ZZ.HTML" because the trailing 'i' makes it a caseless match, but it
> won't fire on filename="cat.html" or filename="x.html" because these don't
> have two-character names, and it won't fire if the attachment follows the
> common Windows convention of using a .htm extension.
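
A rough way to check the behaviour described above is to run the same
pattern outside SpamAssassin. The sketch below uses Python rather than Perl;
Python's re module handles these particular constructs the same way Perl does.
The header values are made up for illustration, and the anchored variant
assumes the '^' sat directly in front of 'filename', since the original rule
is not quoted in this message.

```python
import re

# Unanchored pattern copied from the rule above. The anchored variant is an
# assumed reconstruction of the earlier rule (not shown in this thread).
ANCHORED = re.compile(r'^filename\=\"[a-z]{2}\.html\"', re.IGNORECASE)
UNANCHORED = re.compile(r'filename\=\"[a-z]{2}\.html\"', re.IGNORECASE)

# Hypothetical Content-Disposition values, made up for illustration.
SAMPLES = [
    'attachment; filename="aa.html"',
    'attachment; filename="ZZ.HTML"',
    'attachment; filename="cat.html"',
    'attachment; filename="x.html"',
    'attachment; filename="aa.htm"',
]


def matches(pattern, value):
    """Return True if the pattern is found anywhere in the header value."""
    return pattern.search(value) is not None


# Map each sample to (anchored result, unanchored result).
results = {v: (matches(ANCHORED, v), matches(UNANCHORED, v)) for v in SAMPLES}

for value, (anchored, unanchored) in results.items():
    print(f"{value!r}: anchored={anchored} unanchored={unanchored}")
```

With these samples the anchored variant never matches, because the value
starts with "attachment;" rather than "filename=", and the unanchored one
matches only the two-letter .html names.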
> 
> If you want the rule to fire on *any* HTML attachment it should be:
> 
> header HTML_ATTACH_RULE_2 Content-Disposition =~ /filename\=\".{0,30}\.html{0,1}\"/i
> 
> which will match any filename with a .html or .htm extension (including
> ".html" and ".htm").
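
The broader pattern can be exercised the same way. This is only a sketch, in
Python rather than Perl, and the header values are again invented for
illustration. It also shows two limits implied by the pattern itself: the
{0,30} bound excludes names longer than 30 characters, and the literal quotes
mean an unquoted filename is not matched.

```python
import re

# Pattern copied from the "any HTML attachment" rule above.
ANY_HTML = re.compile(r'filename\=\".{0,30}\.html{0,1}\"', re.IGNORECASE)

# Hypothetical Content-Disposition values mapped to the expected result.
SAMPLES = {
    'attachment; filename="cat.html"': True,
    'attachment; filename="INDEX.HTM"': True,
    'attachment; filename=".html"': True,
    'attachment; filename="report.pdf"': False,
    'attachment; filename="notes.html.txt"': False,
    # A name longer than 30 characters falls outside .{0,30}.
    'attachment; filename="' + "a" * 40 + '.html"': False,
    # The pattern requires the quotes, so an unquoted name is missed.
    'attachment; filename=cat.html': False,
}

# Run the pattern against every sample and record whether it matched.
observed = {value: ANY_HTML.search(value) is not None for value in SAMPLES}

for value, hit in observed.items():
    print(f"{hit!s:5} {value}")
```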
> 
> Could I respectfully suggest that you learn about Perl regular expressions
> before you try writing any more SA rules? SA rules are all based on using the
> Perl flavour of regular expressions to match character strings in headers and
> the message body.
> 
> You could do a lot worse than getting a copy of "Programming Perl" by Larry
> Wall, Tom Christiansen & Jon Orwant, published by O'Reilly. If there isn't one
> in the firm's technical library, they should be willing to buy a copy. It's a
> brick of a book, but you only need to read "Chapter 5: Pattern Matching" to
> write SA rules, and in any case the rest of its contents will come in handy in
> future if anybody needs to write Perl programs or SA extension modules.
> 
> 
> Martin
> 
> 
> 
> 

