On Fri, 2011-02-18 at 15:53 -0800, Adam Katz wrote:
> > Ah, good one. Though unfortunately, and I hate to admit that, both our
> > rules will never match. The # hash needs to be escaped... *sigh*
> > 
> >   [/:?\#]
> > 
> > Or just ignore it by leaving it out. It's pretty rare, anyway.
> 
> Hash (#), like At (@) and sometimes Dollar ($), is handled
> inconsistently by the SA parser.  When in doubt, escape it, but I believe
> it is correctly parsed when delimited with m''.

Not on this machine. I used tilde as RE delimiter, but just checked with
a single tick, too. Hash needs to be escaped in both cases.

There definitely is inconsistent behavior with @, IIRC due to Perl
treating it as a variable in some cases. I believe with @, it is safe to
"always be in doubt" and just escape it regardless. The hash, though, I
believe will consistently be treated as the start of a comment.

  $ spamassassin --lint --cf="uri FOO m'\.tld(?:[/:?#]|$)'"
  [28717] warn: config: invalid regexp for rule FOO:
    m'\.tld(?:[/:?: missing or invalid delimiters
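
The truncated regexp in that warning is what you would expect if everything
from the first unescaped # onwards is dropped before the rule is parsed. A
minimal Python sketch of that idea, purely as an illustration: this is an
assumption about the behavior, not SA's actual code, and strip_comment is a
hypothetical helper name. It also does not handle an escaped backslash
directly in front of the hash.

```python
import re


def strip_comment(line: str) -> str:
    """Cut a config line at the first '#' that is not preceded by a backslash.

    Hypothetical helper, only meant to mimic the comment handling described
    in the docs; not taken from SpamAssassin itself.
    """
    match = re.search(r"(?<!\\)#", line)
    if match is None:
        return line
    return line[:match.start()].rstrip()


unescaped = r"uri FOO m'\.tld(?:[/:?#]|$)'"
escaped = r"uri FOO m'\.tld(?:[/:?\#]|$)'"

# The unescaped hash ends the line early, so the closing ' delimiter is lost.
print(strip_comment(unescaped))
# The escaped hash is left alone and the whole rule survives.
print(strip_comment(escaped))
```

With the unescaped variant, what is left of the rule ends at "[/:?", the same
place where the regexp in the lint warning stops.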

Almost every rule definition in the docs repeats the following statement
like a mantra. That's what makes it so embarrassing. ;)

 "Note: as per the header tests, # must be escaped (\#) or else it is
  considered the beginning of a comment."


> The issue with $ is moot in m//, but m'foo$' and several other
> punctuation-based delimiters trigger various obscure perl variables,
> which I believe include  $'  $&  $`  $+   ... A workaround is to use \Z
> (which is usually the same thing) or (?:$) or a different delimiter.

Oh, really? That would be news to me, I figured the Perl parser would be
smarter than to mix RE delimiters like that. Are you positive about
this?

And btw, there also is a $/ variable in Perl.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
