On Fri, 2011-02-18 at 15:53 -0800, Adam Katz wrote:
> > Ah, good one. Though unfortunately, and I hate to admit that, both our
> > rules will never match. The # hash needs to be escaped... *sigh*
> >
> >   [/:?\#]
> >
> > Or just ignore it by leaving it out. It's pretty rare, anyway.
>
> Hash (#), like At (@) and sometimes Dollar ($), has an inconsistent
> behavior of the SA parser. When in doubt, escape it, but I believe it
> is correctly parsed when delimited with m''

Not on this machine. I used tilde as RE delimiter, but just checked with
a single tick, too. Hash needs to be escaped in both cases.

There definitely is inconsistent behavior with @, IIRC due to Perl
treating it as a variable in some cases. I believe with @, it is safe to
"always be in doubt" and just escape it regardless. The hash, though, I
believe will consistently be treated as the start of a comment.

  $ spamassassin --lint --cf="uri FOO m'\.tld(?:[/:?#]|$)'"
  [28717] warn: config: invalid regexp for rule FOO: m'\.tld(?:[/:?: missing or invalid delimiters

Almost every rule definition in the docs repeats the following statement
like a mantra. That's what makes it so embarrassing. ;)

  "Note: as per the header tests, # must be escaped (\#) or else it is
   considered the beginning of a comment."


> The issue with $ is moot in m//, but m'foo$' and several other
> punctuation-based delimiters trigger various obscure perl variables,
> which I believe include $' $& $` $+ ... A workaround is to use \Z
> (which is usually the same thing) or (?:$) or a different delimiter.

Oh, really? That would be news to me. I figured the Perl parser would be
smarter than to mix up RE delimiters like that. Are you positive about
this?

And btw, there also is a $/ variable in Perl.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
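
The truncated regexp in the lint warning above (m'\.tld(?:[/:? with
everything from the # onward gone) is consistent with comments being
stripped from the config line before the rule is parsed. Below is a
minimal Python sketch of that idea. It is not SpamAssassin's actual code:
the function name strip_comment is hypothetical, and it assumes the
parser cuts the line at the first # that is not preceded by a backslash
and then turns \# back into a literal #.

```python
import re

def strip_comment(line: str) -> str:
    """Hypothetical model of config comment stripping (an assumption,
    not SpamAssassin's real parser): drop everything from the first
    unescaped '#', then unescape any remaining '\\#'."""
    match = re.search(r'(?<!\\)#', line)   # first '#' not preceded by a backslash
    if match:
        line = line[:match.start()]
    return line.replace(r'\#', '#').rstrip()

unescaped = r"uri FOO m'\.tld(?:[/:?#]|$)'"
escaped = r"uri FOO m'\.tld(?:[/:?\#]|$)'"

print(strip_comment(unescaped))  # rule is cut off inside the character class
print(strip_comment(escaped))    # rule survives with a literal '#'
```

Under that assumption, the unescaped rule loses its closing delimiter, so
the choice of RE delimiter (tilde, single tick, slash) never comes into
play, while the escaped form reaches the regexp compiler intact.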