On Thu, 2014-05-22 at 15:49 -0400, James B. Byrne wrote:
> I am clearly missing something with these rules but I lack the experience to
> see what it is:
>
> score RAW_BLANK_LINES_05 0.5
> rawbody RAW_BLANK_LINES_05 /(\r?\n){5,9}/i

Why is everyone trying to match empty lines these days? Must be spam I'm
missing out on. ;)

> I passed it to spamassassin from the command line with the above rules in
> /etc/mail/spamassassin/local.cf and nothing was reported. I used an actual
> message body from a spam message received and only the RAW_BLANK_LINES_05 test
> is tripped even though the body of that message has 18 consecutive blank
> lines, also consisting of nothing but \n characters.
>
> So what is it about the regexp I am using that I evidently do not understand?

See the post "Consecutive Newlines in Rawbody Rules" as of a few minutes ago,
follow-up to the Bayes refinement thread.

In a nutshell: 12 or more consecutive newlines cannot be matched with rawbody
rules. They get replaced by 2 newlines.

There's another issue with your approach of different rules matching "up to n"
occurrences and "more than n". The first will always match in addition, if the
latter matches. If the desired behavior is mutually exclusive matching, you
need meta rules actually encoding the math / logic.

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
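
To illustrate the point about overlapping "up to n" and "more than n" rules, here is a minimal sketch in Python. It only exercises the regular expressions on a plain string and does not model how SpamAssassin prepares the rawbody text, so the newline replacement mentioned above is deliberately left out. The second pattern and the name RAW_BLANK_LINES_10 are assumptions made for the example; only the {5,9} pattern comes from the quoted rule.

```python
import re

# Perl-style patterns carried over unchanged; Python's re reads them the same way.
UP_TO_9 = re.compile(r"(\r?\n){5,9}")      # pattern from the quoted rule
AT_LEAST_10 = re.compile(r"(\r?\n){10,}")  # hypothetical "more than n" companion


def raw_hits(text):
    """Evaluate both patterns independently, as two unrelated rules would be."""
    return {
        "RAW_BLANK_LINES_05": bool(UP_TO_9.search(text)),
        "RAW_BLANK_LINES_10": bool(AT_LEAST_10.search(text)),
    }


def exclusive_hits(text):
    """Meta-style logic: the lower tier fires only if the higher one does not."""
    hits = raw_hits(text)
    return {
        "RAW_BLANK_LINES_05": hits["RAW_BLANK_LINES_05"]
        and not hits["RAW_BLANK_LINES_10"],
        "RAW_BLANK_LINES_10": hits["RAW_BLANK_LINES_10"],
    }


# 18 consecutive newlines, as in the message described in the question.
sample = "header text" + "\n" * 18 + "trailing text"
print(raw_hits(sample))        # both True: {5,9} is satisfied by the first 9 newlines
print(exclusive_hits(sample))  # only RAW_BLANK_LINES_10 remains True
```

The unanchored {5,9} quantifier is satisfied by any 5 to 9 newlines inside a longer run, which is why the lower rule fires too. In SpamAssassin itself the exclusive variant would presumably be a meta rule combining two sub-rules with && and !, mirroring the boolean expression in exclusive_hits.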