On Fri, 2010-08-20 at 20:54 +0200, Karsten Bräckelmann wrote:
> Because it depends. Some lists are suitable for deep-parsing. Some are
> not.
>
>
> Moreover, IMHO you are barking up the wrong tree. In your OP you said, a
> message has been *rejected* by your SMTP. Yet, you are focusing entirely
> on the RCVD_IN_BL_SPAMCOP_NET and RCVD_IN_SORBS_WEB hits. Which by
> itself won't even push the score above the default spam threshold.
>
> Thus, very vital but left out parts to the puzzle are, (a) which rules
> triggered in addition to them, and (b) at what threshold does your SMTP
> reject a message?
>
> The combined score of these rules is nowhere even close to a sensible
> rejection limit. Whatever else the message tripped on, it accounts for
> the lion's share.

Just to back up my claim with numbers, here are the scores for both the
3.2 and 3.3 branches. Minimally edited for readability.

  $ egrep 'RCVD_IN_(BL_SPAMCOP_NET|SORBS_WEB)' 3.[23]/rules/50_scores.cf
  3.2/rules/50_scores.cf: score RCVD_IN_BL_SPAMCOP_NET  0 2.188 0 1.960
  3.2/rules/50_scores.cf: score RCVD_IN_SORBS_WEB       0 1.117 0 0.619
  3.3/rules/50_scores.cf: score RCVD_IN_BL_SPAMCOP_NET  0 1.246 0 1.347
  3.3/rules/50_scores.cf: score RCVD_IN_SORBS_WEB       0 0.614 0 0.770

As you can see, even the aging 3.2 rule-set with Bayes disabled scores
these at ~3.3 combined. That is the worst possible combination, and it
still leaves some way to go before crossing the spam threshold of 5.0.
Enabling Bayes, or using the latest stable SA release, only widens that
margin.

These scores were optimized for a spam threshold of 5.0 (at the time of
their creation), to *minimize* misclassification [1] while treating FPs
as more severe than FNs.

With that in mind, a sensible reject limit is 8.0 or even higher [2]. In
that case the remaining hits account for roughly 4.7 or more. In other
words, they would (almost) have pushed the message in question over the
spam threshold on their own, even without those RCVD_IN_* hits.


[1] It is impossible to eliminate FPs and FNs at the same time. In your
    OP you mentioned a single rejected message, right?  ;)
[2] Based on experience and years of discussion on this list.

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
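
For anyone who wants to re-check the arithmetic, here is a minimal Python sketch. It only re-adds the four score lines quoted in this message. The names (SCORES, combined, worst_case) are made up for illustration and are not part of SpamAssassin. The column order is assumed to follow the usual score-set convention (no Bayes/no net, no Bayes/net, Bayes/no net, Bayes/net), and the 8.0 reject limit is the suggestion from this message, not a default.

```python
# Scores as quoted in the message: branch -> rule -> four score-set values.
# Assumed column order: (no Bayes, no net), (no Bayes, net),
#                       (Bayes, no net),    (Bayes, net).
SCORES = {
    "3.2": {
        "RCVD_IN_BL_SPAMCOP_NET": (0, 2.188, 0, 1.960),
        "RCVD_IN_SORBS_WEB": (0, 1.117, 0, 0.619),
    },
    "3.3": {
        "RCVD_IN_BL_SPAMCOP_NET": (0, 1.246, 0, 1.347),
        "RCVD_IN_SORBS_WEB": (0, 0.614, 0, 0.770),
    },
}

SET_LABELS = ("no Bayes, no net", "no Bayes, net", "Bayes, no net", "Bayes, net")

SPAM_THRESHOLD = 5.0  # default spam threshold mentioned in the message
REJECT_LIMIT = 8.0    # "sensible reject limit" suggested in the message


def combined(branch):
    """Sum both rules per score set for one branch."""
    rules = SCORES[branch].values()
    return tuple(round(sum(column), 3) for column in zip(*rules))


def worst_case():
    """Highest combined score over all branches and score sets."""
    return max(max(combined(branch)) for branch in SCORES)


if __name__ == "__main__":
    for branch in sorted(SCORES):
        for label, total in zip(SET_LABELS, combined(branch)):
            print(f"{branch} ({label}): {total:.3f}")
    worst = worst_case()
    print(f"worst case: {worst:.3f}")
    print(f"short of spam threshold by: {SPAM_THRESHOLD - worst:.3f}")
    print(f"left for other rules to reach reject limit: {REJECT_LIMIT - worst:.3f}")
```

The worst case is the 3.2 branch with Bayes disabled and network tests enabled, which matches the ~3.3 figure above.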