Kevin Sullivan wrote:
> On Jun 2, 2005, at 8:27 PM, Matt Kettler wrote:
>
>> If one's wrong, they are ALL wrong.
>>
>> SA's rule scores are evolved based on a real-world test of a
>> hand-sorted corpus of fresh spam and ham. The whole scoreset is
>> evolved simultaneously to optimize the placement pattern.
>>
>> Of course, one thing that can affect accuracy is if some spams are
>> accidentally misplaced into the ham pile it can cause some heavy score
>> biasing to occur. A little bit of this is unavoidable, as human
>> mistakes happen, but a lot of it will cause deflated scores and a lot
>> of FNs.
>
> The rule scores are optimized for the spam which was sent at the time
> that version of SA was released (actually, at the time the rule scoreset
> was calculated). Since then, the static SA rules have become less
> useful since spammers now write their messages to avoid them. The only
> rules which spammers cannot easily avoid are the dynamic ones: bayes
> and network checks (RBLs, URIBLs, razor, etc).
>
> On my systems, I raise the scores for the dynamic tests since they are
> the only ones which hit a lot of today's spam.
Very true. Spammers quickly adapt to most of the static tests (i.e., body rule sets like antidrug) after an SA release, so those rules lose some effectiveness over time. However, some dynamic tests have too high an FP rate to have their scores raised very much.

Before raising a score, at least check the rule's S/O ratio in the STATISTICS*.txt files. For example, RAZOR2_CHECK has an S/O somewhere near 98% (splitting the difference between set1 at 97.6% and set3 at 98.2%). That means about 2% of the messages matched by this rule were in the nonspam pile. It may not sound bad, but 98% (a 2% FP rate) is a factor of 20 worse than 99.9% (a 0.1% FP rate). (Compare the results for RCVD_IN_XBL or URIBL_OB_SURBL to RAZOR2_CHECK, for example.)
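To make the arithmetic concrete, here's a minimal sketch (plain Python, nothing SA-specific) that turns S/O values into the fraction of a rule's hits that were ham and compares them against a 0.1% baseline. The RAZOR2_CHECK numbers are the ones quoted above; HYPOTHETICAL_LOW_FP_RULE is a made-up stand-in for a cleaner network test such as RCVD_IN_XBL or URIBL_OB_SURBL, not a measured value.

#!/usr/bin/env python3
# Sketch: compare rules by the share of their hits that landed in the ham
# corpus, derived from the S/O column in SpamAssassin's STATISTICS*.txt files.

so_ratios = {
    "RAZOR2_CHECK (set1)": 0.976,        # value quoted above
    "RAZOR2_CHECK (set3)": 0.982,        # value quoted above
    "HYPOTHETICAL_LOW_FP_RULE": 0.999,   # made-up placeholder
}

def ham_hit_fraction(s_o: float) -> float:
    """Fraction of the rule's hits that were ham (1 - S/O)."""
    return 1.0 - s_o

baseline = ham_hit_fraction(0.999)   # a 0.1% ham-hit rate for comparison
for name, s_o in so_ratios.items():
    frac = ham_hit_fraction(s_o)
    print(f"{name}: {frac:.1%} of hits were ham "
          f"({frac / baseline:.0f}x the 0.1% baseline)")

Running that prints roughly 2.4% and 1.8% for the two RAZOR2_CHECK sets, i.e. around 20x the baseline, which is where the "factor of 20" above comes from.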