Kevin Sullivan wrote:
> On Jun 2, 2005, at 8:27 PM, Matt Kettler wrote:
> 
>> If one's wrong, they are ALL wrong.
>>
>> SA's rule scores are evolved based on a real-world test of a
>> hand-sorted corpus of fresh spam and ham. The whole scoreset is
>> evolved simultaneously to optimize the placement pattern.
>>
>> Of course, one thing that can affect accuracy is spam accidentally
>> misplaced into the ham pile; that can cause some heavy score biasing.
>> A little of this is unavoidable, as human mistakes happen, but a lot of
>> it will cause deflated scores and a lot of FNs.
> 
> 
> The rule scores are optimized for the spam which was sent at the time
> that version of SA was released (actually, at the time the rule scoreset
> was calculated).  Since then, the static SA rules have become less
> useful because spammers now write their messages to avoid them.  The
> only rules which spammers cannot easily avoid are the dynamic ones:
> bayes and network checks (RBLs, URIBLs, Razor, etc.).
> 
> On my systems, I raise the scores for the dynamic tests since they are
> the only ones which hit a lot of today's spam.
> 

Very true. Spammers quickly adapt to most of the static tests (i.e. body rule
sets like antidrug) after an SA release, so those rules lose some effectiveness
over time.

However, some dynamic tests have too high an FP rate to have their scores
raised very much. Before raising a score, at least check the rule's S/O ratio
in the STATISTICS*.txt files.
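
If you do decide to bump one of the low-FP network tests, it's a one-line
override in local.cf. The rule names below are real, but the scores are just
an illustration of the syntax, not a recommendation:

    score RCVD_IN_XBL     4.0
    score URIBL_OB_SURBL  4.0

A single value applies to all scoresets; local.cf also accepts four values if
you want per-scoreset scores.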

For example, RAZOR2_CHECK has an S/O somewhere near 98% (splitting the
difference between set1 at 97.6% and set3 at 98.2%). This means that about 2%
of the emails matched by this rule were in the nonspam pile.

That may not sound bad, but an S/O of 98% (a 2% FP rate) is a factor of 20
worse than 99.9% (a 0.1% FP rate). (Compare the results for RCVD_IN_XBL or
URIBL_OB_SURBL to RAZOR2_CHECK, for example.)
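
If it helps to see the arithmetic spelled out, here's a quick sketch in
Python. The 99.9% figure is just an assumed stand-in for a low-FP rule like
RCVD_IN_XBL, not a number lifted from the STATISTICS files:

    # Illustrative only: S/O is the fraction of a rule's hits that were spam,
    # so 1 - S/O is the fraction of hits that landed on ham.
    razor_so = 0.98    # roughly what set1/set3 show for RAZOR2_CHECK
    clean_so = 0.999   # assumed S/O for a low-FP rule such as RCVD_IN_XBL
    razor_fp = 1 - razor_so    # 0.02  -> 2% of hits were ham
    clean_fp = 1 - clean_so    # 0.001 -> 0.1% of hits were ham
    print(razor_fp / clean_fp) # ~20, the "factor of 20" above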
