On Monday, February 20, 2006, 12:39:31 PM, Theo Dinter wrote:
> Just for some info... I went through the set1 spam logs for 3.1 score
> generation.
>
> 1112804 total messages
>  776108 messages hit SURBL
>  138407 1 SURBL list(s) hit (1+ = 776108)
>  189795 2 SURBL list(s) hit (2+ = 637701)
>  281255 3 SURBL list(s) hit (3+ = 447906)
>  136964 4 SURBL list(s) hit (4+ = 166651)
>   29685 5 SURBL list(s) hit (5+ = 29687)
>       2 6 SURBL list(s) hit (6+ = 2)
>
> The set1 ham logs:
>
>  477629 total messages
>    1023 messages hit SURBL
>     992 1 SURBL list(s) hit (1+ = 1023)
>      23 2 SURBL list(s) hit (2+ = 31)
>       5 3 SURBL list(s) hit (3+ = 8)
>       3 4 SURBL list(s) hit (4+ = 3)
>       0 5 SURBL list(s) hit (5+ = 0)
>       0 6 SURBL list(s) hit (6+ = 0)
>
> So from these results, the FP rate is very low for SURBL (0.21%), and
> while there is a ton of overlap for spam (57.3%), there's very little
> for ham (0.01%).

Thank you for the data. It seems to support what we've been saying.
With 138407 spam messages hitting only one SURBL list, single-list hits
are a significant share of the corpus, so substantially lowering the
score for a single list hit may result in significant FNs.

Cheers,

Jeff C.
--
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/
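
For anyone who wants to check the arithmetic, here is a rough sketch in Python that recomputes the "N+" cumulative counts and the percentages from the per-list counts quoted above. The counts are copied from the quoted logs. The variable and function names are made up for illustration and are not part of SpamAssassin or the mass-check tools. It assumes "FP rate" means ham messages hitting at least one list and "overlap" means messages hitting two or more lists, both as a fraction of all messages in that corpus.

```python
# Counts copied from the quoted set1 logs.
# Keys are the number of SURBL lists a message hit.
SPAM_TOTAL = 1112804
SPAM_HITS = {1: 138407, 2: 189795, 3: 281255, 4: 136964, 5: 29685, 6: 2}
HAM_TOTAL = 477629
HAM_HITS = {1: 992, 2: 23, 3: 5, 4: 3, 5: 0, 6: 0}


def at_least(hits):
    """Return {n: number of messages that hit n or more lists}."""
    running = 0
    out = {}
    for n in sorted(hits, reverse=True):
        running += hits[n]
        out[n] = running
    return out


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


spam_cum = at_least(SPAM_HITS)
ham_cum = at_least(HAM_HITS)

# Assumed definitions (see the note above the code).
fp_rate = pct(ham_cum[1], HAM_TOTAL)            # ham hitting any list
spam_overlap = pct(spam_cum[2], SPAM_TOTAL)     # spam hitting 2+ lists
ham_overlap = pct(ham_cum[2], HAM_TOTAL)        # ham hitting 2+ lists
single_share = pct(SPAM_HITS[1], SPAM_TOTAL)    # spam hitting exactly 1 list

print(f"FP rate:           {fp_rate:.2f}%")
print(f"spam overlap:      {spam_overlap:.1f}%")
print(f"ham overlap:       {ham_overlap:.2f}%")
print(f"single-list spam:  {single_share:.1f}% of all spam")
```

Under those assumptions the quoted 0.21%, 57.3% and 0.01% figures come out the same, and the last line puts the single-list spam hits at a bit over 12% of the spam corpus, which is the share that would be affected by lowering the single-list score.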