On Friday, February 17, 2006, 5:34:57 PM, Matthew Eerde wrote:
> It's not particularly important how many URLs the lists have in
> common. What is important is how many *false positives* the
> lists have in common... or more to the point, whether a given "good" URL
> is more likely to be on (say) JP given that it's on (say) SC.
> If
> P(JP | SC ^ good) >> P(JP | good)
> then your point is accurate, and the rules should be re-scored.
> Otherwise the rules are fine.

The FP rates on SC and JP are consistently very low across the
scores that have been posted by many people. OB and WS have
higher FP rates, but they don't tend to have a lot of FPs in
common. We review our data for FPs every day.

Unless someone can provide some examples of FPs on multiple SURBL
lists, I think Matt K. may be worried about a situation that
doesn't happen very often.

We are constantly looking for FPs and FNs and for ways to improve
performance. We obsess over FPs as much as anyone, and frankly
we don't find many. If anyone does spot any FPs, I sure wish
they would report them to us:

  whitelist at surbl.org

Instead of making accusations that seem vague and groundless,
please give us feedback on any FPs!

Jeff C.
--
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/