Matt Kettler wrote:
> I'll even re-quote myself:
>> I personally would like to see some statistics, but at this point, we
>> don't have any test data on this so we're arguing your theory vs mine.
> And your quote that I was counter-pointing:
>> As you can see the performance of the lists are different, and the way
>> they're created is different too.
>
> I don't see enough of a difference to clearly rule out significant overlap.
>
> I'll define my test of "significant overlap" as:
> >10% of total hits redundant across 3 or more lists and >1% nonspam hits
> redundant across 2 or more lists.
>

Messages received today that are double-listed in two or more of SC, JP,
AB, OB and WS:

    grep "SURBL_MULTI2" /var/log/maillog |grep "Feb 17" |wc -l
    292

All surbl.org hits in the same timeframe (includes ph, but no matter):

    grep "_SURBL" /var/log/maillog |grep "Feb 17" |wc -l
    583

So we have at least a 50% double-listing rate (292 of 583). That in and of
itself isn't much of a problem, but it also doesn't rule out overlap. It's
still a whole lot higher than my first criterion of 10% overlap.

However, right now I don't have more than 100 FPs, so I can't really
comment on the nonspam hit rate of SURBL_MULTI2. That's the important one.

I also added multi3, multi4 and another rule to detect overlap between
uribl.com's black and surbl.org:

    meta URIBL_BLACK_OVERLAP (URIBL_BLACK && (URIBL_AB_SURBL || URIBL_JP_SURBL || URIBL_OB_SURBL || URIBL_WS_SURBL || URIBL_SC_SURBL))
    score URIBL_BLACK_OVERLAP -1.0

I'll see what kind of runtime data I can gather based on these rules over
the weekend.
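
For anyone who wants to repeat the two grep counts above on their own logs, here is a minimal sketch that folds them into one helper and prints the double-listing rate. The function name overlap_rate and its arguments are hypothetical, and it assumes each scanned message produces a single log line that names every rule it hit, which is what the greps above rely on.

```shell
#!/bin/sh
# Hypothetical helper (not part of SpamAssassin):
#   overlap_rate LOGFILE "DATE"
# Counts log lines for DATE that hit SURBL_MULTI2 against all lines
# for DATE that hit any *_SURBL rule, and prints the ratio.
overlap_rate() {
    log=$1
    day=$2
    # grep -c prints the number of matching lines; "|| true" only
    # masks grep's non-zero exit status when that number is 0.
    multi=$(grep "$day" "$log" | grep -c "SURBL_MULTI2" || true)
    total=$(grep "$day" "$log" | grep -c "_SURBL" || true)
    awk -v m="$multi" -v t="$total" 'BEGIN {
        if (t == 0) { print "no SURBL hits"; exit }
        printf "%d of %d (%.1f%%)\n", m, t, 100 * m / t
    }'
}
```

Called as overlap_rate /var/log/maillog "Feb 17", this should give the same two counts as the separate greps. With the 292 and 583 reported above, the printed rate would be 50.1%.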