All this hubbub about not filtering the list has made me come to a realization.

The SURBL URIBLs are collectively massively over-scored in SA 3.1.0. The problem is that the SURBL lists, over time, become largely redundant with one another. A URI may be first listed by one or another of the SURBL lists, as each has its own feeds, but if it's really used in a spam run it will quickly get listed in 3 or more of them.

Take for example this ONE URI that was posted to the list:

checpri *MUNGED*.com

This is currently listed in SC, JP, and AB on SURBL.

score URIBL_AB_SURBL 0 3.306 0 3.812
score URIBL_JP_SURBL 0 3.360 0 4.087
score URIBL_SC_SURBL 0 3.600 0 4.498

In a set3 configuration that's 12.397 points, just for having ONE URI in the message. I don't know about you, but it strikes me as rather excessive.

Compare this to the RBLs supported by 3.1.0. XBL is the highest scoring RBL and it's only 3.897 in set3. You'd have to be listed in at least 5 RBLs to break 12 points with SA 3.1.0. The four highest-scoring RBLs are:

score RCVD_IN_XBL          0 3.114 0 3.897
score RCVD_IN_NJABL_SPAM   0 1.905 0 2.775
score RCVD_IN_DSBL         0 1.801 0 2.600
score RCVD_IN_WHOIS_BOGONS 0 1.811 0 2.430
                                   ----------
                                     11.702

Also consider that these lists have highly diverse listing criteria, and merely sourcing spam is not enough to get listed in all 4 of them. Yet a mere 3 URIBLs sail right past the 12 point mark with ease.

And these 3 URIBLs (as well as OB and uribl.com's BLACK) all have highly similar listing criteria. They all list on slightly different policies, but when you remove the fine details they all list based on "domains reported as spam which don't appear to be used legitimately". The differences lie in where they collect reports from, and how much checking they do for legitimacy.

The other problem is that I've seen a repeated pattern where FPs get reported to more than one list. In fact, I rarely see an FP that isn't at least double-listed. (For example, I had the download site for paid-registered upgrades to a programmer's text editor get double-listed recently. It will cost you $39.95 to get signed up for that "spam".)

This makes me wonder if SA wouldn't be better off having some kind of meta rules that simply count how many URIBLs the message is listed in, or at least some kind of score-limiting feedback on multiple hits. This would allow lists to score high individually, but prevent overlapping FPs from being driven into astronomical score levels just for containing a single URI that someone mis-reported to multiple sources.

Thoughts, concepts?
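
To make the arithmetic and the score-limiting idea a bit more concrete, here is a rough Python sketch. The scores are the set3 values quoted in this post. The function names and the cap of 5.0 are hypothetical, picked only for illustration; this is not something SA 3.1.0 does, just one way the idea could behave.

```python
# set3 scores as quoted in this post (fourth column of each "score" line).
URIBL_SET3 = {
    "URIBL_AB_SURBL": 3.812,
    "URIBL_JP_SURBL": 4.087,
    "URIBL_SC_SURBL": 4.498,
}

RBL_SET3 = {
    "RCVD_IN_XBL": 3.897,
    "RCVD_IN_NJABL_SPAM": 2.775,
    "RCVD_IN_DSBL": 2.600,
    "RCVD_IN_WHOIS_BOGONS": 2.430,
}


def additive_score(hits, scores):
    """Plain summation: every rule that hit contributes its full score."""
    return round(sum(scores[name] for name in hits), 3)


def capped_uribl_score(hits, scores, cap=5.0):
    """Hypothetical score limit: sum the URIBL hits, but never exceed `cap`.

    The cap value is an assumption for illustration, not a tuned number.
    """
    return round(min(sum(scores[name] for name in hits), cap), 3)


if __name__ == "__main__":
    all_uribls = list(URIBL_SET3)
    print("3 URIBLs, additive:", additive_score(all_uribls, URIBL_SET3))
    print("4 RBLs, additive:  ", additive_score(list(RBL_SET3), RBL_SET3))
    print("3 URIBLs, capped:  ", capped_uribl_score(all_uribls, URIBL_SET3))
    print("1 URIBL, capped:   ", capped_uribl_score(["URIBL_SC_SURBL"], URIBL_SET3))
```

With a cap like this, a single listing still scores at full strength, while a URI mis-reported to three lists no longer sails past 12 points on its own.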