On Wednesday, February 15, 2006, 7:00:33 PM, Matt Kettler wrote:

> 2) diversity of criteria:
> SURBL - all lists have nearly identical listing criteria, except PH. All but
> PH are "spotted in spam, doesn't appear to have legit use" and nothing more.
> JP, AB, SC, WS and OB are all effectively the same list with different input
> points.
The various SURBL lists and URIBL.com may have similar listing criteria, but
their original data sources and processing technologies (think listing rules
and logic) are mostly very different. That they happen to notice some of the
same domains can be taken as independent confirmation of spamminess. If so, I
think there is value in having the scores add as they do (there's a rough
sketch of how the sublist rules stack at the end of this message). A good
person to weigh in on this would be someone like Henry Stern, whose Perceptron
system is used to generate the scores.

OTOH you may have a valid point that most of the other, non-URIBL SA rules are
largely unrelated to one another, whereas the URIBL rules are all about the
same thing: inclusion in URIBL lists. Perhaps the score generation system
should not treat them like the other, mostly unrelated rules. OTOOH the
Perceptron scoring is literally results-driven, at least over the test
corpora, and it's often hard to argue with results.

> DNSBLs - lists have wildly different listing criteria. Some are identical to
> each other, but there are 4 different criteria in the top 5.

And wildly different FP rates. It's not too surprising that some are scored
quite low while others, like XBL, are scored relatively high. The low scores
are probably the only thing that keeps most of them slightly useful, unlike
lists such as XBL, which is highly useful.

BTW, if you or anyone finds any FPs on SURBLs, *****please***** report them to
whitelist at surbl dot org. We really need everyone's help with this. If you
use our data, please help improve our community with your feedback! If we
could make one condition of use, that would be it.

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/
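
To make the "scores add" point concrete: each SURBL sublist is exposed to
SpamAssassin as its own rule querying multi.surbl.org, each with its own
Perceptron-generated score, so a domain listed on several sublists trips
several rules at once. Here is a rough sketch in the style of the stock
25_uribl.cf rules; the bitmask values and the scores below are illustrative
only, not the shipped values:

  # Scores here are made-up examples; the real ones are generated by the
  # Perceptron and shipped in 50_scores.cf.
  urirhssub  URIBL_WS_SURBL  multi.surbl.org.  A  4
  body       URIBL_WS_SURBL  eval:check_uridnsbl('URIBL_WS_SURBL')
  describe   URIBL_WS_SURBL  Contains a URL listed in the WS SURBL list
  tflags     URIBL_WS_SURBL  net
  score      URIBL_WS_SURBL  1.5

  urirhssub  URIBL_JP_SURBL  multi.surbl.org.  A  64
  body       URIBL_JP_SURBL  eval:check_uridnsbl('URIBL_JP_SURBL')
  describe   URIBL_JP_SURBL  Contains a URL listed in the JP SURBL list
  tflags     URIBL_JP_SURBL  net
  score      URIBL_JP_SURBL  3.0

A URI listed on both WS and JP would match both rules, so under these example
scores it would contribute about 4.5 points total.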