On Fri, Oct 22, 2010 at 5:19 AM, Michael Scheidell <michael.scheid...@secnap.com> wrote:
> On 10/21/10 8:50 PM, dar...@chaosreigns.com wrote:
>>
>> I'd like to try collecting reputation data for every IP address from
>> everyone willing to submit it.
> re-inventing the wheel.

If what's being suggested is a non-commercial alternative to a commercial product, then I think that the pejorative connotations of "re-inventing the wheel" don't apply. :-) This is a wheel that needs re-inventing, and begs for an RFC.

OK, a bit of brainstorming here, indulge me a bit ...

Imagine an open standard that commercial and non-commercial vendors could implement. It could scale by letting each site choose which peers to share reputation with. One could assign relative "trust" levels to different peers, depending on how much a site believes that its spam judgments overlap with another's. ISPs might peer more closely with each other, as could small businesses, etc.

If it were left up to the individual admins to decide how to figure out whether something is spam, then some sites would be subjectively "better" at spam rating than others - fodder for a meta-reputation system. If my system and Bob's system have 10,000 IPs in common, and there is good overlap in our true/false positives/negatives, then I could tell my system to put more faith in his ratings of IPs that I haven't seen yet, and vice versa. (This would work sort of like Netflix ratings: someone whose tastes are very similar to mine really liked a movie that I haven't seen, so I'm likely to like it, too.)

I've read in multiple places that if 500 people all guess how many marbles are in a jar, then while there may be wide variation in the guesses, the average is remarkably close to the real count. While there's no hard "real" spamminess value (because it's relative, as others have said in this thread), I'd bet that the aggregate would be very useful. Seeing the statistical spread (like Amazon does: if, out of 500 ratings, 480 people gave it 5 stars, 15 gave it 4, etc., then consensus is pretty clear that it's a cool item) and being able to programmatically act on that would be sweet.

Decentralized distribution would be the tricky part, of course. :-)

Royce
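
For illustration only, the peer-overlap idea above could be sketched roughly like this in Python. Everything here is an assumption made for the sake of the example: the function names, the choice of a plain agreement fraction as the "trust" level, the boolean spam/not-spam verdicts, and the sample addresses (taken from the documentation ranges). It is not a proposed protocol, and it says nothing about how the ratings would actually be distributed.

```python
def overlap_trust(mine, theirs, min_shared=1):
    """Fraction of shared IPs on which both sites reached the same verdict.

    `mine` and `theirs` map IP -> bool (True = judged spammy). This simple
    agreement ratio is a hypothetical stand-in for a "trust" level.
    """
    shared = mine.keys() & theirs.keys()
    if len(shared) < min_shared:
        return 0.0  # not enough common history to say anything
    agree = sum(1 for ip in shared if mine[ip] == theirs[ip])
    return agree / len(shared)


def weighted_rating(ip, peers, trust):
    """Trust-weighted average of peer verdicts for an IP I haven't seen.

    Returns a value in [0, 1], or None if no trusted peer has rated the IP.
    """
    total = 0.0
    weight = 0.0
    for name, ratings in peers.items():
        w = trust.get(name, 0.0)
        if ip in ratings and w > 0:
            total += w * (1.0 if ratings[ip] else 0.0)
            weight += w
    return total / weight if weight else None


# Made-up sample data: my verdicts plus two hypothetical peers.
mine = {"192.0.2.1": True, "192.0.2.2": False,
        "192.0.2.3": True, "192.0.2.4": False}
peers = {
    "bob": {"192.0.2.1": True, "192.0.2.2": False, "192.0.2.3": True,
            "192.0.2.4": True, "198.51.100.7": True},
    "alice": {"192.0.2.1": False, "192.0.2.2": True, "192.0.2.3": True,
              "192.0.2.4": False, "198.51.100.7": False},
}

# Trust each peer in proportion to how often we agreed on shared IPs.
trust = {name: overlap_trust(mine, ratings) for name, ratings in peers.items()}

# Estimate for an IP that only the peers have seen.
unseen_score = weighted_rating("198.51.100.7", peers, trust)
```

With this sample data Bob agrees with me on 3 of 4 shared IPs and Alice on 2 of 4, so Bob's "spam" verdict on the unseen address outweighs Alice's "not spam". A real system would presumably want a minimum amount of shared history before trusting anyone, and would keep the full spread of ratings rather than a single average.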