It's interesting, but the key points of failure are clear... "One authority", "anyone can contribute", and "completely resistant to poisoning" don't really seem to go together, given what we have seen in many other blacklists.
This was similar to another idea that a few of us had (not a blacklist, but another system), and you always run into the same problem: either the controller becomes the problem, the contributors become the problem, or the data gets poisoned and then the whole system becomes the problem. Mine was around accepting XML packets from the wild into our system and praying... Gotta love the PMs...

The first problem is how to deal with the DOS. The spammers figured that out long before we did: use zombies. How do we do that? Make everyone maintain their own blacklist. But then it gets stale, so everyone needs to fetch a copy of the blacklist on a regular basis. Really, they only need the delta since their last update.

So who maintains it? Ten or so major players that have the bandwidth. And how do they ensure that they don't get DOS'd? It doesn't matter, because the lists aren't served from there; DOSing them just means that people don't get updates for a short time, but the lists still exist. BTW, you would need to make the list subscription-based (free, but requiring some level of login, PGP keys, etc.) so that the major players could quickly block a corruptor.

The end zombie machines themselves (i.e. our email servers) would also be able to add to their local lists and publish to the known-spammers list. But this could/would lead to pollution. To resolve that, you would need a submission system with some type of review process (maybe an automated review, or even peer review). From there, you would also need to find a way to unexpire an address.

I think it would be reasonable to add someone to the list for, say, 96 hours and have a threshold count: if they haven't sent anything in x amount of time, they drop to a lower-level queue, where they are monitored by the number of RBL validations for that address/domain. If they exceed a certain number (say x per hour, where x might be 50), they get bumped back up. (A rough sketch of this lifecycle follows below.) This leads to another problem: how to update the master lists with the number of lookups, since the lookups are being done on the zombie machines.

So the end result is: do we trust the zombies that linger in the night to tell us whether an IP/domain is a spammer, and how many hits it may have had? And do we trust the source of the list, hoping they don't give up and die like the last one (can't remember which, but it was bouncing all incoming email for a day)?

The solution as a whole is complex and requires some level of trust somewhere along the line; otherwise the process as a whole cannot succeed. I stepped away from the blacklists and went to SA because of these problems. I still use some RBLs, but I don't depend on them to any significant degree.
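To make that expiry/threshold idea concrete, here is a rough Python sketch of the listing lifecycle. The 96-hour TTL and the 50-validations-per-hour threshold come from the paragraph above; everything else (the queue names, the sweep() pass) is made up for illustration, and this is a sketch of the idea, not an implementation:

    import time

    ACTIVE_TTL = 96 * 3600   # listing lifetime in seconds (the 96 hours above)
    PROMOTE_RATE = 50        # RBL validations per hour that bump an entry back up

    class Entry:
        def __init__(self, address):
            self.address = address
            self.listed_at = time.time()
            self.hits = []   # timestamps of recent RBL validations

    active = {}      # address -> Entry; consulted on every lookup
    monitored = {}   # address -> Entry; watched, but no longer blocking

    def record_hit(entry):
        """Called whenever a zombie reports an RBL validation for this entry."""
        now = time.time()
        entry.hits.append(now)
        entry.hits = [t for t in entry.hits if now - t < 3600]  # keep last hour

    def sweep():
        """Periodic pass: demote quiet entries, promote noisy ones."""
        now = time.time()
        for addr, e in list(active.items()):
            if now - e.listed_at > ACTIVE_TTL and not e.hits:
                monitored[addr] = active.pop(addr)   # expire to the lower queue
        for addr, e in list(monitored.items()):
            if len(e.hits) >= PROMOTE_RATE:          # exceeded x per hour
                e.listed_at = now
                active[addr] = monitored.pop(addr)   # bump back up

How the zombies would report their hit counts back to the master lists is exactly the part this sketch punts on, and that is the trust problem described above.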
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Matthew Cline
Sent: Friday, January 02, 2004 5:53 PM
To: Spam Assassin Talk
Subject: [SAtalk] Distributed HTTP server blocklist system [dhttp-bl]

An interesting proposal for a distributed blocklist system. Text found at http://www.sysdesign.ca/dhttp-bl.txt

=-=-=-=-=-=-=-=-=

Distributed HTTP server blocklist system [dhttp-bl]
Copyright (C) 2003 Jem Berkes
Posted 2003-09-29

It's clear from spammers' distaste of (and aggression towards) blocklists that these things really are effective in blocking spam :) I'm thinking that the future successful blocklists will be distributed in nature. I've read some good ideas so far, like using existing USENET infrastructure or moderated mailing lists. But I had this thought and wanted to share it, and see what people think.

NOTE: this is simply a data distribution system. The maintainers can be the same people we have today, and they still have to add/remove listings by getting feedback through the Internet via some other means. Also, this infrastructure could support multiple distinct blocklists, each with its own service ID. So one strong infrastructure could support different flavours of blocklists (e.g. one run by the Spamhaus folk, another by SPEWS), and people could participate in whatever network they wish.

==== Thinking aloud: a distributed HTTP server BL system ====

A. Features
---
+ one authority retains full control of the list, without investing resources
+ efficient, with built-in caching; won't burden existing USENET or mailing lists
+ uses existing HTTP servers, which are easy to set up
+ anybody can contribute by running the CGI, even a small home user
+ HUGE total capacity to serve and grow
+ very resistant to DDoS attacks
+ completely resistant to poisoning
+ some elements of BitTorrent, Freenet, Gnutella

B. The "entities" that make up the system:
---
1) A select few "maintainers" who make ALL decisions about which netblocks are listed. Since these people have to convince others to participate in their project, they have to be trustworthy and already well known. They will widely distribute their PGP public keys.

2) A large number of "participants". Each runs the system's software on their private/corporate/educational HTTP servers. These people may be friendly or malicious.

3) The basic data unit, a "package", which stores the blocklist for a specific class B (a.b.*.*) and is PGP-signed by a maintainer. I estimate that such a compressed, signed file might be about 10KB -- a nice unit to throw around. So for instance, to see if 200.60.243.224 is listed, one must seek out the 200.60 package and examine its contents. One might then find that 200.60.243.100/24 is listed.

C. Jobs for each participant's HTTP server:
---
o Stores the maintainers' PGP public keys
o Stores a few other participants' URLs
o Stores up to X packages, covering some percentage of the IPv4 address space
o Answers public queries with the appropriate package, if available
o Refers public queries to other participants if the package is unavailable
o Accepts an incoming newer package only if its signature is valid
o Always drops stale packages in favour of newer packages
o Expires packages with time (TTL)

D. How a user queries the BL
---
Ideally, a user wishing to make use of the BL would also be a participant. Here is how the user/participant would query the BL status of any given IP:

1) From the first two octets of the IP, determine the package required. Note that there are 2^16 = 65,536 (64K) unique packages in total.
2) Check local storage for the package; maybe we already have the data.
3) Query other participants' URLs for the package (we may get referred).
4) Verify the signature on the downloaded package, and store (cache) it locally.

You can see from the relatively small number of unique packages, and the referring nature of queries, that it doesn't take long to find a package. Once found, the package is cached. Optimizations could let participants keep track of whom to query in the future (like Freenet...)

E. How maintainers update BL data
---
The maintainers, who privately maintain the master list, release updated data packages signed with their keys. They can inject the new data into the system by uploading it to any participant. Updated blocklists can even be sent over a dial-up modem; it doesn't matter. Participants will accept only valid data, because of the PGP signatures. Newer data invalidates any older version of a package, and the new data will propagate throughout the distributed network.
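For concreteness, here is a rough Python sketch of the participant fetch path described in sections C-E. The wire format (JSON with "package" and "referrals" fields), the serial-number freshness check, and the pgp_verify() helper are illustrative assumptions on my part, not anything the proposal specifies:

    import json
    import urllib.request

    # A few bootstrap participants; assumption: each serves a simple CGI endpoint.
    PEERS = ["http://peer1.example/cgi-bin/dhttp-bl",
             "http://peer2.example/cgi-bin/dhttp-bl"]

    cache = {}  # package id, e.g. "200.60" -> verified package (dict)

    def pgp_verify(package, maintainer_keys):
        """Stand-in for a real PGP signature check against the maintainers'
        public keys (e.g. by shelling out to gpg). Must return True only
        for packages carrying a valid maintainer signature."""
        raise NotImplementedError

    def fetch_package(pkg_id, maintainer_keys, max_hops=8):
        """Steps 2-4 of section D, plus the acceptance rules of sections C/E."""
        if pkg_id in cache:                          # step 2: local storage
            return cache[pkg_id]
        queue, hops = list(PEERS), 0
        while queue and hops < max_hops:
            url = queue.pop(0)
            hops += 1
            with urllib.request.urlopen("%s?pkg=%s" % (url, pkg_id)) as resp:
                reply = json.load(resp)
            if "referrals" in reply:                 # step 3: referred onward
                queue.extend(reply["referrals"])
                continue
            pkg = reply["package"]
            if pgp_verify(pkg, maintainer_keys):     # step 4: verify signature
                old = cache.get(pkg_id)
                if old is None or pkg["serial"] > old["serial"]:
                    cache[pkg_id] = pkg              # newer data replaces stale
                return cache[pkg_id]
        return None                                  # not found within hop limit

A real participant would additionally expire cached packages on a TTL and answer incoming queries with this same referral logic.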
F. Initial deployment
---
Current well-known blocklist maintainers would come forward and announce that they will distribute their lists via dhttp-bl, posting a service ID and their PGP public keys. Internet users and admins who want to participate in that blocklist/service then configure their HTTP servers to run the necessary CGI scripts, using the service ID and the maintainers' keys. Participants with working installations can then advertise their URLs anywhere (search engines, USENET, mailing lists). Maintainers start uploading their blocklists, partitioned into multiple packages, across many participants. Lesser-known participants pick up the data as necessary.

G. DDoS response scenario
---
Let's say the 10 most popular high-bandwidth participants (the ones everyone uses by default install :) get DDoS'd out of existence. The mildly worried system admin searches Google or phones a couple of friends, and finds out about other known participants. That's it -- because all participants store equally reliable data. A participant with limited resources might fall back to the task of sending referrals to the numerous lesser-known participants, which is also a useful job.

H. Extensions
---
Although the unit package I'm suggesting is identified by the first two octets of the IP, other keyed approaches that segment the address space would work equally well. For example: package id = first N bits of a hash of the IP.

Participants could run local DNSBL front-ends to the networked database, for use with MTAs or within their organization. Participants can locally store as much or as little of the total blocklist as they want. A major ISP could store 100% of the blocklist, meaning that BL lookups would be instant unless a package is outdated.
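For illustration, here is what the two package-keying schemes (the first-two-octets key from section B, and the hashed variant from section H) might look like in Python. SHA-1 is an arbitrary stand-in for "a hash of the IP", and the function names are made up:

    import hashlib
    import ipaddress

    def pkg_id_octets(ip):
        """Section B: package keyed on the first two octets (a.b.*.*)."""
        a, b = ip.split(".")[:2]
        return "%s.%s" % (a, b)      # e.g. "200.60" for 200.60.243.224

    def pkg_id_hashed(ip, bits=16):
        """Section H variant: package id = first N bits of a hash of the IP."""
        digest = hashlib.sha1(ip.encode()).digest()
        value = int.from_bytes(digest, "big")
        return value >> (len(digest) * 8 - bits)   # keep the top N bits

    def is_listed(ip, package_blocks):
        """Check an IP against the CIDR blocks carried in its package."""
        addr = ipaddress.ip_address(ip)
        return any(addr in ipaddress.ip_network(block) for block in package_blocks)

    # The example from section B, with the listed /24 in canonical form:
    print(pkg_id_octets("200.60.243.224"))                   # -> 200.60
    print(is_listed("200.60.243.224", ["200.60.243.0/24"]))  # -> True

Either keying scheme yields a fixed, small universe of package ids, which is what keeps the "who has which package" problem tractable.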