An interesting proposal for a distributed blocklist system. Text found at http://www.sysdesign.ca/dhttp-bl.txt
=-=-=-=-=-=-=-=-=

Distributed HTTP server blocklist system [dhttp-bl]
Copyright (C) 2003 Jem Berkes
Posted 2003-09-29

It's clear from spammers' distaste of (and aggression towards) blocklists
that these things really are effective in blocking spam :)

I'm thinking that the future successful blocklists will be distributed in
nature. I've read some good ideas so far, like using existing USENET
infrastructure or moderated mailing lists. But I had this thought and
wanted to share it, and see what people think.

NOTE: this is simply a data distribution system. The maintainers can be
the same people we have today, and they still have to add/remove listings
by getting feedback through the Internet via some other means.

Also, this infrastructure could support multiple distinct blocklists, each
with its own service ID. So one strong infrastructure could support
different flavours of blocklists (e.g. one run by the spamhaus folk,
another by SPEWS) and people could participate in whatever network they
wish.

==== Thinking aloud: a distributed HTTP server BL system ====

A. Features
---
+ one authority retains full control of list, without investing resources
+ efficient, built in caching, won't burden existing USENET or mail lists
+ uses existing HTTP servers, which are easy to set up
+ anybody can contribute by running the CGI, even small home user
+ HUGE total capacity to serve and grow
+ very resistant to DDOS attacks
+ completely resistant to poisoning
+ some elements of bittorrent, freenet, gnutella

B. The <entities> that make up the system:
---
1) A select few "maintainers" that will make ALL decisions about what
   netblocks will be listed. Since these people have to convince others
   to participate in their project, they have to be trustworthy and
   already well known. They will widely distribute their PGP public keys.

2) A large number of "participants". Each runs the system's software on
   their private/corporate/educational HTTP servers.
   These people may be friendly or malicious.

3) The basic data unit, a "package" that stores the blocklist for a
   specific class B (a.b.*.*), and is PGP signed by a maintainer. I
   estimate that such a compressed, signed file might be about 10KB -- a
   nice unit to throw around. So for instance, to see if 200.60.243.224
   is listed one must seek out the 200.60 package, and examine the
   contents. They might then find that 200.60.243.100/24 is listed.

C. Jobs for each participant's HTTP server:
---
o Stores the maintainers' PGP public keys
o Stores a few other participants' URLs
o Stores up to X packages, covering some % of IPv4 address space
o Answers public queries with the appropriate package if available
o Refers public queries to other participants if package unavailable
o Accepts an incoming newer package, only if signature is valid
o Always drops stale packages in favour of newer packages
o Expires packages with time (TTL)

D. How a user queries the BL
---
Ideally, a user wishing to make use of the BL would also be a participant.
Here is how the user/participant would query the BL status of any given
IP:

1) From the first two octets of IP, determine the package required.
   Note that there are 2^16 = 65,536 unique packages in total.
2) Check local storage for package, maybe we already have the data
3) Query other participants' URLs for the package (may get referred)
4) Verify signature on downloaded package, and store (cache) it locally

You can see from the relatively small number of unique packages, and the
referring nature of queries, that it doesn't take long to find a package.
Once found, that package is cached. Optimizations can let participants
keep track of who to query in the future (like freenet...)

E. How maintainers update BL data
---
The maintainers, who privately maintain the master list, will release
updated data packages signed with their keys. The maintainers can inject
the new data into the system by uploading their data to any participant.
Updated blocklists can even be sent using dial-up modem; it doesn't
matter. Participants will accept only this valid data because of the PGP
signatures. Newer data invalidates any older versions of the package, and
the new data will propagate throughout the distributed network.

F. Initial deployment
---
Current well known blocklist maintainers would come forward and say they
will distribute their lists via dhttp-bl, posting a service ID and their
PGP public keys.

Internet users and admins who want to participate in this person's
blocklist/service then configure their HTTP server to run the necessary
CGI scripts, using the service ID and maintainers' keys. These
participants with working installations can then advertise their URLs
anywhere (search engines, USENET, mailing lists).

Maintainers will start uploading their blocklist, partitioned into
multiple packages, across many participants. Lesser known participants
will pick up the data as necessary.

G. DDoS response scenario
---
Let's say that the 10 most popular high-bandwidth participants (that
everyone uses by default install :) get DDoS'd out of existence.

The mildly worried system admin searches Google or phones a couple of
friends, and finds out about other known participants. That's it --
because all participants store equally reliable data.

And a participant who has limited resources might fall back to the task
of sending referrals to the numerous lesser known participants, which is
also a useful job.

H. Extensions
---
Although the unit package I'm suggesting is identified by the first two
octets of IP, other keyed approaches that can segment address space would
work equally well. For example,
package id = first N bits of hash of IP

Participants could run local DNSBL front-ends to the networked database
to use with MTAs or within their organization.

Participants can locally store as much or little of the total blocklist
as they want.
A major ISP can store 100% of the blocklist meaning that BL lookups would
be instant unless a package is outdated.
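
The query steps in section D, together with the "accept newer package only if signature is valid" job from section C, can be sketched roughly as follows. This is only an illustration under stated assumptions, not part of the proposal: the `Package` fields (including the `serial` version counter), the `fetch` callable (standing in for asking other participants and following referrals) and the `verify` callable (standing in for the PGP signature check) are all hypothetical names. Listed netblocks are assumed to be written in CIDR notation.

```python
import ipaddress
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Package:
    key: str                 # first two octets of the covered range, e.g. "200.60"
    serial: int              # assumed version counter; higher means newer
    cidrs: Tuple[str, ...]   # listed netblocks in CIDR notation


def package_key(ip: str) -> str:
    """Step D.1: derive the package name from the first two octets."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return f"{octets[0]}.{octets[1]}"


def accept(cache: Dict[str, Package], pkg: Package,
           verify: Callable[[Package], bool]) -> bool:
    """Section C: keep a package only if it verifies and is newer than ours."""
    if not verify(pkg):
        return False
    held = cache.get(pkg.key)
    if held is not None and held.serial >= pkg.serial:
        return False
    cache[pkg.key] = pkg
    return True


def is_listed(ip: str,
              cache: Dict[str, Package],
              fetch: Callable[[str], Optional[Package]],
              verify: Callable[[Package], bool]) -> Optional[bool]:
    """Steps D.1 to D.4. Returns None when no valid package could be found."""
    key = package_key(ip)
    pkg = cache.get(key)                      # D.2: check local storage first
    if pkg is None:
        candidate = fetch(key)                # D.3: hypothetical peer query
        if candidate is not None and candidate.key == key:
            accept(cache, candidate, verify)  # D.4: verify, then cache
        pkg = cache.get(key)
    if pkg is None:
        return None
    addr = ipaddress.IPv4Address(ip)
    # strict=False tolerates entries written with host bits set,
    # such as the 200.60.243.100/24 example above.
    return any(addr in ipaddress.ip_network(c, strict=False) for c in pkg.cidrs)
```

With a cache holding a "200.60" package that lists 200.60.243.100/24, `is_listed("200.60.243.224", ...)` returns True, matching the example in section B. An unsigned or stale package handed back by `fetch` is simply ignored, which is the property the proposal relies on to resist poisoning.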