It's interesting, but the key points of failure are clear... "One authority", "anyone can contribute", and "completely resistant to poisoning" don't really seem to go together, given what we have seen in many other blacklists.
This was similar to another idea that a few of us had (not a blacklist, but another system), and you always run into the same problem: either the controller becomes the problem, the contributors become the problem, or the data gets poisoned and then the whole system becomes the problem. Mine was around accepting XML packets from the wild into our system and praying... Gotta love the PMs...

The first problem is how to deal with the DOS. The spammers figured that out long before we did: use zombies. How do we do that? Make everyone maintain their own blacklist. But then it gets stale, so everyone needs to fetch a copy of the blacklist on a regular basis. Really, they only need the delta since their last update.

So who maintains it? Ten or so major players that have the bandwidth. And how do they ensure that they don't get DOS'd? It doesn't matter, because the lists aren't served from there; DOSing them just means that people don't get updates for a short time, but the lists still exist. BTW, you would need to make the list subscription-based (free, but requiring some level of login, PGP keys, etc.) so that the major players could quickly block a corruptor.

The end zombie machines themselves (i.e. our email servers) would also be able to add to their local lists and publish to the known-spammers list. But this could/would lead to pollution. To resolve that, you would need a submission system with some type of review process (maybe an automated review, or even peer review). From there, you would also need to find a way to unexpire an address.

I think it would be reasonable to add someone to the list for, say, 96 hours and have a threshold count: if they haven't sent anything in x amount of time, they drop to a lower-level queue, where they are monitored by the number of RBL validations for that address/domain. If they exceed a certain number (say x per hour, where x might be 50), they get bumped back up. (A rough sketch of this lifecycle follows below.) This leads to another problem: how to update the master lists with the number of lookups, since the lookups are being done on the zombie machines.

So the end result is: do we trust the zombies that linger in the night to tell us whether an IP/domain is a spammer, and how many hits it may have had? And do we trust the source of the list, hoping they don't give up and die like the last one (can't remember which, but it was bouncing all incoming email for a day)?

The solution as a whole is complex and requires some level of trust somewhere along the line; otherwise the process as a whole cannot succeed. I stepped away from the blacklists and went to SA because of these problems. I still use some RBLs, but I don't depend on them to any significant degree.
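To make that expiry/threshold idea concrete, here is a rough Python sketch of the listing lifecycle. The 96-hour TTL and the 50-validations-per-hour threshold come from the paragraph above; everything else (the queue names, the sweep() pass) is made up for illustration, and this is a sketch of the idea, not an implementation:

    import time

    ACTIVE_TTL = 96 * 3600   # listing lifetime in seconds (the 96 hours above)
    PROMOTE_RATE = 50        # RBL validations per hour that bump an entry back up

    class Entry:
        def __init__(self, address):
            self.address = address
            self.listed_at = time.time()
            self.hits = []   # timestamps of recent RBL validations

    active = {}      # address -> Entry; consulted on every lookup
    monitored = {}   # address -> Entry; watched, but no longer blocking

    def record_hit(entry):
        """Called whenever a zombie reports an RBL validation for this entry."""
        now = time.time()
        entry.hits.append(now)
        entry.hits = [t for t in entry.hits if now - t < 3600]  # keep last hour

    def sweep():
        """Periodic pass: demote quiet entries, promote noisy ones."""
        now = time.time()
        for addr, e in list(active.items()):
            if now - e.listed_at > ACTIVE_TTL and not e.hits:
                monitored[addr] = active.pop(addr)   # expire to the lower queue
        for addr, e in list(monitored.items()):
            if len(e.hits) >= PROMOTE_RATE:          # exceeded x per hour
                e.listed_at = now
                active[addr] = monitored.pop(addr)   # bump back up

How the zombies would report their hit counts back to the master lists is exactly the part this sketch punts on, and that is the trust problem described above.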
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Matthew Cline
Sent: Friday, January 02, 2004 5:53 PM
To: Spam Assassin Talk
Subject: [SAtalk] Distributed HTTP server blocklist system [dhttp-bl]

An interesting proposal for a distributed blocklist system. Text found at http://www.sysdesign.ca/dhttp-bl.txt

=-=-=-=-=-=-=-=-=

Distributed HTTP server blocklist system [dhttp-bl]
Copyright (C) 2003 Jem Berkes
Posted 2003-09-29

It's clear from spammers' distaste of (and aggression towards) blocklists that these things really are effective in blocking spam :) I'm thinking that the future successful blocklists will be distributed in nature. I've read some good ideas so far, like using existing USENET infrastructure or moderated mailing lists. But I had this thought and wanted to share it, and see what people think.

NOTE: this is simply a data distribution system. The maintainers can be the same people we have today, and they still have to add/remove listings by getting feedback through the Internet via some other means. Also, this infrastructure could support multiple distinct blocklists, each with its own service ID. So one strong infrastructure could support different flavours of blocklists (e.g. one run by the Spamhaus folk, another by SPEWS), and people could participate in whatever network they wish.

==== Thinking aloud: a distributed HTTP server BL system ====

A. Features
---
+ one authority retains full control of the list, without investing resources
+ efficient, with built-in caching; won't burden existing USENET or mailing lists
+ uses existing HTTP servers, which are easy to set up
+ anybody can contribute by running the CGI, even a small home user
+ HUGE total capacity to serve and grow
+ very resistant to DDoS attacks
+ completely resistant to poisoning
+ some elements of BitTorrent, Freenet, Gnutella

B. The "entities" that make up the system:
---
1) A select few "maintainers" who make ALL decisions about which netblocks are listed. Since these people have to convince others to participate in their project, they have to be trustworthy and already well known. They will widely distribute their PGP public keys.

2) A large number of "participants". Each runs the system's software on their private/corporate/educational HTTP servers. These people may be friendly or malicious.

3) The basic data unit, a "package", which stores the blocklist for a specific class B (a.b.*.*) and is PGP-signed by a maintainer. I estimate that such a compressed, signed file might be about 10KB -- a nice unit to throw around. So for instance, to see if 200.60.243.224 is listed, one must seek out the 200.60 package and examine its contents. One might then find that 200.60.243.100/24 is listed.

C. Jobs for each participant's HTTP server:
---
o Stores the maintainers' PGP public keys
o Stores a few other participants' URLs
o Stores up to X packages, covering some percentage of the IPv4 address space
o Answers public queries with the appropriate package, if available
o Refers public queries to other participants if the package is unavailable
o Accepts an incoming newer package only if its signature is valid
o Always drops stale packages in favour of newer packages
o Expires packages with time (TTL)

D. How a user queries the BL
---
Ideally, a user wishing to make use of the BL would also be a participant. Here is how the user/participant would query the BL status of any given IP:

1) From the first two octets of the IP, determine the package required. Note that there are 2^16 = 65,536 (64K) unique packages in total.
2) Check local storage for the package; maybe we already have the data.
3) Query other participants' URLs for the package (we may get referred).
4) Verify the signature on the downloaded package, and store (cache) it locally.

You can see from the relatively small number of unique packages, and the referring nature of queries, that it doesn't take long to find a package. Once found, the package is cached. Optimizations could let participants keep track of whom to query in the future (like Freenet...)

E. How maintainers update BL data
---
The maintainers, who privately maintain the master list, release updated data packages signed with their keys. They can inject the new data into the system by uploading it to any participant. Updated blocklists can even be sent over a dial-up modem; it doesn't matter. Participants will accept only valid data, because of the PGP signatures. Newer data invalidates any older version of a package, and the new data will propagate throughout the distributed network.
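For concreteness, here is a rough Python sketch of the participant fetch path described in sections C-E. The wire format (JSON with "package" and "referrals" fields), the serial-number freshness check, and the pgp_verify() helper are illustrative assumptions on my part, not anything the proposal specifies:

    import json
    import urllib.request

    # A few bootstrap participants; assumption: each serves a simple CGI endpoint.
    PEERS = ["http://peer1.example/cgi-bin/dhttp-bl",
             "http://peer2.example/cgi-bin/dhttp-bl"]

    cache = {}  # package id, e.g. "200.60" -> verified package (dict)

    def pgp_verify(package, maintainer_keys):
        """Stand-in for a real PGP signature check against the maintainers'
        public keys (e.g. by shelling out to gpg). Must return True only
        for packages carrying a valid maintainer signature."""
        raise NotImplementedError

    def fetch_package(pkg_id, maintainer_keys, max_hops=8):
        """Steps 2-4 of section D, plus the acceptance rules of sections C/E."""
        if pkg_id in cache:                          # step 2: local storage
            return cache[pkg_id]
        queue, hops = list(PEERS), 0
        while queue and hops < max_hops:
            url = queue.pop(0)
            hops += 1
            with urllib.request.urlopen("%s?pkg=%s" % (url, pkg_id)) as resp:
                reply = json.load(resp)
            if "referrals" in reply:                 # step 3: referred onward
                queue.extend(reply["referrals"])
                continue
            pkg = reply["package"]
            if pgp_verify(pkg, maintainer_keys):     # step 4: verify signature
                old = cache.get(pkg_id)
                if old is None or pkg["serial"] > old["serial"]:
                    cache[pkg_id] = pkg              # newer data replaces stale
                return cache[pkg_id]
        return None                                  # not found within hop limit

A real participant would additionally expire cached packages on a TTL and answer incoming queries with this same referral logic.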
F. Initial deployment
---
Current well-known blocklist maintainers would come forward and announce that they will distribute their lists via dhttp-bl, posting a service ID and their PGP public keys. Internet users and admins who want to participate in that blocklist/service then configure their HTTP servers to run the necessary CGI scripts, using the service ID and the maintainers' keys. Participants with working installations can then advertise their URLs anywhere (search engines, USENET, mailing lists). Maintainers start uploading their blocklists, partitioned into multiple packages, across many participants. Lesser-known participants pick up the data as necessary.

G. DDoS response scenario
---
Let's say the 10 most popular high-bandwidth participants (the ones everyone uses by default install :) get DDoS'd out of existence. The mildly worried system admin searches Google or phones a couple of friends, and finds out about other known participants. That's it -- because all participants store equally reliable data. A participant with limited resources might fall back to the task of sending referrals to the numerous lesser-known participants, which is also a useful job.

H. Extensions
---
Although the unit package I'm suggesting is identified by the first two octets of the IP, other keyed approaches that segment the address space would work equally well. For example: package id = first N bits of a hash of the IP.

Participants could run local DNSBL front-ends to the networked database, for use with MTAs or within their organization. Participants can locally store as much or as little of the total blocklist as they want. A major ISP could store 100% of the blocklist, meaning that BL lookups would be instant unless a package is outdated.
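For illustration, here is what the two package-keying schemes (the first-two-octets key from section B, and the hashed variant from section H) might look like in Python. SHA-1 is an arbitrary stand-in for "a hash of the IP", and the function names are made up:

    import hashlib
    import ipaddress

    def pkg_id_octets(ip):
        """Section B: package keyed on the first two octets (a.b.*.*)."""
        a, b = ip.split(".")[:2]
        return "%s.%s" % (a, b)      # e.g. "200.60" for 200.60.243.224

    def pkg_id_hashed(ip, bits=16):
        """Section H variant: package id = first N bits of a hash of the IP."""
        digest = hashlib.sha1(ip.encode()).digest()
        value = int.from_bytes(digest, "big")
        return value >> (len(digest) * 8 - bits)   # keep the top N bits

    def is_listed(ip, package_blocks):
        """Check an IP against the CIDR blocks carried in its package."""
        addr = ipaddress.ip_address(ip)
        return any(addr in ipaddress.ip_network(block) for block in package_blocks)

    # The example from section B, with the listed /24 in canonical form:
    print(pkg_id_octets("200.60.243.224"))                   # -> 200.60
    print(is_listed("200.60.243.224", ["200.60.243.0/24"]))  # -> True

Either keying scheme yields a fixed, small universe of package ids, which is what keeps the "who has which package" problem tractable.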