Original proposal posted December 2001, heh. "Nothing new under the sun".
To answer my own question: MySQL _does_ do replication. The bad news is that the files are not OS/architecture agnostic; you cannot replicate between a Sun Unix box and RH Linux, so it does not appear to be a viable distribution method.

The biggest problem with converting a URL to a checksum is that it won't take long for the more sophisticated die-hard spammers to hack Apache (if it does not currently support the capability) to allow a randomized intermediate directory in their URL path:

    http://www.spammer.com/random-aa1122zzyy/webbug.html

Every variation of that path produces a different checksum, so the algorithm is probably going to have to be prefixed with a regular expression. I am not quite sure how that could be done for a DNS RBL, which is why MySQL and XML were run up the flagpole.

To synopsize the message you referred to, the things that should be tracked are:

url regular expression
    an RE might be too overblown; just a wildcard placeholder should be OK
spam level
    automatic rules should use a lower value than hand-screened nominations;
    0 could disable the record while under review or accumulating nominations
submission date
    unix seconds timestamp of nomination
last seen date
    unix timestamp of last unique nomination
expiration date
    calculated value of when to purge from database
# unique reports
    how many reports targeted the url pattern
reporter list
    a table of ip/names submitting the nomination
reporter weight
    similar to razor: the more accurate the reporting, the higher the weight
score
    a calculated recommended score

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, October 01, 2002 4:52 AM
To: SpamTalk
Cc: [EMAIL PROTECTED]
Subject: Re: [SAtalk] URL blacklist

SpamTalk said:

> Probably better than the "spam phrases" approach would be the database
> approach as currently used for white/black listing. Any way to tie
> that to an XML retrieval from a list of central repositories? Does
> mySQL do replication?
> A properly done XML would let us eyeball the
> list as well as use it to keep the database up to date. Another idea:
> could we synthesize an RBL so that
> http://www.spammer.com/spam/web/bug/ becomes
> spam.web.bug.x.www.spammer.com for a reverse lookup? It is going to
> get tricky, how to specify a randomized intermediate directory?

A good plan, needs an implementation though:

http://bl.reynolds.net.au/ksi/email/

hmm. seems down to me. Basically it's a plan to store hash sums of URLs/phone numbers found in spam in a DNSBL, for apps like SpamAssassin to look up. A little like spamcop's "spamvertized URL" list...

--j.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
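
For what it's worth, here is a rough Python sketch of the ideas kicked around in this thread. It is not anything SpamAssassin ships and not the scheme from the proposal linked above. url_to_rbl_name builds the path-then-host name from the quoted example, url_to_hash_name shows why a plain checksum stops working once the directory is randomized, and UrlRecord carries a few of the tracked fields, with "*" standing in for exactly one path segment. All function and field names, the choice of MD5, and the zone urlbl.example.org are assumptions made up for illustration.

```python
import hashlib
import re
from dataclasses import dataclass, field
from urllib.parse import urlsplit


def url_to_rbl_name(url: str, separator: str = "x") -> str:
    """Path segments in order, a separator label, then the host name."""
    parts = urlsplit(url)
    labels = [seg for seg in parts.path.split("/") if seg]
    return ".".join(labels + [separator, parts.hostname])


def url_to_hash_name(url: str, zone: str = "urlbl.example.org") -> str:
    """Checksum of host + path as one DNS label (zone name is hypothetical)."""
    parts = urlsplit(url)
    key = (parts.hostname + parts.path).lower()
    # MD5 is an arbitrary choice here; its 32 hex chars fit in one DNS label.
    return hashlib.md5(key.encode("utf-8")).hexdigest() + "." + zone


def wildcard_to_regex(pattern: str):
    """Treat '*' as exactly one path segment; everything else is literal."""
    pieces = [re.escape(p) for p in pattern.split("*")]
    return re.compile("[^/]+".join(pieces) + r"\Z")


@dataclass
class UrlRecord:
    """A subset of the tracked fields; names are illustrative only."""
    url_pattern: str       # wildcard pattern, e.g. host/*/webbug.html
    spam_level: int = 0    # 0 = record disabled while under review
    submitted: int = 0     # unix seconds of first nomination
    last_seen: int = 0     # unix seconds of last unique nomination
    reporters: set = field(default_factory=set)

    def matches(self, url: str) -> bool:
        parts = urlsplit(url)
        target = (parts.hostname + parts.path).lower()
        return wildcard_to_regex(self.url_pattern).match(target) is not None

    def nominate(self, reporter: str, now: int) -> None:
        """Record a nomination; only a new reporter counts as unique."""
        if not self.submitted:
            self.submitted = now
        if reporter not in self.reporters:
            self.reporters.add(reporter)
            self.last_seen = now

    @property
    def unique_reports(self) -> int:
        return len(self.reporters)
```

The wildcard match needs the pattern list on the client side, which is the same obstacle noted above for a pure DNS lookup: a hashed name can only be queried for an exact URL, while a pattern has to be distributed some other way (XML, database replication, and so on).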