Original proposal posted December 2001, heh. "Nothing new under the sun".

To answer my own question, MySQL _does_ do replication. The bad news is that
the files are not OS/architecture agnostic; you cannot replicate between a
Sun Unix box and a RH Linux box, so it does not appear to be a viable
distribution method.

The biggest problem with converting a URL to a checksum is that it won't take
long for the more sophisticated die-hard spammers to hack Apache (if it does
not already support the capability) to allow a randomized intermediate
directory in their URL path, e.g.
http://www.spammer.com/random-aa1122zzyy/webbug.html. The algorithm is
probably going to have to be prefixed with a regular expression. I am not
quite sure how that could be done for a DNS RBL, which is why MySQL and XML
were run up the flagpole. To synopsize the message you referred to, the
things that should be tracked are:

url regular expression    a full RE might be overblown; a simple wildcard
                          placeholder should be OK

spam level                automatic rules should use a lower value than
                          hand-screened nominations; 0 could disable the
                          record while under review or accumulating
                          nominations

submission date           Unix timestamp (seconds) of the nomination

last seen date            Unix timestamp of the last unique nomination

expiration date           calculated value of when to purge the record
                          from the database

# unique reports          how many reports targeted the URL pattern

reporter list             a table of the IPs/names submitting the
                          nomination

reporter weight           similar to Razor: the more accurate the
                          reporting, the higher the weight

score                     a calculated recommended score
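As a rough sketch of what one such record and the wildcard matching might
look like (all field names and values here are illustrative, not an agreed
schema; this uses fnmatch-style wildcards rather than full REs):

```python
import time
from fnmatch import fnmatch

# One blacklist record with the fields proposed above. Field names are
# illustrative only -- nothing here is a settled schema.
record = {
    "url_pattern": "http://www.spammer.com/*/webbug.html",  # wildcard placeholder
    "spam_level": 5,                           # 0 = disabled while under review
    "submitted": int(time.time()),             # Unix timestamp of the nomination
    "last_seen": int(time.time()),             # Unix timestamp of last unique nomination
    "expires": int(time.time()) + 30 * 86400,  # when to purge from the database
    "unique_reports": 1,                       # reports targeting this pattern
    "reporters": ["192.0.2.1"],                # IPs/names submitting the nomination
    "reporter_weight": 1.0,                    # higher = historically more accurate
    "score": 3.5,                              # calculated recommended score
}

def url_matches(record, url):
    """Return True if the URL matches the record's wildcard pattern."""
    return fnmatch(url, record["url_pattern"])

# The randomized intermediate directory no longer defeats the match:
print(url_matches(record, "http://www.spammer.com/random-aa1122zzyy/webbug.html"))
# prints True
```

A plain checksum of the full URL would miss that example entirely, which is
why the pattern has to be matched before (or instead of) hashing.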

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] 
Sent: Tuesday, October 01, 2002 4:52 AM
To: SpamTalk
Cc: [EMAIL PROTECTED]
Subject: Re: [SAtalk] URL blacklist 



SpamTalk said:
> Probably better than the "spam phrases" approach would be the database 
> approach as currently used for white/black listing. Any way to tie 
> that to an XML retrieval from a list of central repositories? Does 
> mySQL do replication? A properly done XML would let us eyeball the 
> list as well as use it to keep the database up to date. Another idea: 
> could we synthesize an RBL so that 
> http://www.spammer.com/spam/web/bug/ becomes 
> spam.web.bug.x.www.spammer.com for a reverse lookup? It is going to 
> get tricky, how to specify a randomized intermediate directory?

A good plan, needs an implementation though: 

        http://bl.reynolds.net.au/ksi/email/

hmm. seems down to me.  Basically it's a plan to store hash sums of
URLs/phone numbers found in spam in a DNSBL, for apps like SpamAssassin to
look up.  A little like spamcop's "spamvertized URL" list...
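The hash-sums-in-a-DNSBL scheme could be sketched like this (the zone name
"ksi.example.net" and the trivial lowercase normalization are placeholders,
not the real list's conventions):

```python
import hashlib

def dnsbl_query_name(url, zone="ksi.example.net"):
    """Build the DNS name an app like SpamAssassin would look up
    for a spamvertised URL. Zone name is a placeholder."""
    # Normalize trivially (lowercase); a real client would canonicalize
    # the URL far more aggressively before hashing.
    digest = hashlib.sha1(url.lower().encode("utf-8")).hexdigest()
    return "%s.%s" % (digest, zone)

# A listed URL's name would resolve (e.g. to 127.0.0.2);
# an unlisted one would return NXDOMAIN.
print(dnsbl_query_name("http://www.spammer.com/spam/web/bug/"))
```

Hashing sidesteps the length and character-set limits of raw URLs in DNS
labels, but it also means the wildcard problem above has to be solved
before the hash is taken.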

--j.


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
