On Tue, 1 Oct 2002 10:56:31 -0500, Robert Strickler <[EMAIL PROTECTED]> writes:
> The biggest problem with converting a url to a checksum, is it won't take
> long for the more sophisticated die-hard spammers to hack Apache (if it does
> not currently support the capability) to allow a randomized intermediate
> directory in their url path
> http://www.spammer.com/random-aa1122zzyy/webbug.html. The algorithm is
> probably going to have to be prefixed with a regular expression. I am not
> quite sure how that could be done for a DNS RBL, which is why mySQL and XML
> were run up the flag pole. To synopsize the message you referred to the
> things that should be tracked are:

How about just having pattern schemas:

  [].bar.com/boing/[]
  www.bar.com/moron/[]
  www.example.com
  [].twit.org

Then, within each domain, you make a DNS database containing

A URL like 1.2.com/3/4/5 gets canonicalized as:

  5.4.3.xx.1.2.

We set up the DNS database so that every [] field is added to the
records as well.

For lookups, we look up:

  2.
  [].

Either the first is nxdomain, in which case we use the second, or the
first is good, in which case we use it as a root for the next lookup.

So, the second lookup is, say,

  1.2.
  [].2.

The third:

  3.xx.[].2
  [].xx.[].2

And so forth. If the final domain looks up as 'bad', we nuke it.

We want to avoid having to backtrack. Say we have rules of

  .2.[]. --> BAD
  .3.4.  --> BAD

and we get a:

  .2.4

If we do the DNS lookup, we follow the second rule, which does not
match. We'd have to backtrack to note that the first rule matches. To
fix this, we may add in extra rules for [] rules. We add in another
rule like:

  .2.4.  --> BAD

This can be done completely automatically. :)

It could lead to a multiplicative/exponential blowup in the rules, but
that should only occur in contrived cases.

Now with this, we can handle DNS-based blacklists for things like URLs
with blank spots for randomized portions of the URL.

Scott
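The lookup order and the automatic extra-rule step described above can
be sketched roughly as follows. This is only an illustration, not a
working RBL client: an in-memory tree stands in for the DNS zone (a
missing child plays the role of nxdomain), and every name in it
(new_node, to_labels, add_rule, close_wildcards, is_bad) is made up for
the example.

```python
WILD = "[]"  # the wildcard label used in the pattern schemas


def new_node():
    # One node per label; "bad" marks the end of a blacklist rule.
    return {"bad": False, "kids": {}}


def to_labels(name):
    # ".2.[]." -> ["[]", "2"]: labels in root-first order, as DNS walks them.
    return [part for part in reversed(name.split(".")) if part]


def add_rule(root, labels):
    # Insert one "--> BAD" rule into the tree.
    node = root
    for label in labels:
        node = node["kids"].setdefault(label, new_node())
    node["bad"] = True


def merge(dst, src):
    # Copy everything reachable under src into dst (new nodes, no sharing).
    dst["bad"] = dst["bad"] or src["bad"]
    for label, child in src["kids"].items():
        merge(dst["kids"].setdefault(label, new_node()), child)


def close_wildcards(node):
    # The "extra rules" step: fold each [] subtree into its literal
    # siblings, so a lookup that prefers the literal label never has
    # to backtrack to the [] branch.
    wild = node["kids"].get(WILD)
    if wild is not None:
        for label, child in node["kids"].items():
            if label != WILD:
                merge(child, wild)
    for child in node["kids"].values():
        close_wildcards(child)


def is_bad(root, labels):
    # At each level try the literal label first; if it is missing
    # (the nxdomain case), fall back to []. No backtracking.
    node = root
    for label in labels:
        kids = node["kids"]
        if label in kids:
            node = kids[label]
        elif WILD in kids:
            node = kids[WILD]
        else:
            return False
    return node["bad"]


root = new_node()
add_rule(root, to_labels(".2.[]."))
add_rule(root, to_labels(".3.4."))
before = is_bad(root, to_labels(".2.4"))  # follows the "4" branch and misses
close_wildcards(root)                     # adds the equivalent of .2.4. --> BAD
after = is_bad(root, to_labels(".2.4"))
print(before, after)
```

With the two rules from the example, the lookup of .2.4 misses before
the closure step and hits after it. The merge is also where the
possible blowup in rule count comes from, since each [] subtree is
copied into every literal sibling.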
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk