Adam Katz wrote:
> Steve Freegard wrote:
>> I've been thinking about creating an emailBL to target dropboxes used
>> for 419 scams, phishing, russian penpals etc. as I have a reasonable way
>> to collect these in real-time and it would close a lot of doors on these
>> folks provided I can avoid being caught by address stuffing.
>>
>> However - rather than trying to do some sort of munging to work with
>> DNS; I was simply going to either MD5 or SHA1 the e-mail address e.g.
>>
>> s...@laptop-smf:~$ perl -MDigest::MD5 -e 'print
>> Digest::MD5::md5_hex("s...@fsg.com").".emailbl.org\n"'
>> 132e76bc8e252dee7c911ea2cde1f079.emailbl.org
> 
> I'm under the impression that DNSBLs reverse the IP address (e.g.
> 2.0.0.127.bl.spamcop.net) so the hierarchical ordering can be
> preserved

Yeah - only really relevant if you are listing IP addresses and want to
use wildcards in the DNS zone.  For listing e-mail addresses it isn't
relevant.

>  but checksumming would be *significantly* better than my
> proposal.  Perhaps just the username, and perhaps a tighter hash (more
> collisions, less DNS traffic), e.g. for your proffered sa @ fsg.com:
> 
> $ perl -MDigest::MD5 -e 'print substr(Digest::MD5::md5_hex("sa"),16) .
> ".fsg.com.emailbl.org\n"'
> 7e1e9e4aedb8242d.fsg.com.emailbl.org
> $
> 

Nah - I really don't like it that way; it doesn't bring you any benefit
and truncating the hash makes collisions more likely.  I don't see how
it can cause less DNS traffic either.  At least with a full MD5 hash,
your DNS query will always be 32 characters + blacklist zone name,
regardless of the size of the input string.
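
A rough sketch of the fixed-length point, assuming nothing beyond the
emailbl.org zone name from the example above; the two addresses and the
query_name() helper are made up for illustration:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: hash the whole address and append the zone.
sub query_name {
    my ($addr, $zone) = @_;
    return md5_hex($addr) . ".$zone";
}

my $zone = 'emailbl.org';   # zone name taken from the example above

# Single-quoted so Perl does not treat "@example" as an array.
for my $addr ('a@example.com', 'a.much.longer.local.part@mail.example.com') {
    my $query = query_name($addr, $zone);
    printf "%d chars in, %d chars out\n", length($addr), length($query);
}
```

Both queries come out the same length, however long the address is.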

To reduce the likelihood of collisions, it's better to append the input
string length to the end of the MD5, like ClamAV does in its MD5 sigs, e.g.

s...@laptop-smf:~$ perl -MDigest::MD5 -e '$email=q{s...@fsg.com}; print
Digest::MD5::md5_hex($email).length($email).".emailbl.org\n"'
c18782f8d94595d5e016e3ab9ab3f8f610.emailbl.org

Hashing also has the benefit of making it impractical to reverse the
list (short of hashing guessed addresses) if a spammer were to rsync it.
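
As a script rather than a one-liner, the hash-plus-length scheme might
look something like the sketch below.  The emailbl_name() helper and the
example address are hypothetical, and lowercasing the address before
hashing is my own assumption (publisher and clients would have to agree
on whatever normalisation is used):

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build the DNS name to query for one address: MD5 hex digest, then the
# length of the hashed string, then the zone.
# The default zone comes from the example above; lowercasing is an
# assumption made for illustration only.
sub emailbl_name {
    my ($email, $zone) = @_;
    $zone = 'emailbl.org' unless defined $zone;
    my $addr = lc $email;    # assumed normalisation step
    return md5_hex($addr) . length($addr) . ".$zone";
}

# Single quotes keep Perl from interpolating "@Example" as an array.
print emailbl_name('User@Example.com'), "\n";
```

The first label stays well under the 63-character DNS label limit, since
it is 32 hex characters plus a few digits for the length.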

>> If you want to separate stuff out into different meanings e.g. the
>> Google Anti-Phishing stuff; then just use a different sub-domain for each.
> 
> Ah, but DNSBLs and URIBLs already have that ability; they can answer
> anything in the 127.0.0.0/8 space.  Using a different sub-domain would
> mean differing DNS lookups, which means more traffic (which is why if
> you look at the SA code for Spamhaus's DNSBL, all queries go to
> zen.spamhaus.org).
> 

Yeah - you're absolutely right.  Be sure to read
http://tools.ietf.org/html/draft-irtf-asrg-bcp-blacklists-05 if you are
going to publish a public list.
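
For the single-zone approach, a client could tell the categories apart
from the 127.0.0.x answer.  A minimal sketch, assuming a made-up code
table (the real values would be whatever the list publishes) and using
the core Socket module for the lookup; decode_answer() and lookup() are
hypothetical names:

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Hypothetical mapping from the last octet of the answer to a category.
my %CATEGORY = (
    2 => '419 dropbox',
    3 => 'phishing dropbox',
    4 => 'penpal scam',
);

# Turn an A-record answer such as "127.0.0.3" into a category name.
# Returns undef for anything outside 127.0.0.0/8 or an unknown code.
sub decode_answer {
    my ($ip) = @_;
    return unless defined $ip
        && $ip =~ /\A127\.\d{1,3}\.\d{1,3}\.(\d{1,3})\z/;
    return $CATEGORY{ $1 + 0 };
}

# Resolve a query name to a dotted quad.  inet_aton() returns undef when
# the name does not resolve, which is the "not listed" case.
sub lookup {
    my ($name) = @_;
    my $packed = inet_aton($name);
    return defined $packed ? inet_ntoa($packed) : undef;
}

print scalar(decode_answer('127.0.0.3')), "\n";   # prints "phishing dropbox"
```

One query per address then covers every category, e.g.
decode_answer(lookup($query_name)).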

Regards,
Steve.
