John Hardin wrote:
> On Tue, 28 Apr 2009, Steve Freegard wrote:
>
>> To reduce the likelihood of collisions then it's better to add the input
>> string length at the end of the md5 like ClamAV does in it's MD5 sigs
>> e.g.
>>
>> s...@laptop-smf:~$ perl -MDigest::MD5 -e '$email="s...@fsg.com"; print
>> Digest::MD5::md5_hex($email).length($email).".emailbl.org\n"'
>> c18782f8d94595d5e016e3ab9ab3f8f610.emailbl.org
>>
>> This also has the benefit of making it impossible to reverse the list
>> if the spammer were to rsync the list.
>
> ...huh? If MD5 isn't cryptographically secure, how will adding some
> extra characters onto the end make it stronger?

Well, in the case of an emailBL, the worst that can happen is that one
listed MD5 collides with an innocent e-mail address. Adding the string
length reduces that possibility, because both colliding addresses would
then have to be exactly the same length.

I believe you'll find that ClamAV uses this method for its MD5
signatures: to get a match, both the MD5 and the file size have to match.

> And there's no way to keep a spammer from checking to see if a given
> email address is listed, just as there's no way to keep them from
> checking whether a given domain name is listed.

Ok - you're right. It's late here ;-)

Cheers,
Steve.
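
For anyone who wants to try the idea outside of a Perl one-liner, here is a minimal sketch in Python of how such a lookup name could be built. The function name `emailbl_query_name` and its `zone` parameter are made up for illustration, and the address is a placeholder; this is not a published emailBL interface. It also assumes the length is taken over the UTF-8 bytes of the address, which equals the character count for plain ASCII addresses.

```python
import hashlib


def emailbl_query_name(address: str, zone: str = "emailbl.org") -> str:
    """Return '<md5 hex><input length>.<zone>' for the given address.

    Hypothetical helper: the name and the default zone are assumptions
    taken from the example in this thread, not from a real service API.
    """
    data = address.encode("utf-8")
    digest = hashlib.md5(data).hexdigest()  # 32 hex characters
    # Appending the input length means two colliding inputs would also
    # have to be the same length to produce the same lookup name.
    return f"{digest}{len(data)}.{zone}"


if __name__ == "__main__":
    # Placeholder address, used only to show the shape of the result.
    print(emailbl_query_name("user@example.com"))
```

The first label stays well under the 63-character DNS label limit, since it is 32 hex digits plus a few decimal digits for the length. As conceded above, this does nothing to stop someone from testing whether a specific address is listed; it only narrows the chance of an accidental collision.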