Re: my emailBL is live!

Mike Cardwell Wed, 29 Apr 2009 02:02:01 -0700

Adam Katz wrote:

Mike Cardwell contended:

It would definitely require a hashing algorithm, like MD5. IIRC
there is a maximum length for a hostname, and that is 255
characters. What if the hostname in your email address is 255
characters long on its own...?


When MD5sums were first proposed (in place of my wild escaping), it
seemed like a great idea.  However, a voice in the back of my head,
now spoken (typed?) by Rob, has been growing louder.  My
implementation now merely truncates email usernames to 16 characters
(plus the noted defanging, which makes it complicated again ...) and
replaces the @ with a dot (not an underscore, that's not a legal
character).

Hmmm. I'm still not convinced you've done it the best way. That conversion sounds a lot more complicated than a straight MD5 conversion, and it doesn't deal with the fact that there is a maximum length for an FQDN.
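To make the length concern concrete, here is a rough Python sketch. It assumes the usual DNS limits of 63 octets per label and 255 octets per name on the wire (253 characters in dotted form). The zone name and the helper fits_in_dns are made up for illustration and are not part of either implementation.

```python
MAX_LABEL = 63   # maximum octets in a single DNS label
MAX_NAME = 253   # maximum characters in a dotted name, no trailing dot


def fits_in_dns(name: str) -> bool:
    """Return True if every label and the whole name are within DNS limits."""
    name = name.rstrip(".")
    labels = name.split(".")
    return len(name) <= MAX_NAME and all(
        0 < len(label) <= MAX_LABEL for label in labels
    )


zone = "emailbl.example"                 # hypothetical list zone
long_domain = ".".join(["a" * 60] * 4)   # a legal but very long domain (243 chars)
digest = "0" * 32                        # stand-in for a 32-character MD5 hex digest

# Keeping the sender's domain in the query name can overflow the limit...
print(fits_in_dns(f"{digest}.{long_domain}.{zone}"))
# ...whereas a digest of the whole address always yields one 32-character label.
print(fits_in_dns(f"{digest}.{zone}"))
```

With these made-up values the first check fails and the second passes, because hashing the full address gives a query name whose length no longer depends on the address at all.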

In fact, collisions here could be regarded as good, as usernames that
long can include tracking strings (e.g. the mailer for our list,
users-return-12345-joe=bob.com@ spamassassin.apache.org, becomes
users-return-123.spamassassin.apache.org), which should help.

That could be seen as an advantage I suppose. But, the particular source list being used here wasn't meant to be used that way. Some people might consider such hits as false positives.
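For reference, the truncation scheme as I understand it from the description above might look roughly like this in Python. It is only a sketch: it skips the defanging step entirely (the details aren't given here), and the function name and the second address are invented for the example.

```python
def truncated_lookup_name(address: str, max_local: int = 16) -> str:
    """Truncate the local part to max_local characters and replace '@' with a dot.

    Assumption: no defanging of unusual characters is applied.
    """
    local, _, domain = address.rpartition("@")
    return f"{local[:max_local]}.{domain}"


a = truncated_lookup_name("users-return-12345-joe=bob.com@spamassassin.apache.org")
b = truncated_lookup_name("users-return-12399-amy=example.org@spamassassin.apache.org")

print(a)       # users-return-123.spamassassin.apache.org
print(a == b)  # True: two different senders map to the same name
```

The second line of output is the collision in question: whether that is a feature or a false positive depends on what the source list was meant to cover.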

I did fully implement my proposed latter 16 characters (of MD5's 32)
plus dot plus the domain, complete with hash lookups, but I just
removed it (which is why non-test lookups will fail for the next ~4h).

Having access to the plain text email address would only make it
easier for ISPs to do anything if they had access to the zone file.
In which case, you could just give them access to a separate list
which has the email addresses in plain text.


Unless we're replacing the currently well-groomed upstream source at
http://anti-phishing-email-reply.googlecode.com/#, I see no reason to
offer such services (since they do it better).

So in rbldnsd, ...


Whoa, what's that?!  Interesting ... it's even in Debian.  I think I'm
happy with BIND for the moment, since my origin point is hidden from
use and the actual NS records are merely slaves run by zoneedit (so
efficiency isn't really important).  I probably need to stay on BIND
as I doubt I could use rbldnsd to host my SpamAssassin channels.

I implemented pretty much exactly the same thing that you did, except it uses a straight hexadecimal MD5 digest of the full address. I know this isn't strictly correct as the local part of an email address is technically case sensitive, but as email addresses in the real world are case *insensitive* I convert it to lower case before hashing.

Eg:

r...@haven:/var/lib/rbldns# host -t a bda05135a5b8a92d5d2934531864442d.phishing.email.rbl.grepular.com
bda05135a5b8a92d5d2934531864442d.phishing.email.rbl.grepular.com A 127.0.0.3
bda05135a5b8a92d5d2934531864442d.phishing.email.rbl.grepular.com A 127.0.0.1
r...@haven:/var/lib/rbldns# host -t txt bda05135a5b8a92d5d2934531864442d.phishing.email.rbl.grepular.com
bda05135a5b8a92d5d2934531864442d.phishing.email.rbl.grepular.com TXT "20090411"
r...@haven:/var/lib/rbldns#

That RBL won't stay public for long, so don't use it for anything other than a quick test.
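If you want to build the query name outside of Exim, the conversion amounts to something like the following Python sketch. The function name md5_lookup_name is just for illustration, and the UTF-8 encoding is an assumption (for plain ASCII addresses it makes no difference).

```python
import hashlib

ZONE = "phishing.email.rbl.grepular.com"  # zone used in the lookups above


def md5_lookup_name(address: str, zone: str = ZONE) -> str:
    """Lower-case the full address, MD5 it, and prepend the hex digest to the zone."""
    digest = hashlib.md5(address.lower().encode("utf-8")).hexdigest()
    return f"{digest}.{zone}"


# Hypothetical address, for illustration only.
name = md5_lookup_name("Someone@Example.COM")
print(name)
# Case differences in the address do not change the query name.
print(name == md5_lookup_name("someone@example.com"))
```

The resulting name can then be passed to host -t a or host -t txt as in the example above.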


Here's the code I use to download the data and populate an rbldnsd file:

https://secure.grepular.com/phishing_addresses.txt

You might find something you can strip out and re-use.

Here are the Exim ACLs I use to query it for the envelope sender and the From and Reply-To headers:


acl_smtp_mail:

  deny dnslists = phishing.email.rbl.grepular.com/${md5:${lc:$sender_address}}


acl_smtp_data:

  deny dnslists = phishing.email.rbl.grepular.com/${md5:${lc:${address:$h_From:}}}

  deny dnslists = phishing.email.rbl.grepular.com/${md5:${lc:${address:$h_Reply-To:}}}
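For anyone not using Exim, the header checks above do roughly the following, shown here as a Python sketch. email.utils.parseaddr stands in for Exim's ${address:...} operator, the function name and the sample header are invented, and no DNS query is actually made; it only builds the name to look up.

```python
import hashlib
from email.utils import parseaddr
from typing import Optional

ZONE = "phishing.email.rbl.grepular.com"  # zone from the ACLs above


def header_lookup_name(header_value: str, zone: str = ZONE) -> Optional[str]:
    """Extract the bare address from a From/Reply-To header value and
    return the DNS name to query, or None if no address was found."""
    _display_name, address = parseaddr(header_value)
    if not address:
        return None
    digest = hashlib.md5(address.lower().encode("utf-8")).hexdigest()
    return f"{digest}.{zone}"


# Hypothetical header value, for illustration only.
from_name = header_lookup_name('"Account Team" <Support@Example.COM>')
print(from_name)
# The display name and letter case do not affect the result.
print(from_name == header_lookup_name("support@example.com"))
```

A message would be rejected if an A-record lookup on that name returns any result.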

I'm not familiar enough with writing SpamAssassin rules yet to write a SpamAssassin recipe.


--
Mike Cardwell
(https://secure.grepular.com/) (http://perlcv.com/)
