David B Funk wrote:

When MD5sums were first proposed (in place of my wild escaping), it
seemed like a great idea.  However, a voice in the back of my head,
now spoken (typed?) by Rob, has been growing louder.  My
implementation now merely truncates email usernames to 16 characters
(plus the noted defanging, which makes it complicated again ...) and
replaces the @ with a dot (not an underscore, that's not a legal
character).

Repeat after me, ALMOST ALL characters (octets actually) are now
LEGAL in DNS queries (see RFC-2181 section 11).

There is NO need for -any- kind of munging.

That same RFC says labels are limited to 63 octets and FQDNs are limited to 255 octets. So you'd still need to mung for those two cases, wouldn't you? Also, are you 100% sure that no character allowed in an email address local part is disallowed in a domain name?
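To make the length concern concrete, here is a rough Python sketch that checks whether an unmunged address would fit the 63-octet label and 255-octet name limits. The function name and the zone "bl.example.org" are made up for illustration, and the wire-length arithmetic (one length octet per label plus the root octet) is my reading of the limits, not something tested against a resolver.

```python
MAX_LABEL_OCTETS = 63   # per-label limit
MAX_NAME_OCTETS = 255   # whole-name limit, counted here in wire format


def dns_limit_problems(address, zone):
    """Return a list of length problems for '<address>.<zone>' (hypothetical helper)."""
    name = f"{address}.{zone.strip('.')}"
    # Dots in the local part become label separators, just like dots in the domain.
    labels = [label.encode("utf-8") for label in name.split(".")]

    problems = []
    for label in labels:
        if not 1 <= len(label) <= MAX_LABEL_OCTETS:
            problems.append(f"label too long or empty: {label!r}")

    # Wire format: one length octet per label, plus the terminating root octet.
    wire_len = sum(len(label) + 1 for label in labels) + 1
    if wire_len > MAX_NAME_OCTETS:
        problems.append(f"name too long: {wire_len} octets")
    return problems


if __name__ == "__main__":
    print(dns_limit_problems("user@example.com", "bl.example.org"))
    print(dns_limit_problems("a" * 64 + "@example.com", "bl.example.org"))
```

Note that the first label ends up being the tail of the local part, the @, and the first label of the domain together, so a long local part hits the 63-octet limit sooner than you might expect.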

I've set up an emailBL directly from the Google list, try:

 host abus...@live.com.phish.icaen.uiowa.edu.

"host" on my Debian system spits out warnings. It does, however, do the lookup correctly. You must recognise that there will be compatibility problems with your solution in the wild, though. One example is Exim's dnsdb lookup type, which fails outright on that lookup.

Here's the warning I get from "host".

host -t a abus...@live.com.phish.icaen.uiowa.edu
*** invalid answer name abuse...@live.com.phish.icaen.uiowa.edu after A query for abus...@live.com.phish.icaen.uiowa.edu
abuse...@live.com.phish.icaen.uiowa.edu A       127.0.0.2
 !!! abuse...@live.com.phish.icaen.uiowa.edu A record has illegal name

What exactly is the problem with hashing the address anyway? We'll forget accidental collisions as they simply won't happen.
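For comparison, the hashed form could be built roughly like this. This is only a sketch: the zone "bl.example.org" is a placeholder, and lowercasing the address before hashing is my assumption (the list publisher and the querying client would have to agree on whatever normalisation is used).

```python
import hashlib


def hashed_query_name(address, zone):
    """Build '<md5 hex of address>.<zone>' (hypothetical helper, not an existing tool)."""
    # Assumption: both sides normalise by trimming whitespace and lowercasing.
    normalized = address.strip().lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    # An MD5 hex digest is always 32 characters of [0-9a-f], so the label
    # is well under 63 octets and contains nothing a resolver will object to.
    return f"{digest}.{zone.strip('.')}"


if __name__ == "__main__":
    print(hashed_query_name("user@example.com", "bl.example.org"))
```

The resulting name is fixed-length and only uses hex digits, which sidesteps both the length limits and tools that reject unusual characters.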

IE "address.phish.icaen.uiowa.edu"

NO need for hashing, no collisions, etc.

Also makes it easier to deploy into an address filter/blocker in
your smtp-MTA (to prevent local llusers from replying to one
of those addresses).

BTW notice that the Google data is multi-valued in the TYPE field.
Rather than a simple enumeration of that data into an address, it
is better to turn it into a bit-mask, as then multiple values can
be represented (and queried) in a single address/operation.

EG: A == 127.0.0.2
    B == 127.0.0.4
    C == 127.0.0.8
    D == 127.0.0.16

thus AB == 127.0.0.6
    AC == 127.0.0.10

etc.

So the entry for 'abus...@live.com' only has an 'A' type.

 host account-teamd...@live.com.phish.icaen.uiowa.edu. => 127.0.0.10

So the entry for 'account-teamd...@live.com' has an 'A' & 'C' type.
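The bit-mask mapping above can be sketched in a few lines of Python. The bit values for A through D are the ones given in the example; the function names are made up, and I'm assuming the flags live in the last octet of a 127.0.0.x answer.

```python
import ipaddress

# Bit values taken from the example above: A == 2, B == 4, C == 8, D == 16.
TYPE_BITS = {"A": 2, "B": 4, "C": 8, "D": 16}
BASE = ipaddress.IPv4Address("127.0.0.0")


def encode_types(types):
    """Combine type letters into a single 127.0.0.x answer address."""
    mask = 0
    for t in types:
        mask |= TYPE_BITS[t]
    return str(BASE + mask)


def decode_types(answer):
    """Recover the set of type letters from the last octet of the answer."""
    last_octet = int(ipaddress.IPv4Address(answer)) & 0xFF
    return {t for t, bit in TYPE_BITS.items() if last_octet & bit}


if __name__ == "__main__":
    print(encode_types("AC"))          # 127.0.0.10
    print(sorted(decode_types("127.0.0.10")))  # ['A', 'C']
```

A client that only cares about one type can test a single bit of the answer instead of doing one query per type.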

Yeah, that might be a good idea.

--
Mike Cardwell
(https://secure.grepular.com/) (http://perlcv.com/)
