On 12/29/2014 4:36 PM, Mike Cardwell wrote:
I'd like to store hostnames in a postgres database and I want to fully support
IDNs (Internationalised Domain Names).
I want to be able to recover the original representation of the hostname, so I
can't just encode it with punycode and then store the ascii result. For example,
these two are the same hostnames thanks to unicode case folding [1]:
tesst.ëxämplé.com
teßt.ëxämplé.com
They both encode in punycode to the same thing:
xn--tesst.xmpl.com-cib7f2a
If you don't believe me, try visiting any domain that contains "ss", replacing
the "ss" with "ß". E.g.:
ericßon.com
nißan.com
americanexpreß.com
So if I pull out "xn--tesst.xmpl.com-cib7f2a" from the database, I've no idea
which of those two hostnames was the original representation.
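For anyone who wants to reproduce the collision, here is a minimal sketch using Python's built-in "idna" codec. That codec is an assumption on my part (nothing above says which encoder was used); it implements IDNA 2003 and encodes each label separately, so the ASCII string it produces will not look exactly like the one quoted above, but the loss of the original spelling is the same.

```python
# Python's built-in "idna" codec implements IDNA 2003 (RFC 3490). Its
# nameprep step case-folds "ß" to "ss" before each label is encoded.
a = "tesst.ëxämplé.com".encode("idna")
b = "teßt.ëxämplé.com".encode("idna")

# Both spellings collapse to the same ASCII-compatible form.
same_ascii_form = (a == b)

# Decoding gives back the folded spelling, never the "ß" one,
# so the stored ASCII form cannot tell us what the user typed.
round_trip = a.decode("idna")
original_lost = "ß" not in round_trip

print(same_ascii_form, round_trip, original_lost)
```

Note that newer IDNA 2008 encoders treat "ß" differently, so the result depends on which rules the encoder follows.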
The trouble is, if I store the unicode representation of a hostname instead,
then when I run queries with conditions like:
WHERE hostname='nißan.com'
_IF_ Postgres had a punycode function, then you could use:
WHERE punycode(hostname) = punycode('nißan.com')
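Lacking such a function in the database, the same idea can be done on the client side: keep the original hostname in one column and a normalised ASCII key in a second column, and compare on the key. The sketch below is only an illustration under assumptions: it uses Python's built-in "idna" codec (IDNA 2003) for the key, an in-memory sqlite3 database as a stand-in for Postgres, and the names idna_key, hostname_key, add_host and find_hosts are made up for the example.

```python
import sqlite3


def idna_key(hostname: str) -> str:
    """Hypothetical helper: IDNA 2003 ASCII form, used only as a comparison key."""
    # ASCII-only labels pass through the codec unchanged, so lower-case
    # the result explicitly to make the key case-insensitive.
    return hostname.encode("idna").decode("ascii").lower()


# sqlite3 stands in for Postgres here; the table layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hosts (hostname TEXT NOT NULL, hostname_key TEXT NOT NULL)"
)
conn.execute("CREATE INDEX hosts_key_idx ON hosts (hostname_key)")


def add_host(hostname: str) -> None:
    # Store the spelling as typed, plus the normalised key next to it.
    conn.execute(
        "INSERT INTO hosts (hostname, hostname_key) VALUES (?, ?)",
        (hostname, idna_key(hostname)),
    )


def find_hosts(hostname: str) -> list:
    # Compare on the key, return the original representation.
    rows = conn.execute(
        "SELECT hostname FROM hosts WHERE hostname_key = ?",
        (idna_key(hostname),),
    )
    return [row[0] for row in rows]


add_host("nißan.com")
print(find_hosts("nissan.com"))
```

The lookup matches whichever spelling is searched for, and the row still carries the hostname exactly as it was entered.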
-Andy
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)