Here's a strange little bug. "socket.getaddrinfo" raises the wrong kind of exception if given a bad domain name containing ".." as a Unicode string. The same name as an ASCII string produces the expected "gaierror" exception.
Actually, this deserves a documentation mention. The "socket" module, given a Unicode string, calls the internationalized domain name encoder, "idna.py", which has a whole error system of its own. The IDNA documentation says that "Furthermore, the socket module transparently converts Unicode host names to ACE, so that applications need not be concerned about converting host names themselves when they pass them to the socket module."

However, that's not quite true. The IDNA rules say that syntax errors must be treated as errors, so you have to be prepared for IDNA exceptions as well as "gaierror". They are all "UnicodeError" exceptions, and they are raised by the encoder before any lookup is attempted. It's worth a mention in the documentation for "socket".

John Nagle

D:\>/python25/python.exe
Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> ss = 'www.gallery84..com'
>>> uss = unicode(ss)
>>> import socket
>>> socket.getaddrinfo(ss,"http")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.gaierror: (11001, 'getaddrinfo failed')
>>> socket.getaddrinfo(uss,"http")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\python25\lib\encodings\idna.py", line 164, in encode
    result.append(ToASCII(label))
  File "D:\python25\lib\encodings\idna.py", line 73, in ToASCII
    raise UnicodeError("label empty or too long")
UnicodeError: label empty or too long
>>>
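
For anyone hitting the same thing, here is a minimal sketch of a workaround, going only by the traceback above: catch "UnicodeError" alongside "socket.gaierror" when the host name may be a Unicode string. The helper name "resolve" is made up for illustration and is not part of the "socket" module, and returning None for a bad name is just one possible policy.

```python
import socket

def resolve(host, port):
    """Hypothetical helper: return getaddrinfo() results, or None for a bad name."""
    try:
        return socket.getaddrinfo(host, port)
    except UnicodeError:
        # Raised by the IDNA encoder for a Unicode host name with a
        # syntax error (e.g. an empty label from ".."), before any lookup.
        return None
    except socket.gaierror:
        # Raised by the resolver itself when the lookup fails.
        return None

# The Unicode name from the session above; the empty label is rejected
# by the encoder, so this does not need a working network.
bad = resolve(u'www.gallery84..com', 'http')
```

With this in place both the ASCII and the Unicode form of a malformed name take the same path in the caller, which is probably what most applications want.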