Georg Brandl added the comment:

Hmm, you're right. The behavior has been like this at least since Python 2.5:
Python 2.5.4 (r254:67916, Dec 16 2012, 20:33:12)
[GCC 4.6.3] on linux3
Type "help", "copyright", "credits" or "license" for more information.
>>> from urlparse import urlparse
>>> urlparse('www.cwi.nl:80/%7Eguido/Python.html')
('www.cwi.nl', '', '80/%7Eguido/Python.html', '', '', '')

The docs refer to RFC 1808. From a quick glance at the BNF in section 2.2, RFC 1808 allows dots in the scheme, but also allows ":" in the path. So there seems to be a parsing ambiguity, but see section 2.4.2:

    If the parse string contains a colon ":" after the first character
    and before any characters not allowed as part of a scheme name (i.e.,
    any not an alphanumeric, plus "+", period ".", or hyphen "-"), the
    <scheme> of the URL is the substring of characters up to but not
    including the first colon.  These characters and the colon are then
    removed from the parse string before continuing.

That would indicate that the implementation is correct and the documentation should be fixed.

Senthil?

----------
keywords: +buildbot -patch
status: closed -> open

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16932>
_______________________________________
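For illustration, the rule quoted from RFC 1808 section 2.4.2 can be sketched in a few lines of Python. This is only a reading of the quoted paragraph, not the actual urlparse code; the helper name split_scheme and the SCHEME_CHARS constant are made up for this example, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import string

# Characters the quoted rule allows in a scheme name:
# alphanumerics plus "+", "." and "-" (ASCII assumed).
SCHEME_CHARS = set(string.ascii_letters + string.digits + "+.-")


def split_scheme(url):
    """Split url into (scheme, rest) per the quoted RFC 1808 2.4.2 rule.

    Hypothetical helper for illustration; not part of urlparse.
    """
    colon = url.find(":")
    # The colon must come after the first character, and every
    # character before it must be allowed in a scheme name.
    if colon > 0 and all(ch in SCHEME_CHARS for ch in url[:colon]):
        # Remove the scheme and the colon from the parse string.
        return url[:colon], url[colon + 1:]
    return "", url


print(split_scheme("www.cwi.nl:80/%7Eguido/Python.html"))
print(split_scheme("//www.cwi.nl:80/%7Eguido/Python.html"))
```

Under this reading, the first call treats 'www.cwi.nl' as the scheme, matching the urlparse result shown in the session above. The second input starts with "/", which is not allowed in a scheme name, so no scheme is split off.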