vincent kraeutler added the comment:

Some more notes.

a) RFC 3986 explicitly states that the presented regex (which you use)
"""
is the regular expression for breaking-down a *well-formed* URI reference into its components.
"""
(Emphasis added.) I am not sure this is a particularly good starting point for parsing potentially security-critical data.

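To illustrate point a), here is a minimal sketch using the regular expression from RFC 3986, Appendix B, with Python's re module. The helper name split_uri and the sample input are made up for illustration and are not taken from the patch under review. Every part of the pattern is optional and it has no end anchor, so it splits whatever it is given and never rejects anything.

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_uri(text):
    """Hypothetical helper: split text into
    (scheme, authority, path, query, fragment) without validating it."""
    m = URI_RE.match(text)  # every component is optional, so this always matches
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)

# Spaces in the host and a non-numeric "port" are passed through untouched.
print(split_uri("http://exa mple.com:notaport/pa th"))
# ('http', 'exa mple.com:notaport', '/pa th', None, None)
```

Any validation therefore has to happen in a separate step after the split.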
b) The parser fails on URIs containing numerical IPv6 addresses (e.g. "http://[::1]:88/path"). Specifically, the following code in split_authority is broken:

    if hostport and ':' in hostport:
        host, port = hostport.split(':', 1)

Clearly, if the authority may contain a ":" in the host's IP field, you cannot simply split() off the port part. Again, I am afraid I have no simple solution.

Hate to sound so negative.

Kind regards,
v.

_____________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1462525>
_____________________________________
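
For reference, the failure described in point b) can be reproduced in isolation. The sketch below assumes hostport is the authority with any userinfo already removed. naive_split mirrors the split quoted above. bracket_aware_split is a hypothetical alternative, not part of the patch, which treats a leading "[...]" as the host. It does not check that the bracket contents are a valid IPv6 address or that the port is numeric, so it is not a complete solution.

```python
def naive_split(hostport):
    """Mirror of the quoted code: cut at the first ':'."""
    host, port = hostport.split(':', 1)
    return host, port

def bracket_aware_split(hostport):
    """Hypothetical alternative: take a leading '[...]' as the host."""
    if hostport.startswith('['):
        end = hostport.find(']')
        if end == -1:
            raise ValueError("unterminated IP literal: %r" % hostport)
        host, rest = hostport[1:end], hostport[end + 1:]
        if rest and not rest.startswith(':'):
            raise ValueError("unexpected text after ']': %r" % hostport)
        return host, (rest[1:] or None)
    # No brackets: the first ':' (if any) separates host and port.
    host, _, port = hostport.partition(':')
    return host, (port or None)

print(naive_split("[::1]:88"))          # ('[', ':1]:88')
print(bracket_aware_split("[::1]:88"))  # ('::1', '88')
```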