vincent kraeutler added the comment:

Some more notes. 
a) RFC 3986 explicitly states that the regex it presents (which you use)
   """ is the regular expression for breaking-down a *well-formed* URI
reference into its components. """ (Emphasis added). I am not sure this
is a particularly good starting point for parsing potentially
security-critical data.
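
To illustrate the concern, here is a minimal sketch using the regex from
RFC 3986, Appendix B. The helper name `components` is my own and is not
part of the patch. Every part of the pattern is optional, so it matches
any input and performs no validation of its own:

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?'
)

def components(uri):
    """Break a URI reference into its five components (hypothetical helper)."""
    m = URI_RE.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

# A clearly malformed authority (space in host, non-numeric port) is
# split into components without any complaint.
parts = components("http://exa mple.com:not-a-port/p")
print(parts["authority"])
```

Any checking of the individual components would have to happen in a
separate step after the split.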

b) The parser fails on URIs containing numerical IPv6 addresses (e.g.
"http://[::1]:88/path"). Specifically, the following code in
split_authority is broken:

    if hostport and ':' in hostport:
        host, port = hostport.split(':', 1)

Clearly, if the authority may contain a ":" in the host's IP field, you
cannot simply split() off the port part.
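
As a rough sketch of the failure, and of one way to special-case
bracketed IP literals: the function name `split_hostport` is hypothetical
and this is not the code from the patch. It only handles the brackets and
does not validate the host or the port, so it is not a complete fix:

```python
def split_hostport(hostport):
    """Split 'host[:port]' into (host, port); port is None if absent.

    Hypothetical helper: treats a leading '[' as the start of an
    IP literal and only looks for the port after the closing ']'.
    """
    if hostport.startswith('['):
        end = hostport.find(']')
        if end == -1:
            raise ValueError("unterminated IP literal: %r" % hostport)
        host, rest = hostport[:end + 1], hostport[end + 1:]
        if rest == '':
            return host, None
        if not rest.startswith(':'):
            raise ValueError("unexpected text after ']': %r" % hostport)
        return host, rest[1:]
    if ':' in hostport:
        host, port = hostport.split(':', 1)
        return host, port
    return hostport, None

# The naive split cuts the IPv6 literal apart at its first colon:
print("[::1]:88".split(':', 1))    # ['[', ':1]:88']
# The bracket-aware version keeps the literal intact:
print(split_hostport("[::1]:88"))  # ('[::1]', '88')
```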

Again, I am afraid I have no simple solution. Hate to sound so negative.

Kind regards,
v.

_____________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1462525>
_____________________________________