Bugs item #1457264, was opened at 2006-03-23 20:49
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1457264&group_id=5470

Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Steve (onlynone)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib.splithost parses incorrectly

Initial Comment:
urllib.splithost(url) requires that the URL passed in
be of the form '//host[:port]/path'. Yet I've run
across some URLs that are of the form
'//host[:port]?querystring'. For these, splithost
returns everything as the host and nothing as the path.


Section 3.2 of RFC 2396 (Uniform Resource Identifiers:
Generic Syntax) states that 'The authority component is
preceded by a double slash "//" and is terminated by
the next slash "/", question-mark "?", or by the end of
the URI.'

Also, this is how it defines a URI:

absoluteURI   = scheme ":" ( hier_part | opaque_part )
hier_part     = ( net_path | abs_path ) [ "?" query ]
net_path      = "//" authority [ abs_path ]
abs_path      = "/"  path_segments

Based on the above, you could certainly have
'http://authority?query' as a valid URL.
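
To make the quoted rule concrete, here is a minimal sketch of splitting
off the authority by scanning for the first "/" or "?". The function
name split_authority is hypothetical and is not part of urllib; it only
illustrates the termination rule from section 3.2.

```python
def split_authority(rest):
    """Split '//authority...' into (authority, remainder).

    Hypothetical helper, not a urllib function. The authority ends at
    the first '/' or '?', or at the end of the string.
    """
    if not rest.startswith('//'):
        # No authority component present; return the input unchanged.
        return None, rest
    body = rest[2:]
    end = len(body)
    for terminator in ('/', '?'):
        pos = body.find(terminator)
        if pos != -1:
            end = min(end, pos)
    return body[:end], body[end:]


print(split_authority('//host:80?x=1'))
print(split_authority('//host/p?x=1'))
```

Under this rule, '//host:80?x=1' yields 'host:80' as the authority and
'?x=1' as the remainder.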


In Python 2.3 you would just need to change line 939 in
urllib.py from:

        _hostprog = re.compile('^//([^/]*)(.*)$')

to:

        _hostprog = re.compile('^//([^/?]*)(.*)$')

This appears to affect all Python versions; I just
happened to be using 2.3.
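
For anyone who wants to check the difference without patching
urllib.py, the two patterns can be compared directly with the re
module. This is only a sketch: the names OLD_HOSTPROG, NEW_HOSTPROG and
splithost_with are made up for the comparison, and the wrapper is an
assumption about how splithost applies the pattern, not a copy of the
library code.

```python
import re

# The two patterns from the report above.
OLD_HOSTPROG = re.compile('^//([^/]*)(.*)$')
NEW_HOSTPROG = re.compile('^//([^/?]*)(.*)$')


def splithost_with(pattern, url):
    """Apply a host pattern and return (host, rest).

    Hypothetical wrapper for comparison only; returns (None, url)
    when the input does not start with '//'.
    """
    match = pattern.match(url)
    if match:
        return match.group(1, 2)
    return None, url


url = '//example.com:8080?q=1'
print(splithost_with(OLD_HOSTPROG, url))  # query ends up in the host
print(splithost_with(NEW_HOSTPROG, url))  # host and query are separated
```

URLs that do have a path, such as '//example.com:8080/path?q=1', give
the same result with both patterns, so the change should only affect
the '//host?query' case.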

----------------------------------------------------------------------
