Senthil added the comment:

A careful look at urlparse.py and test_urlparse.py shows that, over
the releases from Python 2.3 to Python 2.6, the changes required to
support RFC2396 have been implemented. (RFC2396 replaced RFC1808 as the
URL specification.)
But the header of urlparse.py still says it follows RFC1808 only.
*This needs to be changed.*
test_urlparse.py also contains test cases for RFC2396 compliance.

This specific bug report hits a case where the later specification
is not compatible with the older one.

As per RFC1808
Base: <URL:http://a/b/c/d;p?q#f>

Relative URL resolution:
?y         = <URL:http://a/b/c/d;p?y>
;x         = <URL:http://a/b/c/d;x>

As per RFC2396
Base: http://a/b/c/d;p?q

Relative URLS:
?y            =  http://a/b/c/?y
;x            =  http://a/b/c/;x

Do you see the difference?
urlparse.py has been made RFC2396 compliant, so the incompatible
RFC1808 tests above have been removed as well.

Now, even RFC2396 is obsolete and has been superseded by RFC3986
which advertises thus:

Base: http://a/b/c/d;p?q

Relative URL:

"?y"            =  "http://a/b/c/d;p?y";
";x"            =  "http://a/b/c/;x";

This is crazy: the first case, ?y, goes back to the older RFC1808 result,
while the second, ;x, keeps the later RFC2396 result.
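
To make the three-way comparison concrete, here is a minimal sketch that
tabulates the results quoted in this message and checks which set a given
urljoin reproduces. It assumes a recent Python 3, where the urlparse module
lives in urllib.parse and, as far as I know, follows RFC3986 for these cases;
the EXPECTED table and the matching_rfcs() helper are hypothetical names used
only for illustration, not part of the module.

```python
from urllib.parse import urljoin  # Python 3 home of the old urlparse module

# The RFC1808 example base also carries "#f"; the fragment is dropped here
# because it does not affect the two cases being compared.
BASE = "http://a/b/c/d;p?q"

# Expected results exactly as quoted from each RFC in this message.
EXPECTED = {
    "RFC 1808": {"?y": "http://a/b/c/d;p?y", ";x": "http://a/b/c/d;x"},
    "RFC 2396": {"?y": "http://a/b/c/?y", ";x": "http://a/b/c/;x"},
    "RFC 3986": {"?y": "http://a/b/c/d;p?y", ";x": "http://a/b/c/;x"},
}


def matching_rfcs(base=BASE):
    """Return the RFCs whose quoted results the local urljoin reproduces."""
    actual = {rel: urljoin(base, rel) for rel in ("?y", ";x")}
    return [name for name, cases in EXPECTED.items() if cases == actual]


if __name__ == "__main__":
    for rel in ("?y", ";x"):
        print(rel, "->", urljoin(BASE, rel))
    print("matches:", matching_rfcs())
```

On the Python 2.x urlparse discussed in this report the import would be
"from urlparse import urljoin" instead, and the matching set may differ.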

For just this issue, my take would be:
1) Document the current urlparse.py as RFC2396 compliant and remove the
claim that it follows RFC1808 only. This is a documentation fix (patch
attached).

Overall, the best solution would be RFC3986 compliance, which is a
separate effort.

Added file: http://bugs.python.org/file9035/urlparse.patch

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1432>
__________________________________
Index: Lib/test/test_urlparse.py
===================================================================
--- Lib/test/test_urlparse.py	(revision 59600)
+++ Lib/test/test_urlparse.py	(working copy)
@@ -164,6 +164,12 @@
         #self.checkJoin(RFC1808_BASE, 'http:g', 'http:g')
         #self.checkJoin(RFC1808_BASE, 'http:', 'http:')
 
+        # RFC 2396 has different behaviour for the following.
+        # So RFC 1808 compliance for these cases is not present, but rather
+        # RFC 2396 specification is followed (see the tests under 2396)
+        # self.checkJoin(RFC2396_BASE, '?y', 'http://a/b/c/d;p?y')
+        # self.checkJoin(RFC2396_BASE, ';x', 'http://a/b/c/d;x')
+    
     def test_RFC2396(self):
         # cases from RFC 2396
 
Index: Lib/urlparse.py
===================================================================
--- Lib/urlparse.py	(revision 59600)
+++ Lib/urlparse.py	(working copy)
@@ -2,6 +2,14 @@
 
 See RFC 1808: "Relative Uniform Resource Locators", by R. Fielding,
 UC Irvine, June 1995.
+
+Update: The module has also been updated to satisfy the changes suggested
+in RFC 2396: "Uniform Resource Identifiers (URI): Generic Syntax", 
+August, 1998.
+
+Changes specified in RFC 3986: "Uniform Resource Identifiers (URI): Generic
+Syntax", January 2005, are not implemented yet.
+
 """
 
 __all__ = ["urlparse", "urlunparse", "urljoin", "urldefrag",