John Nagle wrote:
> Here's a URL, found in a link, which gives us trouble
> when we try to follow the link:
>
> http://sportsbra.co.uk/../acatalog/shop.html
>
> Browsers immediately turn this into
>
> http://sportsbra.co.uk/acatalog/shop.html
>
> and go from there, but urllib tries to open it explicitly, which
> results in an HTTP error 400.
>
> Is "urllib" wrong?

I can't see how. HTTP 1.1 says that the parameter to the GET request
should be an abs_path; RFC 2396 says that /../acatalog/shop.html
is indeed an abs_path, as .. is a valid segment. That RFC also has a
section on relative identifiers and normalization; it defines what ..
means *in a relative path*. Section 4 is explicit about .. in
absolute URIs:

# The syntax for relative URI is a shortened form of that for absolute
# URI, where some prefix of the URI is missing and certain path
# components ("." and "..") have a special meaning when, and only when,
# interpreting a relative path.

Notice the "and only when": the browsers that modify the above URL
before sending it seem to be in clear violation of RFC 2396.

Regards,
Martin
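
P.S. If you nevertheless want to follow such links the way the browsers
do, one option is to collapse the "." and ".." segments yourself before
handing the URL to urllib. The following is only a minimal sketch,
assuming Python 3's urllib.parse; the helper name collapse_dot_segments
is made up for illustration and is not part of urllib. It treats a ".."
at the root as a no-op, which matches the browser behaviour described in
the question.

```python
from urllib.parse import urlsplit, urlunsplit


def collapse_dot_segments(url):
    """Return url with "." and ".." path segments collapsed.

    Hypothetical helper, not a urllib API. A ".." that would climb
    above the root is silently dropped.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    kept = []
    for seg in segments:
        if seg == "..":
            # Drop the previous segment, but keep the leading ""
            # that stands for the root of an absolute path.
            if len(kept) > 1:
                kept.pop()
        elif seg == ".":
            # "." refers to the current level; nothing to keep.
            continue
        else:
            kept.append(seg)
    # A trailing "." or ".." names a directory, so keep the final slash.
    if segments[-1] in (".", ".."):
        kept.append("")
    return urlunsplit(parts._replace(path="/".join(kept)))


print(collapse_dot_segments("http://sportsbra.co.uk/../acatalog/shop.html"))
```

The query string and fragment are passed through unchanged; only the
path component is rewritten. Whether doing this is appropriate is a
separate question, since it changes the URL that the link actually
contains.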