[issue3300] urllib.quote and unquote - Unicode issues

Antoine Pitrou Wed, 13 Aug 2008 10:53:01 -0700

Antoine Pitrou <[EMAIL PROTECTED]> added the comment:

On Wednesday 13 August 2008 at 17:05 +0000, Bill Janssen wrote:
> I think it's worth remembering that a very large proportion of the use
> of Python's urllib.unquote() is in implementations of Web server
> frameworks of one sort or another.  We can't control what the browsers
> that talk to such frameworks produce;


Yes, we can. Browsers will use whatever charset is specified in the HTML
for the query part; as for the path part, they shouldn't produce it
themselves: they just follow a link, which should already be
percent-quoted in the HTML.
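
To make the charset point concrete, here is a minimal sketch using
Python 3's urllib.parse, whose quote() and unquote() accept an encoding
argument. The sample string and the two charsets are assumptions chosen
for illustration, not taken from any real page.

```python
from urllib.parse import quote, unquote

# Hypothetical form value; the page charset decides how it is percent-quoted.
text = "café"

utf8_quoted = quote(text, encoding="utf-8")      # bytes C3 A9 for "é"
latin1_quoted = quote(text, encoding="latin-1")  # byte E9 for "é"

# Unquoting with the charset the page declared round-trips cleanly.
utf8_roundtrip = unquote(utf8_quoted, encoding="utf-8")
latin1_roundtrip = unquote(latin1_quoted, encoding="latin-1")

# Unquoting with the wrong charset does not raise by default
# (errors="replace"); the undecodable byte becomes U+FFFD.
mismatched = unquote(latin1_quoted, encoding="utf-8")
```

As long as the server knows which charset its own pages declare, it can
pass the matching encoding when unquoting the query part.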

(URL rewriting at the HTTP server level can make this more complicated,
since it can turn a query fragment into a path fragment or vice versa;
however, most modern frameworks alleviate the need for such rewriting,
since they let you specify flexible mapping rules at the framework
level.)

The situation in which we can't control the encoding is when getting
URLs from third-party content (e.g. some Web page which we didn't produce
ourselves, or some link in an e-mail). But in those cases there are fewer
use cases for unquoting the URL rather than using it as-is. The only time
I've wanted to unquote such a URL was to process HTTP referrers in order
to extract which search queries had led people to visit a Web site.
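
That referrer processing might look roughly like the sketch below,
written against Python 3's urllib.parse. The helper name search_terms,
the "q" parameter name, the example host, and the UTF-8 default are all
assumptions; a real search engine may use a different parameter or
charset.

```python
from urllib.parse import urlsplit, parse_qs


def search_terms(referrer, param="q", encoding="utf-8"):
    """Return the first value of `param` in the referrer's query, or None.

    The encoding is a guess, since we did not produce the URL ourselves;
    errors="replace" keeps a wrong guess from raising.
    """
    query = urlsplit(referrer).query
    values = parse_qs(query, encoding=encoding, errors="replace").get(param, [])
    return values[0] if values else None


# Hypothetical referrer from a third-party search page.
terms = search_terms("http://search.example/search?q=python+urllib%20unquote&hl=en")
```

parse_qs() does the unquoting itself and also turns "+" into a space,
so no separate unquote() call is needed here.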

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3300>
_______________________________________