Re: Bug in Python 2.6 urlencode

Terry Reedy Tue, 07 Sep 2010 17:46:21 -0700

On 9/7/2010 3:02 PM, John Nagle wrote:

  There's a bug in Python 2.6's "urllib.urlencode".  If you pass
in a Unicode character outside the ASCII range, instead of it
being encoded properly, an exception is raised.


File "C:\python26\lib\urllib.py", line 1267, in urlencode
v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in
position 0: ordinal not in range(128)

This will probably work in 3.x, because there, "str" converts
to Unicode, and quote_plus can handle Unicode. This is one of
those legacy bugs left from the pre-Unicode era.

There's a workaround. Call urllib.urlencode with a second
parameter of 1. This turns on the optional feature of
accepting tuples in the argument to be encoded, and the
code goes through a newer code path that works.

Is it worth reporting 2.x bugs any more? Or are we in the
version suckage period, where version N is abandonware and
version N+1 isn't deployable yet?

You may report 2.7 bugs, but please verify that the behavior is a bug in 2.7. However, bugs that have been fixed by the switch to unicode for text are unlikely to be fixed a second time in 2.7. You might suggest an enhancement to the doc for urlencode if that workaround is not clear. Or perhaps that workaround suggests that in this case, a fix would not be too difficult, and you can supply a patch.
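
For reference, here is a minimal sketch of the behavior under discussion, written for Python 3 so it can be run today. It is only an illustration, not a proposed patch. The helper name encode_params and its charset parameter are hypothetical. On 2.x the import would be "from urllib import urlencode" and the text type would be unicode rather than str. The idea is that text values are encoded to UTF-8 bytes before urlencode sees them, so nothing falls back to the ASCII codec.

```python
from urllib.parse import urlencode


def encode_params(params, charset="utf-8"):
    """Hypothetical helper: turn text values into bytes, then urlencode."""
    prepared = []
    for key, value in params.items():
        if isinstance(value, str):
            # Encode explicitly instead of relying on a default codec.
            value = value.encode(charset)
        prepared.append((key, value))
    return urlencode(prepared)


# In 3.x, urlencode accepts non-ASCII text directly and uses UTF-8 by default.
direct = urlencode({"c": "\xa9"})

# Pre-encoding the value by hand gives the same query string.
pre_encoded = encode_params({"c": "\xa9"})

print(direct)
print(pre_encoded)
```

Both calls should print c=%C2%A9, the percent-encoded UTF-8 bytes of the copyright sign.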

The basic deployment problem is that people who want to use unicode text also want to use libraries that have not been ported to use unicode text. That is the major issue for many porting projects.


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list
