In article <4c87013f$0$1625$742ec...@news.sonic.net>,
 John Nagle <na...@animats.com> wrote:

> On 9/7/2010 5:43 PM, Terry Reedy wrote:
> > On 9/7/2010 3:02 PM, John Nagle wrote:
> >> There's a bug in Python 2.6's "urllib.urlencode". If you pass
> >> in a Unicode character outside the ASCII range, instead of it
> >> being encoded properly, an exception is raised.
> >>
> >> File "C:\python26\lib\urllib.py", line 1267, in urlencode
> >> v = quote_plus(str(v))
> >> UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in
> >> position 0: ordinal not in range(128)
> >>
> >> This will probably work in 3.x, because there, "str" converts
> >> to Unicode, and quote_plus can handle Unicode. This is one of
> >> those legacy bugs left from the pre-Unicode era.
> >>
> >> There's a workaround. Call urllib.urlencode with a second
> >> parameter of 1. This turns on the optional feature of
> >> accepting tuples in the argument to be encoded, and the
> >> code goes through a newer code path that works.
> >>
> >> Is it worth reporting 2.x bugs any more? Or are we in the
> >> version suckage period, where version N is abandonware and
> >> version N+1 isn't deployable yet.
> >
> > You may report 2.7 bugs, but please verify that the behavior is a bug in
> > 2.7. However, bugs that have been fixed by the switch to
> > unicode for text are unlikely to be fixed a second time in 2.7. You
> > might suggest an enhancement to the doc for urlencode if that workaround
> > is not clear. Or perhaps that workaround suggests that in this case, a
> > fix would not be too difficult, and you can supply a patch.
> >
> > The basic deployment problem is that people who want to use unicode text
> > also want to use libraries that have not been ported to use unicode
> > text. That is the major issue for many porting projects.
>
> In other words, we're in the version suckage period.

It took me all of one minute to find where a similar issue was reported
previously (http://bugs.python.org/issue1349732). One of the comments on
the issue explains how to use the "doseq" form and an explicit encode to
handle Unicode items. As far as I can see, that part of the suggestion
never made it into the documentation.

I'm sure if you make a specific doc change suggestion, it will be
incorporated into the 2.7 docs. If you think a code change is needed,
suggest a specific patch.

-- 
 Ned Deily,
 n...@acm.org
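
For reference, a minimal sketch of the "doseq" form combined with an
explicit encode, as described above. It is not taken from the bug report:
the helper name encode_pairs is hypothetical, UTF-8 is an assumed choice of
encoding, and the import fallback is only there so the same snippet also
runs on 3.x.

```python
# Sketch only: encode_pairs() is a hypothetical helper, and UTF-8 is an
# assumed encoding; neither comes from the bug report.
try:
    from urllib import urlencode        # Python 2.x
except ImportError:
    from urllib.parse import urlencode  # Python 3.x

text_type = type(u"")  # unicode on 2.x, str on 3.x


def encode_pairs(pairs, encoding="utf-8"):
    """Return (key, value) pairs with text items encoded to byte strings."""
    encoded = []
    for key, value in pairs:
        if isinstance(key, text_type):
            key = key.encode(encoding)
        if isinstance(value, text_type):
            value = value.encode(encoding)
        encoded.append((key, value))
    return encoded


params = [(u"q", u"\xa9 2010")]
# Encode first, then use the doseq form of urlencode.
query = urlencode(encode_pairs(params), doseq=True)
print(query)  # q=%C2%A9+2010
```

Encoding before the call means urlencode only ever sees byte strings, so
the choice of encoding is made by the caller rather than by the default
ASCII codec that raised the exception in the traceback.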