New submission from Dan Mahn <dan.m...@digidescorp.com>:

urllib.parse.urlencode() relies on quote_plus() to build a complete
query string, but it does not take full advantage of the flexibility
built into quote_plus().  Namely:

1) Instances of type "bytes" are not encoded properly, because str() is
applied to each value before it is passed to quote_plus().  This
produces a nonsensical string such as "b'1234'", whereas quote_plus()
handles bytes correctly if they are passed intact.  The ability to
encode this type is particularly useful for putting binary data into
the query string, or for pre-encoded text that you want to send in a
non-standard character encoding.

2) Sometimes it is desirable to encode query strings entirely in
"latin-1" or possibly "ascii" instead of "utf-8".  Having urlencode()
accept the extra parameters now present on quote_plus() (encoding and
errors) and pass them through would provide that functionality.
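
To make the two points above concrete, here is a minimal sketch that
calls quote_plus() directly.  The sample values are illustrative
assumptions, not taken from the attached file.  It shows what happens
when str() is applied to a bytes object first, compared with passing
the bytes intact, and how the encoding argument changes the result for
str input.

```python
from urllib.parse import quote_plus

# Point 1: str() on a bytes object yields its repr, which then gets quoted.
mangled = quote_plus(str(b"1234"))   # the quotes of the repr become %27
intact = quote_plus(b"1234")         # bytes passed through unchanged
binary = quote_plus(b"\xff\x00")     # arbitrary bytes are percent-encoded

# Point 2: quote_plus() already accepts an encoding for str input.
as_utf8 = quote_plus("caf\u00e9")                       # default is utf-8
as_latin1 = quote_plus("caf\u00e9", encoding="latin-1")

print(mangled)    # b%271234%27
print(intact)     # 1234
print(binary)     # %FF%00
print(as_utf8)    # caf%C3%A9
print(as_latin1)  # caf%E9
```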

I have attached a new version of urlencode() that provides both of the
above fixes and enhancements.  Additionally, an unused code path in the
existing function has been removed.  Some doctests are included as
well.
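
For readers without access to the attachment, the following is a rough
sketch of the general idea only.  It is not the contents of
new_urlencode.py; the name urlencode_sketch and its signature are
hypothetical, and sequence-valued parameters (doseq) are left out for
brevity.  Bytes are handed to quote_plus() untouched, and everything
else is converted with str() and quoted using the caller's encoding.

```python
from urllib.parse import quote_plus


def urlencode_sketch(query, safe="", encoding=None, errors=None):
    """Hypothetical helper: encode a mapping or a sequence of pairs."""
    if hasattr(query, "items"):
        query = query.items()

    def encode_part(part):
        # Bytes go to quote_plus() intact; encoding does not apply to them.
        if isinstance(part, bytes):
            return quote_plus(part, safe)
        # Anything else is stringified, then quoted with the given encoding.
        return quote_plus(str(part), safe, encoding, errors)

    return "&".join(
        encode_part(key) + "=" + encode_part(value) for key, value in query
    )


pairs = [("data", b"\xff\x00"), ("name", "caf\u00e9"), ("n", 42)]
print(urlencode_sketch(pairs, encoding="latin-1"))
# data=%FF%00&name=caf%E9&n=42
```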

----------
components: Library (Lib)
files: new_urlencode.py
message_count: 1.0
messages: 83434
nosy: dmahn
nosy_count: 1.0
severity: normal
status: open
title: urlencode does not handle "bytes", and could easily handle alternate encodings
type: behavior
versions: Python 3.0, Python 3.1
Added file: http://bugs.python.org/file13294/new_urlencode.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5468>
_______________________________________