New submission from Emil Stenström:

This issue is in response to this thread on python-ideas: 
https://mail.python.org/pipermail/python-ideas/2016-January/037678.html

Note that Cory did a lot of encoding background work here:
https://mail.python.org/pipermail/python-ideas/2016-January/037680.html

---
Bug description:

When posting an unencoded unicode string directly with python-requests you get 
the following stacktrace:

import requests
r = requests.post("http://example.com";, data="Celebrate 🎉") 
...
  File "../lib/python3.4/http/client.py", line 1127, in _send_request
    body = body.encode('iso-8859-1')
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 14-15: 
ordinal not in range(256) 

This is because requests uses http.client, and http.client assumes the encoding 
to be latin-1 if given a unicode string. This is a very common source of bugs 
for beginners who assume sending in unicode would automatically encode it in 
utf-8, like in the libraries of many other languages.

The simplest fix here is to catch the UnicodeEncodeError and improve the error 
message to something that points beginners in the right direction.

Another option would be to:
- Keep encoding in latin-1 first, and if that fails try utf-8

Other possible solutions (that would be backwards incompatible) includes:
- Changing the default encoding to utf-8 instead of latin-1
- Detect an unencoded unicode string and fail without encoding it with a 
descriptive error message

---

Just to show that this is a problem that exists in the wild, here are a few 
examples that all crashes on the same line in http.client (not all going 
through the requests library:

- https://github.com/kennethreitz/requests/issues/2838
- https://github.com/kennethreitz/requests/issues/1822
- 
http://stackoverflow.com/questions/34618149/post-unicode-string-to-web-service-using-python-requests-library
- 
https://www.reddit.com/r/learnpython/comments/3violw/unicodeencodeerror_when_searching_ebay_with/
- https://github.com/codecov/codecov-python/issues/35
- https://github.com/google/google-api-python-client/issues/145
- https://bugs.launchpad.net/ubuntu/+source/lazr.restfulclient/+bug/1414063

----------
components: Unicode
messages: 257721
nosy: Emil Stenström, ezio.melotti, haypo
priority: normal
severity: normal
status: open
title: Improve error message for http.client when posting unicode string
type: enhancement
versions: Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26045>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to