Pierre Quentel <pierre.quen...@gmail.com> added the comment:

@Glenn
"I'm curious what your system (probably Windows since you mention cp-) and browser, and HTTP server is, that you used for that test. Is it possible to capture the data stream for that test? Describe how, and at what stage the data stream was captured, if you can capture it. Most interesting would be on the interface between browser and HTTP server."

I tested it on Windows XP Family Edition 2002, Service Pack 3, with Python 3.2b2.

Browsers: Mozilla Firefox 3.6.13 and Internet Explorer 7.0
Servers: Apache 2.2, and the built-in server started by:

    import http.server
    http.server.test(HandlerClass=http.server.CGIHTTPRequestHandler)

I print the bytes received in the multipart/form-data part with "print(odelim+line)" at the end of the method read_lines_to_outerboundary() of FieldStorage. The bytes sent when I enter the string "a" + "n tilde" + the euro sign are: b'a\xf1\x80', that is, the cp-1252 encoding of the string.

Since it works the same with 2 browsers and 2 web servers, I'm almost sure it's not dependent on the configuration, but if others can test on different configurations I'd like to know the result.

Basically, this behaviour is not surprising: if sys.stdin.encoding is set to a certain value, it's natural that the bytes sent on the binary layer are encoded with this encoding, not with latin-1.

I attach the diff file for an updated version of cgi.py:
- new argument stream_encoding instead of setting an attribute "encoding" to fp
- use locale.getpreferredencoding() to decode the query string

----------
Added file: http://bugs.python.org/file20356/cgi_diff_20110111.txt

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4953>
_______________________________________
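
For anyone who wants to check the byte values quoted above without setting up a browser and server, here is a minimal sketch. It only exercises the standard codecs, not cgi.py itself; the variable names are illustrative and are not taken from the attached diff. It shows that cp-1252 produces b'a\xf1\x80' for this string, that latin-1 cannot encode the euro sign at all, and that decoding the same bytes as latin-1 silently gives a different string.

```python
import locale

# "a" + "n tilde" + the euro sign, written with escapes to avoid
# depending on the encoding of this source file
text = "a\u00f1\u20ac"

# cp-1252 maps n tilde to 0xF1 and the euro sign to 0x80
cp1252_bytes = text.encode("cp1252")
print(cp1252_bytes)

# latin-1 has no euro sign, so encoding the same string fails
try:
    text.encode("latin-1")
    latin1_can_encode = True
except UnicodeEncodeError:
    latin1_can_encode = False
print(latin1_can_encode)

# Decoding the cp-1252 bytes as latin-1 does not raise, but 0x80
# becomes the control character U+0080 instead of the euro sign
decoded_as_latin1 = cp1252_bytes.decode("latin-1")
print(decoded_as_latin1 == text)

# The encoding the proposed patch would use for the query string;
# the value printed depends on the machine this runs on
print(locale.getpreferredencoding())
```

The last line is the reason the result may differ between configurations: on a machine whose preferred encoding is not cp-1252, the same bytes would be decoded differently.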