Pierre Quentel <pierre.quen...@gmail.com> added the comment:

@Glenn
"I'm curious what your system (probably Windows since you mention cp-) and 
browser, and HTTP server is, that you used for that test.  Is it possible to 
capture the data stream for that test?  Describe how, and at what stage the 
data stream was captured, if you can capture it.  Most interesting would be on 
the interface between browser and HTTP server."

I tested it on Windows XP Family Edition 2002, Service Pack 3, with Python 3.2b2
Browsers : Mozilla Firefox 3.6.13 and Internet Explorer 7.0
Servers : Apache 2.2, and the built-in server started by :

import http.server
http.server.test(HandlerClass=http.server.CGIHTTPRequestHandler)

I print the bytes received in the multipart/form-data part with
"print(odelim+line)" at the end of the method read_lines_to_outerboundary() of
FieldStorage. The bytes sent when I enter the string
    "a"+"n tilde" + the euro sign 
are : b'a\xf1\x80' - that is, the cp-1252 encoding of the string
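
The observation above can be reproduced without a browser or a server. The
snippet below is only a sketch of the encoding arithmetic, not of what cgi.py
does internally; the variable names are made up for the illustration.

```python
# The string typed into the form field: "a", n with tilde, euro sign.
text = "a\u00f1\u20ac"

# Encoding it as cp1252 gives exactly the bytes seen in the request body.
cp1252_bytes = text.encode("cp1252")
print(cp1252_bytes)

# latin-1 has no euro sign, so the same string cannot be encoded with it.
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Decoding the received bytes as latin-1 does not fail, but byte 0x80 becomes
# a control character (U+0080) instead of the euro sign.
wrong = cp1252_bytes.decode("latin-1")
right = cp1252_bytes.decode("cp1252")
print(right == text, wrong == text)
```

This is why decoding the multipart data as latin-1 silently returns the wrong
character for the euro sign rather than raising an error.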

Since it works the same with 2 browsers and 2 web servers, I'm almost sure it's 
not dependent on the configuration - but if others can test on different 
configurations I'd like to know the result

Basically, this behaviour is not surprising : if sys.stdin.encoding is set to a 
certain value, it's natural that the bytes sent on the binary layer are encoded 
with this encoding, not with latin-1

I attach the diff file for an updated version of cgi.py :
- new argument stream_encoding instead of setting an attribute "encoding" to fp
- use locale.getpreferredencoding() to decode the query string
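
As a rough illustration of the second point, here is how a query string could
be decoded with the locale's preferred encoding. This is a sketch under
assumptions, not the code in the attached diff: decode_query_string is a
hypothetical helper, and the sample query string is made up.

```python
import locale
from urllib.parse import parse_qs


def decode_query_string(qs, encoding=None):
    """Hypothetical helper, not part of cgi.py.

    Parses a percent-encoded query string, decoding the bytes with the
    given encoding, or with the locale's preferred encoding by default.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    return parse_qs(qs, keep_blank_values=True,
                    encoding=encoding, errors="replace")


# Assumed example: the same three characters, percent-encoded from cp1252.
fields = decode_query_string("name=a%F1%80", encoding="cp1252")
print(fields)
```

The encoding is passed explicitly in the example so that the result does not
depend on the machine it runs on; leaving it out falls back to
locale.getpreferredencoding().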

----------
Added file: http://bugs.python.org/file20356/cgi_diff_20110111.txt

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4953>
_______________________________________
