STINNER Victor <[email protected]> added the comment:
I tested cgi_32.patch on Windows with Apache:
- a test with a binary file works: I get a binary file instead of a text file
- a test with a non-ASCII character (a\xe9b) works: the text is correctly
decoded
I used the test script from full_source_and_error.zip.
Comments on cgi_32.patch:
- I don't understand why FieldStorage changes sys.stdout and sys.stderr (see
remarks about IOMix above): please remove the charset argument (it is also
confusing to have two encoding arguments). it should be done somewhere else
- please remove the O_BINARY hack: the patch is for Python 3.2 and I closed
issue #10841. If you would like a backport, another patch should be written
later
- "encoding = 'latin-1' # ?": write a real comment or remove it
- 'self.fp.read(...) # bytes': you should add a test on the type if you are
not sure that fp.read() gives bytes
- "file: the file(-like) object from which you can read the data *as bytes*":
you should mention that TextIOWrapper is also tolerated (accepted?)
- you may set fp directly to sys.stdin.buffer (instead of sys.stdin) if fp is
None (it will be easier after removing the O_BINARY thing)
- the patch adds a tab in an empty line, please don't do that :-)
- you should add a (private?) attribute to FieldStorage to decide if it works
on bytes or unicode, instead of using "self.filename is not None" test (eg.
self._use_bytes = (self.filename is not None)
- i don't like the idea of having a generic self.__write() method supporting
bytes and unicode. it would prefer two methods, eg. self.__write_text() and
self.__write_binary() (they can share a third private method)
- i don't like "stream_encoding" name: what is the "stream" here? do you
process a "file", a "string" or a "stream"? why not just "self.encoding"?
- "import email.parser,email.feedparser" one import is useless here. I prefer
"from email.feedparser import FeedParser" because you get directly a
ImportError if the symbol is missing. And it's already faster to get FeedParser
instead of email.feedparser.FeedParser in a loop (dummy micro-optimization)
- even I like the following change, please do it in a separated patch:
- if type(value) is type([]):
+ if isinstance(value,list):
I really don't like the IOMix thing:
- sys.stdout.write() should not accept bytes
- FieldStorage should not replace sys.stdout and sys.stderr: if you want to
set the encoding of these files, set PYTHONIOENCODING environment variable
before running your program (it changes also the encoding of sys.stdio)
- IOMix should not accept bytes *and* unicode. It's better to have an explicit
API like stdout.write('unicode') and stdout.buffer.write(b'bytes)
Most parts of the patch are correct and fix real bugs. Since cgi is broken
currently (eg. it doesn't handle binary files correctly), anything making the
situation better would be nice.
I vote +0 to commit the patch now (if the release manager agrees), and +1 if
all of my remarks are fixed.
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue4953>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com