Antoine Pitrou <[EMAIL PROTECTED]> added the comment:

> Just to be clear, I am at present totally confused about io streams :-)

Python 3.0 distinguishes more clearly between unicode strings (called "str"
in 3.0) and byte strings (called "bytes" in 3.0). The most important
point is that there is no longer any implicit conversion between the
two: you must explicitly use .encode() or .decode().
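
To make that concrete, here is a minimal sketch of the explicit
conversion. The sample string and the choice of UTF-8 are only
illustrative assumptions, not something taken from the report.

```python
# Illustrative values only: the sample text and the UTF-8 codec are assumptions.
text = "café"                      # a unicode string ("str" in 3.x)
data = text.encode("utf-8")        # explicit str -> bytes
round_trip = data.decode("utf-8")  # explicit bytes -> str

# Mixing the two types is an error rather than an implicit conversion.
try:
    text + data
except TypeError as exc:
    mixed_error = exc
else:
    mixed_error = None
```

The same TypeError shows up whenever a library written for one type is
handed the other.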

Files opened in binary ("rb") mode return byte strings, but files
opened in text ("r") mode return unicode strings, which means you can't
give a text file to a 3.0 library expecting a binary file, or vice versa.
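
A small sketch of the difference between the two modes follows. The
helper name read_both_modes is hypothetical, and the temporary file and
UTF-8 encoding are assumptions made for the sake of a self-contained
example.

```python
import os
import tempfile


def read_both_modes(payload: bytes):
    """Write payload to a temporary file, then read it back in both modes.

    Hypothetical helper for illustration; assumes the payload is valid UTF-8.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        with open(path, "rb") as f:                    # binary mode -> bytes
            as_bytes = f.read()
        with open(path, "r", encoding="utf-8") as f:   # text mode -> str
            as_text = f.read()
    finally:
        os.remove(path)
    return as_bytes, as_text


raw, decoded = read_both_modes("héllo".encode("utf-8"))
```

Here raw is a bytes object and decoded is a str, so code expecting one
cannot be handed the other without an explicit conversion.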

What is more worrying is that XML, until decoded, should be considered a
byte stream, so sax.parser should accept binary files rather than text
files. I took a look at test_sax and indeed it considers XML as text
rather than bytes :-(
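
For illustration, this is roughly what feeding a byte stream to the
parser looks like. It is a sketch that assumes a current Python 3
release, where xml.sax.parse() accepts a binary file object; it says
nothing about how the 3.0 betas behave. The TextCollector class and the
sample document are hypothetical.

```python
import io
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The parser may deliver text in several pieces, so collect them all.
        self.chunks.append(content)


# Raw bytes: the encoding is declared inside the document itself,
# so the parser (not the caller) is the one that decodes it.
xml_bytes = b'<?xml version="1.0" encoding="iso-8859-1"?><root>caf\xe9</root>'

handler = TextCollector()
xml.sax.parse(io.BytesIO(xml_bytes), handler)
parsed_text = "".join(handler.chunks)
```

Because the document is passed as bytes, the encoding declaration in the
XML prolog decides how it is decoded, which is the point of treating XML
as a byte stream.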

Bumping this as critical because it needs a decision very soon (ideally
before beta3).

----------
nosy: +pitrou
priority:  -> critical
title: sax.parser hangs on byte streams -> sax.parser considers XML as text 
rather than bytes

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3590>
_______________________________________