[issue20409] .readline() returned garble text

R. David Murray Mon, 27 Jan 2014 09:29:11 -0800

R. David Murray added the comment:

The two files use different encodings.  In the first case, I believe the first 
two bytes (which don't appear in the second example) are the BOM.  I'm not an 
expert, but I believe it is a utf-16 file (hence all the \x00 bytes).  The 
second file is presumably utf-8, with no BOM.  Notepad++ detects both 
automatically.  In Python, you have to tell it to look for the BOM by 
specifying the appropriate codec in the open call.  This is because Python's 
philosophy is to not guess at the encoding of files (though it does have a 
default encoding, which depends on the platform and locale and is often utf-8).
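
As a rough illustration of passing the codec to open, here is a minimal 
sketch.  The file names, the sample text, and the read_first_line helper are 
made up for the example and are not taken from the report; it also assumes 
the first file really is utf-16 with a BOM, which I have not verified.

```python
import os
import tempfile


def read_first_line(path, encoding):
    """Return the first line of `path`, decoded with an explicitly named codec."""
    with open(path, encoding=encoding) as f:
        return f.readline()


# Hypothetical sample data standing in for the two files in the report.
text = "hello\n"
tmpdir = tempfile.mkdtemp()
utf16_path = os.path.join(tmpdir, "utf16.txt")
utf8_path = os.path.join(tmpdir, "utf8.txt")

with open(utf16_path, "wb") as f:
    f.write(text.encode("utf-16"))  # the "utf-16" codec writes a BOM first
with open(utf8_path, "wb") as f:
    f.write(text.encode("utf-8"))   # plain utf-8, no BOM

# Looking at the raw bytes shows the BOM and the \x00 padding bytes.
with open(utf16_path, "rb") as f:
    raw = f.read()
print(raw)

# "utf-16" consumes the BOM and uses it to pick the byte order.
utf16_line = read_first_line(utf16_path, "utf-16")

# "utf-8-sig" strips a utf-8 BOM if one is present and is harmless otherwise.
utf8_line = read_first_line(utf8_path, "utf-8-sig")

print(repr(utf16_line), repr(utf8_line))
```

Reading the utf-16 file without naming the codec decodes the BOM and the 
\x00 bytes with whatever the default encoding happens to be, which is a 
likely source of the garbled text.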


Questions like this are better directed to the python-list mailing list, by the 
way.

----------
nosy: +r.david.murray
resolution:  -> invalid
stage:  -> committed/rejected
status: open -> closed

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue20409>
_______________________________________