codecs.open on Win32 -- converting my newlines to CR+LF

Ryan McGuire Wed, 26 Aug 2009 19:56:28 -0700

I've got a UTF-8 encoded text file from Linux with standard newlines
("\n").


I'm reading this file on Win32 with Python 2.6:

codecs.open("whatever.txt","r","utf-8").read()

Inexplicably, all the newlines ("\n") are replaced with CR+LF
("\r\n") ... Why?

As a workaround I'm having to do this:

open("whatever.txt","r").read().decode("utf-8")

which appropriately does not alter my newlines.
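One way to tell which of the two reads reflects what is actually on disk is to look at the raw bytes. The following is only a minimal sketch; the helper name count_line_endings is made up for illustration, and it assumes the file fits in memory:

```python
import io


def count_line_endings(path):
    """Return (crlf, bare_lf) counts taken from the raw bytes of `path`.

    Hypothetical helper, not part of any library.
    """
    # Binary mode: the bytes come back exactly as stored, on any platform.
    with io.open(path, "rb") as f:
        data = f.read()
    crlf = data.count(b"\r\n")
    # Every CR+LF also contains an LF, so subtract those to get bare LFs.
    bare_lf = data.count(b"\n") - crlf
    return crlf, bare_lf
```

Calling count_line_endings("whatever.txt") on the Win32 copy of the file would show whether any CR+LF pairs are stored in the file itself.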

What really confuses me, though, is the Python documentation for
codecs.open:

"Files are always opened in binary mode, even if no binary mode was
specified. This is done to avoid data loss due to encodings using
8-bit values. This means that no automatic conversion of '\n' is done
on reading and writing."

The way I read that, codecs.open should not touch my newlines. What am
I doing wrong? Is this a bug in Python, or in the docs, or both?
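To test that reading of the docs independently of my file, here is a rough sketch that writes known bytes to a temporary file and reads them back two ways. The function name read_both_ways is hypothetical, and io.open in text mode stands in for a newline-translating text-mode read (an assumption; it is not the same call as the built-in open in my workaround):

```python
import codecs
import io
import os
import tempfile


def read_both_ways(raw_bytes):
    """Write raw_bytes to a temp file, then read it back two ways.

    Hypothetical helper; returns (via_codecs, via_text_mode).
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw_bytes)
        # codecs.open opens the file in binary mode underneath,
        # so read() decodes the bytes without touching line endings.
        with codecs.open(path, "r", "utf-8") as f:
            via_codecs = f.read()
        # io.open in text mode translates "\r\n" to "\n" by default
        # (universal newlines).
        with io.open(path, "r", encoding="utf-8") as f:
            via_text_mode = f.read()
    finally:
        os.remove(path)
    return via_codecs, via_text_mode
```

If the bytes passed in contain CR+LF, the first result keeps them and the second does not, which would match the docs rather than contradict them.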
-- 
http://mail.python.org/mailman/listinfo/python-list
