Skip Montanaro added the comment:

Thanks. The display you showed looks about like what I saw in LibreOffice. If 
you export it back to another CSV file, does the new file match the original 
exactly, or does it (like LibreOffice) save a file without NUL bytes?

I don't mind having the discussion, but even though we have traditionally 
treated CSV files as binary files in Python (at least when I was closely 
involved in the 2.x days), that was mostly so end-of-line sequences weren't 
corrupted. As others have pointed out, in 2.x Python string objects stored the 
data as a normal NUL-terminated pointer-to-char for efficiency when interacting 
with C libraries. C uses NUL as a string terminator, so we couldn't work with 
embedded NULs. I haven't looked at the 3.x string stuff (I know Unicode is much 
more intimately involved). If it still maintains that close working 
relationship with the typical C strings, supporting NUL bytes will be 
problematic.

In cases where the underlying representation isn't quite what I want, I've been 
able to get away with a file wrapper which suitably mangles the input before 
passing it up the chain to the csv module. For example, the __next__ method of 
your file wrapper could delete NULs or replace them with something suitably 
innocuous, like "\001", or some other non-printable character you are certain 
won't be in the input. If you want to preserve NULs, reverse the translation 
during the write().
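
A minimal sketch of such a pair of wrappers, assuming text-mode files and 
that "\001" never occurs in the real data. The names NulMangler, NulRestorer 
and PLACEHOLDER are made up for illustration; they are not part of the csv 
module.

```python
import csv
import io

NUL = "\0"
PLACEHOLDER = "\x01"  # assumption: this character never appears in the input


class NulMangler:
    """Iterator wrapper that hides NULs from csv.reader (hypothetical name)."""

    def __init__(self, fileobj):
        self._lines = iter(fileobj)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the underlying iterator propagates unchanged.
        return next(self._lines).replace(NUL, PLACEHOLDER)


class NulRestorer:
    """File-like wrapper that puts NULs back on write() (hypothetical name)."""

    def __init__(self, fileobj):
        self._fileobj = fileobj

    def write(self, text):
        # Reverse the translation applied on the way in.
        return self._fileobj.write(text.replace(PLACEHOLDER, NUL))


# In-memory stand-ins for real files opened with newline="".
source = io.StringIO("a,b\0c,d\r\n", newline="")
rows = list(csv.reader(NulMangler(source)))

sink = io.StringIO(newline="")
csv.writer(NulRestorer(sink)).writerows(rows)
```

Between the read and the write, the rows carry the placeholder rather than 
NUL, so any processing in the middle has to tolerate that.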

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27580>
_______________________________________