Re: Python 3.2 has some deadly infection

Chris Angelico Tue, 03 Jun 2014 09:46:44 -0700

On Wed, Jun 4, 2014 at 2:34 AM, Steven D'Aprano
<[email protected]> wrote:
> Outside of those three kinds of files, I would expect that *by far* the
> single largest kind of file is text. Some text is wrapped in a binary
> layer, e.g. .doc, .odt, etc. but an awful lot of it is good old human
> readable text, including web pages (html) and XML.


In terms of file I/O in Python, text wrapped in a binary layer has to
be treated as binary, not text. There's no difference between a JPEG
file that has some textual EXIF information and an ODT file that's a
whole lot of zipped up text; both of them have to be read as binary,
then unpacked according to the container's specs, and then the text
portion decoded according to an encoding like UTF-8.
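
To make that concrete, here's a rough sketch of the three steps (read as
binary, unpack the container, decode the text). It's only an illustration:
the helper names build_container and read_text_from_container are made up
for this example, and the in-memory zip with a single content.xml member is
a stand-in for a real ODT file, which has more structure than this.

```python
import io
import zipfile


def build_container(text, member="content.xml"):
    """Build a tiny zip archive in memory holding UTF-8 encoded text.

    Hypothetical stand-in for an ODT-style file; a real one has more members.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(member, text.encode("utf-8"))
    return buf.getvalue()


def read_text_from_container(raw, member="content.xml", encoding="utf-8"):
    """Unpack one member from the binary container, then decode it."""
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        payload = zf.read(member)      # still bytes at this point
    return payload.decode(encoding)    # only now does it become str


original = "<p>na\u00efve caf\u00e9</p>"
raw = build_container(original)        # bytes, as if read with open(..., "rb")
recovered = read_text_from_container(raw)
```

With a file on disk, the only change would be getting raw from
open(path, "rb").read() instead of building it in memory. Opening such a
file in text mode would try to decode the compressed bytes directly, which
is not what you want.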

But you're quite right that a large proportion of files out there
really are text.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
