Christian Heimes wrote:
Lloyd Zusman wrote:
.... The -T and -B switches work as follows. The first block or so
.... of the file is examined for odd characters such as strange control
.... codes or characters with the high bit set. If too many strange
.... characters (>30%) are found, it's a -B file; otherwise it's a -T
.... file. Also, any file containing null in the first block is
.... considered a binary file. [ ... ]

That's a butt-ugly heuristic that will lead to lots of false positives
if your text happens to be UTF-16 encoded, or non-English text encoded as UTF-8.

...or non-English Latin-1 text...
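
For anyone who wants to see the failure mode, here is a rough Python sketch of the heuristic as quoted above. It is not Perl's actual implementation: the function name looks_binary, the set of control codes I treat as harmless, and the choice to call an empty block "text" are all my own assumptions.

```python
# Rough sketch of the quoted -T/-B heuristic; NOT Perl's real implementation.
# Assumption: tab, LF, CR, FF and BS are the only "harmless" control codes.
HARMLESS_CONTROLS = frozenset(b"\t\n\r\f\b")


def looks_binary(block: bytes, threshold: float = 0.30) -> bool:
    """Guess whether the first block of a file is binary."""
    if not block:
        return False  # assumption: an empty block counts as text
    if b"\x00" in block:
        return True  # any null byte means binary, per the quoted rule
    # "Strange" bytes: other control codes, DEL, or anything with the high bit set.
    strange = sum(
        1
        for b in block
        if (b < 0x20 and b not in HARMLESS_CONTROLS) or b >= 0x7F
    )
    return strange / len(block) > threshold


print(looks_binary(b"hello world\n"))                # plain ASCII
print(looks_binary("hello".encode("utf-16")))        # UTF-16 contains null bytes
print(looks_binary("Привет, мир".encode("utf-8")))   # Cyrillic is mostly high-bit bytes
```

The last two calls are perfectly good text, yet both come back flagged as binary, which is exactly the false positive described above.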
--
http://mail.python.org/mailman/listinfo/python-list