Lloyd Zusman wrote:
> .... The -T and -B switches work as follows. The first block or so
> .... of the file is examined for odd characters such as strange control
> .... codes or characters with the high bit set. If too many strange
> .... characters (>30%) are found, it's a -B file; otherwise it's a -T
> .... file. Also, any file containing null in the first block is
> .... considered a binary file. [ ... ]

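For reference, the rule quoted above could be sketched in Python roughly like this. It is only a reading of the quoted description, not Perl's actual implementation: the 512-byte block size, the set of control bytes counted as ordinary text, the handling of empty input, and the names `looks_binary` and `is_binary_file` are all assumptions made for illustration.

```python
BLOCK_SIZE = 512   # assumption: the quote only says "the first block or so"
THRESHOLD = 0.30   # from the quote: more than 30% odd characters means binary

# Assumption: these control bytes count as ordinary text
# (tab, newline, carriage return, form feed, backspace).
_TEXT_CONTROLS = frozenset(b"\t\n\r\f\b")


def looks_binary(block: bytes) -> bool:
    """Apply the quoted heuristic to an already-read block of bytes."""
    if not block:
        return False  # assumption: treat empty input as text
    if b"\x00" in block:
        return True   # any null byte in the first block means binary
    # "Odd" bytes: control codes outside the whitelist, DEL, or high bit set.
    odd = sum(
        1 for b in block
        if (b < 0x20 and b not in _TEXT_CONTROLS) or b >= 0x7F
    )
    return odd / len(block) > THRESHOLD


def is_binary_file(path: str) -> bool:
    """Read the first block of a file and classify it."""
    with open(path, "rb") as f:
        return looks_binary(f.read(BLOCK_SIZE))


print(looks_binary(b"hello world\n"))                # False
print(looks_binary("hello world".encode("utf-16")))  # True: null bytes
print(looks_binary("Привет".encode("utf-8")))        # True: all high-bit bytes
```

The last two calls are perfectly good text, yet the sketch classifies both as binary.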
That's a butt-ugly heuristic. It will lead to lots of false positives (text files flagged as binary) if your text happens to be UTF-16 encoded, where most ASCII characters carry a null byte, or if it is non-English text encoded as UTF-8, where every byte of a multi-byte character has the high bit set.

Christian