On Fri, Feb 12, 2010 at 6:01 AM, Lloyd Zusman <l...@asfast.com> wrote:
> Perl has the following constructs to check whether a file is considered
> to contain "text" or "binary" data:
>
> if (-T $filename) { print "file contains 'text' characters\n"; }
> if (-B $filename) { print "file contains 'binary' characters\n"; }
>
> Is there already a Python analog to these? I'm happy to write them on
> my own if no such constructs currently exist, but before I start, I'd
> like to make sure that I'm not "re-inventing the wheel".
>
> By the way, here's what the perl docs say about these constructs. I'm
> looking for something similar in Python:
>
> ... The -T and -B switches work as follows. The first block or so
> ... of the file is examined for odd characters such as strange control
> ... codes or characters with the high bit set. If too many strange
> ... characters (>30%) are found, it's a -B file; otherwise it's a -T
> ... file. Also, any file containing null in the first block is
> ... considered a binary file. [ ... ]
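
For reference, the heuristic described in the quoted Perl docs could be sketched in Python roughly as follows. This is only an approximation under stated assumptions, not a port of Perl's actual implementation: the function names, the 512-byte block size, the set of control characters counted as ordinary text, and the treatment of an empty file as text are all my own choices.

```python
# Rough sketch of the -T / -B heuristic as described in the quoted docs.
# Assumptions (not taken from Perl's source): BLOCK_SIZE, TEXT_BYTES,
# and treating an empty block as text.

BLOCK_SIZE = 512  # assumed size of "the first block or so"

# Printable ASCII plus a few common control characters
# (BEL, BS, TAB, LF, FF, CR, ESC) are counted as ordinary text.
TEXT_BYTES = frozenset(range(0x20, 0x7F)) | {0x07, 0x08, 0x09, 0x0A, 0x0C, 0x0D, 0x1B}


def block_looks_binary(block, threshold=0.30):
    """Return True if a bytes block looks binary under the heuristic."""
    if not block:
        return False  # assumption: nothing odd found, so call it text
    if b"\x00" in block:
        return True  # any null byte means binary
    # Count "odd" bytes: other control codes and bytes with the high bit set.
    odd = sum(1 for byte in block if byte not in TEXT_BYTES)
    return odd / len(block) > threshold


def is_binary_file(path, block_size=BLOCK_SIZE):
    """Apply the heuristic to the first block of the file at path."""
    with open(path, "rb") as f:
        return block_looks_binary(f.read(block_size))


def is_text_file(path, block_size=BLOCK_SIZE):
    """Complement of is_binary_file in this sketch."""
    return not is_binary_file(path, block_size)
```

With these definitions, `is_text_file(filename)` and `is_binary_file(filename)` play roughly the roles of `-T $filename` and `-B $filename`. Note that text in a multi-byte encoding (UTF-16, or non-ASCII-heavy UTF-8) would be flagged as binary by this sketch, which is one reason to treat it as a heuristic only.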
Pray tell, what are the circumstances that lead you to use such a heuristic rather than a more definitive method?

Cheers,
Chris
--
http://blog.rebertia.com