On Wed, Jul 16, 2014 at 10:10 PM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
> Linux, like all Unixes, is primarily a text-based platform. With a few
> exceptions, /etc is filled with text files, not binary files, and half
> the executables on the system are text (Python, Perl, bash, sh, awk,
> etc.).

An interesting assertion. I know "half" is not meant to be an actual
estimate, but out of curiosity, I whipped up a quick script to figure
out just how many of my executables are text and how many aren't.

#!/usr/bin/env python3
import os, subprocess
text = binary = unknown = unreadable = 0
for path in os.environ["PATH"].split(":"):
    for file in os.listdir(path):
        fn = os.path.join(path, file)
        try:
            t = subprocess.check_output(["file", "-L", fn])
        except subprocess.CalledProcessError:
            print("Unreadable: %s" % fn)
            unreadable += 1
            continue
        if isinstance(t, bytes): t = t.decode("ascii")
        # Now to try to figure out what's text and what's binary.
        # Caveat: t includes the file name, so a name that contains
        # "text" (e.g. gettext) will also match the first test.
        if "text" in t:
            # Most Unixes follow the convention of having "text" in
            # the output of all files that can be safely blatted to
            # a terminal - for instance, "ASCII text executable" is
            # used to describe most shell scripts etc; this file is
            # a "Python script, ASCII text executable". If I put in
            # a non-ASCII char, the 'file' description changes to
            # "Python script, UTF-8 Unicode text executable".
            text += 1
        elif "directory" in t:
            # Ignore directories.
            pass
        elif "LSB executable" in t or "LSB shared object" in t:
            binary += 1
        else:
            print(t.strip())
            unknown += 1
print("%d text, %d binary" % (text, binary))
if unknown: print("Also %d unknowns, which are probably binary." % unknown)
if unreadable: print("Plus %d files that couldn't be read." % unreadable)

On my system, it says:

rosuav@sikorsky:~$ python3 exectypes.py
/usr/local/bin/youtube-dl: data
Unreadable: /usr/bin/wine-safe
/usr/bin/mptopdf: LaTeX auxiliary file,
/usr/bin/gvfs-less: Palm OS dynamic library data "#!/bin/sh"
Unreadable: /usr/bin/gserialver
1140 text, 2060 binary
Also 3 unknowns, which are probably binary.
Plus 2 files that couldn't be read.

So a bit more than a third of my executables are text (1140 out of the
3200 that were classified, or about 36%). That's a pretty high
proportion, and not very far off the rough guesstimate of half.
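
For comparison, here is a rougher way to make the same text/binary split without shelling out to 'file': look at the first few bytes of each file. This is only a sketch under stated assumptions, not the script used for the numbers above. The names classify, tally, ELF_MAGIC and SHEBANG are made up for illustration. It counts a file as binary only if it starts with the ELF magic number, and as text only if it starts with "#!", so a script without a shebang line lands in "unknown". Its counts should therefore not be expected to match what 'file' reports.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file
SHEBANG = b"#!"         # interpreter line that starts most scripts


def classify(path):
    """Return 'binary', 'text', 'unknown' or 'unreadable' for one file."""
    try:
        with open(path, "rb") as f:
            head = f.read(4)
    except OSError:
        # Covers permission errors, broken symlinks and the like.
        return "unreadable"
    if head.startswith(ELF_MAGIC):
        return "binary"
    if head.startswith(SHEBANG):
        return "text"
    return "unknown"


def tally(directories):
    """Count classifications for every non-directory entry in directories."""
    counts = {"text": 0, "binary": 0, "unknown": 0, "unreadable": 0}
    for directory in directories:
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # skip PATH entries that don't exist or can't be listed
        for name in names:
            fn = os.path.join(directory, name)
            if os.path.isdir(fn):
                continue  # ignore subdirectories, as the script above does
            counts[classify(fn)] += 1
    return counts
```

Calling tally(os.environ["PATH"].split(os.pathsep)) returns a dict of the four counts for the current PATH. Unlike the script above, it skips PATH entries that cannot be listed rather than raising.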

(And I tried the same script on three other Linuxes I have around the
house, getting broadly the same proportion, although the absolute
counts are quite different.)

ChrisA