On 09/08/18 01:48, MRAB wrote:
> On 2018-08-08 23:16, Thomas Jollans wrote:
>> On *nix, file names are bytes. In real life, we prefer to think of file
>> names as strings. How non-ASCII file names are created is determined by
>> the locale, and on most systems these days, every locale uses UTF-8 and
>> everybody's happy. Of course this doesn't mean you'll never run into an
>> old directory tree from the pre-UTF8 age using some other encoding, and
>> it doesn't prevent people from doing silly things in file names.
>>
>> Python deals with this tolerably well: by convention, file names are
>> strings, but you can use bytes for file names if you wish. The docs [1]
>> warn you about the situation.
>>
>> [1] https://docs.python.org/3/library/os.path.html
>>
>> If Python runs into a non-UTF8 (better: non-decodable) file name and has
>> to return a str, it uses surrogate escape codes. So far so good. Right?
>>
>> This leads to the unfortunate situation that you can't always print()
>> file names, as print() is strict and refuses to toy with surrogates.
>>
>> To be more explicit, the script
>>
>> print(__file__)
>>
>> will fail depending on the file name. This feels wrong... (though every
>> bit of behaviour is correct)
>>
>> (The situation can't arise on Windows, and Python 2 will pretend nothing
>> happened in true UNIX style)
>>
>> Demo script to try at home below.
>>
> [snip]
>
> Is it true that Unix filenames can contain control characters, e.g. \x07?
>
> What happens when you print them out?
>
> I think it's not just a problem with surrogate escapes.

Not a problem (or: not an exception), as those are ASCII and thus UTF-8.

Python 3.6.5 (default, Apr  1 2018, 05:46:30)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> with open('\x07.py', 'w') as fp:
...     fp.write('print(__file__)\n')
...
16
>>> import sys; import subprocess
>>> subprocess.call([sys.executable, '\x07.py'])
.py
0
>>>

As you might expect, it beeped when printing '\x07.py' (and showed .py).
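
For contrast, here is a rough sketch of why the control character goes through while a surrogate-escaped name does not. It assumes a strict UTF-8 stdout; the names raw_name, strict_utf8_ok and printable are made up for illustration, and the Latin-1 bytes are only a stand-in for a pre-UTF-8 file name. The 'backslashreplace' handler is one way to get something printable, not the only one.

```python
# Illustrative sketch only; the names here are hypothetical, not from the thread.

control_name = '\x07.py'        # BEL is plain ASCII, hence valid UTF-8
raw_name = b'caf\xe9.py'        # Latin-1 bytes that are not valid UTF-8

# Undecodable bytes become lone surrogates with the surrogateescape handler,
# which is how Python represents such file names as str on a UTF-8 locale.
escaped_name = raw_name.decode('utf-8', 'surrogateescape')


def strict_utf8_ok(name):
    """Return True if name encodes with errors='strict', as a strict stdout needs."""
    try:
        name.encode('utf-8')
    except UnicodeEncodeError:
        return False
    return True


def printable(name):
    """Return a copy of name with lone surrogates shown as backslash escapes."""
    return name.encode('utf-8', 'backslashreplace').decode('utf-8')


# The escaped form is pure ASCII here, so printing it is safe on any stdout.
print(printable(escaped_name))
```

The original bytes are not lost: escaped_name.encode('utf-8', 'surrogateescape') gives back raw_name, so the escaped str can still be passed to open() and friends even though print() rejects it.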