Chris Angelico <ros...@gmail.com>:

> If you really REALLY can't use the bytes() type to work with something
> that is, yaknow, bytes, then you could use an alternative encoding
> that has a value for every byte. It's still not Unicode text, so it
> doesn't much matter which encoding you use. But it's much better to
> use the bytes type to work with bytes. It is not text, so don't treat
> it as text.
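To make the two options in the quoted advice concrete, here is a minimal sketch. The helper names list_raw_names and latin1_label are hypothetical, introduced only for illustration; the calls they wrap (os.listdir with a bytes path, os.fsencode, and the "latin-1" codec) are standard library.

```python
import os


def list_raw_names(directory):
    """Return the directory's entries as bytes, exactly as stored on disk."""
    # Passing a bytes path makes os.listdir() return bytes names,
    # so no decoding (and no decoding failure) is involved.
    return os.listdir(os.fsencode(directory))


def latin1_label(raw_name):
    """Map each byte 0x00-0xFF to the code point with the same number."""
    # Latin-1 never fails to decode and round-trips losslessly,
    # but the result is only a label, not meaningful Unicode text.
    return raw_name.decode("latin-1")


raw = b"\x80"
label = latin1_label(raw)
print(repr(label))                     # '\x80'
print(label.encode("latin-1") == raw)  # True
```

The bytes route keeps the name untouched; the Latin-1 route gives a str that can be displayed or logged, on the understanding that it is a stand-in for the bytes rather than real text.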
See:

    $ mkdir /tmp/xyz
    $ touch /tmp/xyz/$'\x80'
    $ python3
    Python 3.3.2 (default, Dec  4 2014, 12:49:00)
    [GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.listdir('/tmp/xyz')
    ['\udc80']
    >>> open(os.listdir('/tmp/xyz')[0])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    FileNotFoundError: [Errno 2] No such file or directory: '\udc80'

File names encoded with Latin-X are quite commonplace even in UTF-8
locales.


Marko
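
A note on reading the session above: os.listdir() returns bare names, so open() on such a name looks in the current working directory. If the interpreter was started outside /tmp/xyz, as the transcript suggests, the same FileNotFoundError would appear for any file name. The '\udc80' itself is how Python 3 represents the undecodable byte 0x80 (the surrogateescape error handler from PEP 383), and it maps back to the original byte. A minimal sketch under those assumptions, with hypothetical helper names raw_bytes and open_listed:

```python
import os


def raw_bytes(name):
    """Recover the on-disk bytes behind a name returned by os.listdir()."""
    # Explicit codec arguments keep this independent of the locale;
    # os.fsencode() does the same job using the filesystem encoding.
    return name.encode("utf-8", "surrogateescape")


def open_listed(directory, name, mode="rb"):
    """Open an entry reported by os.listdir(directory)."""
    # os.listdir() returns bare names, so join them with the directory
    # before opening; a bare name resolves against the current directory.
    return open(os.path.join(directory, name), mode)


listed = "\udc80"          # the name shown in the session above
print(raw_bytes(listed))   # b'\x80'
print(raw_bytes(listed).decode("utf-8", "surrogateescape") == listed)  # True
```

With this, open_listed('/tmp/xyz', os.listdir('/tmp/xyz')[0]) would be the joined-path form of the open() call in the transcript.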