On Fri, Jan 06, 2017 at 02:54:49AM +0100, Victor Stinner wrote:

> Let's say that you have the filename b'nonascii\xff': it's decoded as
> 'nonascii\udcff' by the UTF-8 mode. How do GUIs handle such a filename?
> (I don't know the answer, it's a real question ;-))

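For anyone who wants to reproduce the decoding in question, here is a
minimal sketch in Python 3. It assumes the UTF-8 mode decodes filenames
with the surrogateescape error handler, in which case the undecodable
byte \xff becomes the lone surrogate U+DCFF. The display_name() helper
is hypothetical: it is just one way a program could make such a name
safe to show, and I am not claiming any GUI toolkit does it this way.

```python
# The raw filename bytes; \xff is not valid UTF-8.
raw = b'nonascii\xff-'

# surrogateescape maps each undecodable byte 0x80..0xFF to a lone
# surrogate in the range U+DC80..U+DCFF, so \xff becomes U+DCFF.
name = raw.decode('utf-8', 'surrogateescape')

# The mapping is reversible: encoding with the same handler restores
# the original bytes, so the file can still be opened by this name.
roundtrip = name.encode('utf-8', 'surrogateescape')


def display_name(text):
    """Hypothetical helper: swap escaped bytes for U+FFFD for display."""
    return ''.join(
        '\ufffd' if '\udc80' <= ch <= '\udcff' else ch
        for ch in text
    )


# ascii() is used because printing a lone surrogate directly may raise
# UnicodeEncodeError, depending on the terminal encoding.
print(ascii(name))
print(roundtrip == raw)
print(ascii(display_name(name)))
```

Decoding the bytes with errors='replace' gives the same display string
for this name, but it throws away the original byte, so the result can
no longer be turned back into the real filename.
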
I ran this in Python 2.7 to create the file:

    open(b'/tmp/nonascii\xff-', 'w')

and then confirmed the filename:

    [steve@ando tmp]$ ls -b nonascii*
    nonascii\377-

Konqueror in KDE 3 displays it with *two* "missing character" glyphs
(small hollow boxes) before the hyphen. The KDE "Open File" dialog box
shows the file with two blank spaces before the hyphen.

My interpretation of this is that the difference is due to using
different fonts: the file name is shown the same way, but in one font
the missing character is a small box and in the other it is a blank
space. I cannot tell what KDE is using for the invalid character: if I
copy it as text and paste it into a file, I just get the original \xFF.

The Geany text editor, which I think uses the same GUI toolkit as Gnome,
shows the file with a single "missing glyph" character, this time a
black diamond with a question mark in it. It looks like Geany (Gnome?)
is displaying the invalid byte as U+FFFD, the Unicode "REPLACEMENT
CHARACTER".

So at least two Linux GUI environments are capable of dealing with
filenames that are invalid UTF-8, in two different ways.

Does this answer your question about GUIs?

-- 
Steve
_______________________________________________
Python-ideas mailing list
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
