Atle Pedersen added the comment:
Just wanted to say thanks for the very fast and informative response.
I respect your decision to close the bug as invalid. But my five cents is that
it still feels like a bug, something that shouldn't happen. Especially since
it's part of a very basic
Antoine Pitrou added the comment:
The file tree contains a file which has an undecodable character in it. It ends
up mangled as specified in PEP 383.
Printing such filenames is not directly supported (since they have invalid
characters in them), but you can work around it in several ways, for ex
Ezio Melotti added the comment:
On Python 3, os.walk() uses the surrogateescape error handler. If the filename
is in e.g. iso-8859-* and the filesystem encoding is UTF-8, decoding b'\xe5'
will then result in '\udce5', and '\udce5' can't then be printed because it's a
lone surrogate.
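A minimal sketch of the behaviour described above. The filename and the UTF-8 filesystem encoding are assumptions made for illustration and are not taken from the report. It shows the undecodable byte turning into a lone surrogate, the strict encoder rejecting it, and two possible ways to handle the resulting name:

```python
# Sketch only: UTF-8 is assumed as the filesystem encoding, and the
# byte string below is a hypothetical Latin-1 encoded filename.
raw = b"bl\xe5.txt"

# With surrogateescape, the undecodable byte 0xE5 becomes the lone
# surrogate U+DCE5, as specified in PEP 383.
name = raw.decode("utf-8", "surrogateescape")

# A strict UTF-8 encoder refuses lone surrogates; this is the kind of
# error that surfaces when such a name is passed to print().
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Encoding with the same error handler restores the original bytes.
restored = name.encode("utf-8", "surrogateescape")

# For display, the surrogate can be replaced by a visible escape sequence.
printable = name.encode("utf-8", "backslashreplace").decode("utf-8")
print(printable)
```

The backslashreplace form is lossy for display only; the surrogateescape round trip is the one to use when the name must be passed back to the operating system.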
See also
New submission from Atle Pedersen:
I've made a short program to traverse a file tree and print file names.
for root, dirs, files in os.walk(path):
    for f in files:
        hex = ' '.join(["%02X" % ord(x) for x in f])
        print('file is', hex, f)
This fails with the followi