Martin v. Löwis wrote:
> gabor wrote:
>> 1. simply fix the documentation, and state that if the file-name cannot
>> be decoded into unicode, then it's returned as byte-string.
>
> For 2.5, this should be done. Contributions are welcome.
>
> [...then]
>> [os.path.join(path,n) for n in os.listdir(path)]
>>
>> will not work.
>>
>> 2. add support for some unicode-decoding flags, like i wrote before
>
> I may have missed something, but did you present a solution that would
> make the case above work?

if we use the same error-handling flags as the byte-string .decode()
method, then we could do:

    [os.path.join(path,n) for n in os.listdir(path,'ignore')]

or

    [os.path.join(path,n) for n in os.listdir(path,'replace')]

it's not an elegant solution, but i think it would solve most of the
problems.

>
>> 3. some solution.
>
> One approach I had been considering is to always make the decoding
> succeed, by using the private-use-area of Unicode to represent bytes
> that don't decode correctly.
>

hmm.. an interesting idea..

and what happens with such texts when they are encoded into, let's say,
utf-8? are the private-use-area characters ignored?

gabor
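
For reference, a minimal sketch of the two points raised above, written against present-day Python 3 rather than the 2.5 under discussion. os.listdir() takes no error-handler argument, so the proposed flags are only imitated here by a hypothetical helper, decode_name(), applied to made-up byte-string names. The utf-8 file-system encoding and the byte-to-private-use mapping (U+E000 plus the byte value) are likewise assumptions for illustration, not something specified in the thread.

```python
# Made-up raw names, standing in for what os.listdir() returns for a
# byte-string path. The second one is Latin-1 and is not valid utf-8.
raw_names = [b"plain.txt", b"caf\xe9.txt"]


def decode_name(raw, errors):
    """Hypothetical helper: decode a raw file name as utf-8 with an error handler."""
    return raw.decode("utf-8", errors)


# Imitates the proposed os.listdir(path, 'ignore') / os.listdir(path, 'replace').
ignored = [decode_name(n, "ignore") for n in raw_names]
replaced = [decode_name(n, "replace") for n in raw_names]

# Both handlers lose information: the decoded name does not encode back to
# the bytes that are actually on disk.
lossy = ignored[1].encode("utf-8") != raw_names[1]

# Assumed mapping for the private-use-area idea: undecodable byte 0xE9
# becomes the code point U+E000 + 0xE9.
pua_char = chr(0xE000 + 0xE9)

# A private-use character is an ordinary code point for the utf-8 codec:
# it is encoded (three bytes here), not ignored, and it round-trips.
pua_utf8 = pua_char.encode("utf-8")
round_trips = pua_utf8.decode("utf-8") == pua_char

print(ignored, replaced, lossy, pua_utf8, round_trips)
```

With 'ignore' or 'replace' the decoding never fails, but os.path.join(path, n) may then name a file that does not exist, since the original bytes cannot be recovered from the result. The private-use character, by contrast, passes through utf-8 encoding like any other character.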