Rod Person wrote:

> Hi,
>
> I'm working on a program that will walk a file system and clean the id3
> tags of mp3 and flac files, everything is working great until the
> following file is found
>
> '06 - Todd's Song (Post-Spiderland Song in Progress).flac'
>
> for some reason that I can't understand os.walk() returns this file
> name as
>
> '06 - Todd\xe2\x80\x99s Song (Post-Spiderland Song in Progress).flac'
>
> which then causes more hell than a little bit for me. I don't
> understand why apostrophe(') becomes \xe2\x80\x99, or what I can do
> about it.

>>> import unicodedata
>>> b"\xe2\x80\x99".decode("utf-8")
'’'
>>> unicodedata.name(_)
'RIGHT SINGLE QUOTATION MARK'

So the file name contains '’' (U+2019) rather than the ASCII apostrophe "'";
\xe2\x80\x99 is the UTF-8 encoding of that character.

> The script is Python 3, the file system it is running on is a hammer
> filesystem on DragonFlyBSD. The audio files reside on a QNAP NAS which
> runs some kind of Linux so it is probably ext3/4. The files came from
> various systems (Mac, Windows, FreeBSD).

There seems to be a mismatch between the assumed and the actual file system
encoding somewhere in this mix. Is this the only glitch, or are there similar
problems with other non-ASCII characters?
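
To check whether other file names are affected, something along these lines
might help. It is only a sketch: describe() and non_ascii_names() are names I
made up for illustration, and the example assumes the bytes on disk are UTF-8,
as they are for the file name quoted above. It also prints the encoding Python
assumes for file names, which is the first thing I would compare against what
the NAS actually uses.

```python
import os
import sys
import unicodedata


def describe(name):
    """Return (char, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in name
        if ord(ch) > 127
    ]


def non_ascii_names(top):
    """Yield (path, description) for every file under top whose name
    contains at least one non-ASCII character."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            found = describe(name)
            if found:
                yield os.path.join(dirpath, name), found


# The encoding Python assumes when converting file names to str.
print("file system encoding:", sys.getfilesystemencoding())

# The problematic name from the original post, decoded as UTF-8
# (an assumption that happens to hold for these particular bytes).
raw = b"06 - Todd\xe2\x80\x99s Song (Post-Spiderland Song in Progress).flac"
name = raw.decode("utf-8")
print(describe(name))
```

Running list(non_ascii_names("/path/to/music")) over the real tree (the path
is a placeholder) should show whether U+2019 is the only offender or whether
accented letters and the like are involved too.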