Re: os.walk the apostrophe and unicode

Peter Otten Sat, 24 Jun 2017 12:38:32 -0700

Rod Person wrote:

> Hi,
> 
> I'm working on a program that will walk a file system and clean the id3
> tags of mp3 and flac files, everything is working great until the
> following file is found
> 
> '06 - Todd's Song (Post-Spiderland Song in Progress).flac'
> 
> for some reason that I can't understand os.walk() returns this file
> name as
> 
> '06 - Todd\xe2\x80\x99s Song (Post-Spiderland Song in Progress).flac'
> 
> which then causes more hell than a little bit for me. I don't
> understand why apostrophe(') becomes \xe2\x80\x99, or what I can do
> about it.


>>> import unicodedata
>>> b"\xe2\x80\x99".decode("utf-8")
'’'
>>> unicodedata.name(_)
'RIGHT SINGLE QUOTATION MARK'

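So the name contains '’' (U+2019), whose UTF-8 encoding is the three bytes
\xe2\x80\x99, rather than the ASCII apostrophe "'" (U+0027).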
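
To make the difference visible, here is a small sketch (the variable names
are only for illustration) that prints the code point, Unicode name and
UTF-8 bytes of both characters:

```python
import unicodedata

ascii_apostrophe = "'"
# The three bytes from the reported file name, decoded as UTF-8.
curly_apostrophe = b"\xe2\x80\x99".decode("utf-8")

for ch in (ascii_apostrophe, curly_apostrophe):
    # Only ASCII is printed here, so this works on any terminal.
    print("U+%04X" % ord(ch), unicodedata.name(ch), ch.encode("utf-8"))

# The two characters look alike but never compare equal.
print(ascii_apostrophe == curly_apostrophe)
```

The ASCII apostrophe is a single byte in UTF-8, while the curly one takes
three, which is why it shows up as three escapes.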

> The script is Python 3, the file system it is running on is a hammer
> filesystem on DragonFlyBSD. The audio files reside on a QNAP NAS which
> runs some kind of Linux so it's probably ext3/4. The files came from
> various systems (Mac, Windows, FreeBSD).

There seems to be a mismatch between the assumed and the actual file system
encoding somewhere in this mix. Is this the only glitch or are there similar
problems with other non-ASCII characters?
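
A rough sketch of how one might investigate, under a few assumptions:
redecode(), has_non_ascii() and non_ascii_names() are hypothetical helpers,
not part of the standard library, and the idea that the name was decoded as
Latin-1 instead of UTF-8 is only a guess that happens to reproduce the
reported string.

```python
import os
import sys


def redecode(name, assumed="latin-1", actual="utf-8"):
    """Undo a wrong decode: recover the raw bytes, then decode them properly.

    Both encodings are assumptions; adjust them to what your system reports.
    """
    return name.encode(assumed).decode(actual)


def has_non_ascii(raw):
    """Return True if the bytes file name contains any non-ASCII byte."""
    return any(byte > 127 for byte in raw)


def non_ascii_names(root):
    """Yield (path, is_valid_utf8) for every file name with non-ASCII bytes.

    Passing bytes to os.walk() makes it return the raw, undecoded names.
    """
    for dirpath, _dirnames, filenames in os.walk(os.fsencode(root)):
        for raw in filenames:
            if has_non_ascii(raw):
                try:
                    raw.decode("utf-8")
                    valid = True
                except UnicodeDecodeError:
                    valid = False
                yield os.path.join(dirpath, raw), valid


# The encoding Python assumes for file names on this machine.
print(sys.getfilesystemencoding())

broken = "06 - Todd\xe2\x80\x99s Song (Post-Spiderland Song in Progress).flac"
# ascii() escapes non-ASCII characters so the output prints on any terminal.
print(ascii(redecode(broken)))
```

Looping over non_ascii_names() with the top of the music directory as the
argument would list every affected file together with a flag saying whether
its raw name is valid UTF-8, which should show whether the apostrophe is an
isolated case.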

-- 
https://mail.python.org/mailman/listinfo/python-list
