Cool! Thanks to both Iliya and Peter!

On May 6, 7:34 pm, Peter Otten <__pete...@web.de> wrote:
> coldpizza wrote:
> > Hello,
> >
> > I need to convert accented unicode chars in some audio files to
> > similarly-looking ascii chars. Looks like the following code seems to
> > work on windows:
> >
> > import os
> > import sys
> > import glob
> >
> > EXT = '*.*'
> >
> > lst_uni = glob.glob(unicode(EXT))
> >
> > os.system('chcp 437')
> > lst_asci = glob.glob(EXT)
> > print sys.stdout.encoding
> >
> > for i in range(len(lst_asci)):
> >     try:
> >         os.rename(lst_uni[i], lst_asci[i])
> >     except Exception as e:
> >         print e
> >
> > On windows it converts most of the accented chars from the latin1
> > encoding. This does not work in Linux since it uses 'chcp'.
> >
> > The questions are (1) *why* does it work on windows, and (2) what is
> > the proper and portable way to convert unicode characters to similarly
> > looking plain ascii chars?
> >
> > That is how to properly do this kind of conversion?
> > ü > u
> > é > e
> > â > a
> > ä > a
> > à > a
> > á > a
> > ç > c
> > ê > e
> > ë > e
> > è > e
> >
> > Is there any other way apart from creating my own char replacement
> > table?
>
> >>> from unicodedata import normalize
> >>> s = u"""ü > u
> ... é > e
> ... â > a
> ... ä > a
> ... à > a
> ... á > a
> ... ç > c
> ... ê > e
> ... ë > e
> ... è > e
> ... """
> >>> from unicodedata import normalize
> >>> print normalize("NFD", s).encode("ascii", "ignore")
> u > u
> e > e
> a > a
> a > a
> a > a
> a > a
> c > c
> e > e
> e > e
> e > e
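
For reference, here is a minimal sketch of how Peter's normalize() approach might be applied to file names. It assumes Python 3 (the quoted session is Python 2), and the names to_ascii and rename_to_ascii are hypothetical helpers, not anything from the thread. Note that characters with no canonical decomposition (for example "ø" or "ß") are simply dropped by the "ignore" error handler, so a replacement table may still be needed for those.

```python
import os
import unicodedata


def to_ascii(text):
    """Decompose accented characters (NFD), then drop every non-ASCII code point."""
    decomposed = unicodedata.normalize("NFD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def rename_to_ascii(directory="."):
    """Hypothetical helper: rename files in `directory` to ASCII-only names."""
    for name in os.listdir(directory):
        new_name = to_ascii(name)
        # Skip names that are already ASCII or that would become empty.
        if new_name == name or not new_name:
            continue
        target = os.path.join(directory, new_name)
        # Two different accented names can collapse to the same ASCII name.
        if os.path.exists(target):
            print("skipping %r: %r already exists" % (name, new_name))
            continue
        os.rename(os.path.join(directory, name), target)


print(to_ascii("üéâäàáçêëè"))  # ueaaaaceee
```

Because this does not depend on the console code page, it should behave the same on Windows and Linux, as long as the file names are read as Unicode strings.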
-- http://mail.python.org/mailman/listinfo/python-list