On Jul 2, 9:55 am, Jim <[EMAIL PROTECTED]> wrote:
> Peter Bulychev wrote:
> > I want to convert unicode character into ascii one.
>
> You have to make some arbitrary choices of what to translate. Based
> on some materials on effbot's site, and a recipe, I made
>   ftp://alan.smcvt.edu/hefferon/unicode2ascii.py
> which has at least some of what you are looking for.
>
> $ grep HYPHEN unicode2ascii.py
>     u'\N{SOFT HYPHEN}':u'-',
>     u'\N{HYPHEN}':u'-',
>     u'\N{NON-BREAKING HYPHEN}':u'-',
>     u'\N{SOFT HYPHEN}': '-',
>
> No doubt I have some terrible gaffes and some things missing.
> Corrections appreciated.

Comments on the above grep output:

1. You have SOFT HYPHEN twice, mapping it to u'-' and '-'.

2. The idea of a soft hyphen is as a hint to a hyphenator about where
   to insert a hyphen if one is necessary and the hyphenator is
   suspected of acting cluelessly without the hint. IMHO,
   asciification should substitute u'', not u'-'.

3. Read PEP 8. s/:/: /

Cheers,
John
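
A minimal sketch of points 1 and 2, assuming Python 3. The names HYPHEN_MAP, TABLE and asciify_hyphens are hypothetical, made up for illustration; this is not the contents of unicode2ascii.py. It shows the soft hyphen mapped to the empty string, and why a key repeated in a dict literal goes unnoticed.

```python
# Hypothetical table for illustration, not the real unicode2ascii.py.
HYPHEN_MAP = {
    "\N{SOFT HYPHEN}": "",             # invisible hint, so drop it
    "\N{HYPHEN}": "-",
    "\N{NON-BREAKING HYPHEN}": "-",
}

# str.translate expects code points (ints) as keys.
TABLE = {ord(char): repl for char, repl in HYPHEN_MAP.items()}


def asciify_hyphens(text):
    """Replace the hyphen-like characters above; leave the rest alone."""
    return text.translate(TABLE)


# A key repeated in a dict literal is not an error: the last value
# silently wins, so the duplicate entry is easy to miss.
duplicate = {"\N{SOFT HYPHEN}": "-", "\N{SOFT HYPHEN}": ""}
```

Note that this only handles the three characters from the grep output; anything else outside ASCII passes through unchanged.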