Re: convert unicode characters to visibly similar ascii characters

Terry Reedy Tue, 01 Jul 2008 12:48:31 -0700


Peter Bulychev wrote:

Hello.

I want to convert Unicode characters into ASCII ones.
The method `.encode('ASCII')` can convert only those Unicode characters that fit into the 0..127 range.
But there are still lots of characters beyond this range that can be manually converted to visually similar ASCII characters. For instance, there are several quotation marks in Unicode that can be converted into the ASCII quotation mark.
Can this conversion be performed automatically? After googling, I've only found that there exists the Unicode database, which stores human-readable information on the notation of all Unicode characters (ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt). And there also exists a Python adapter for this database (http://docs.python.org/lib/module-unicodedata.html). Using this database I can do something like `if notation.find('QUOTATION') != -1:\n\treturn "'"`. I believe there is a more elegant way. Am I right?

I believe you will have to make up your own translation dictionary for the translations *you* want. You should then be able to use that with the .translate() method.
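
A minimal sketch of that idea, assuming Python 3, where `str.translate()` accepts a dictionary keyed by code point. The table `ASCII_LOOKALIKES` and the helper `to_ascii_lookalike` are hypothetical names, and the entries are only an illustration; you would fill the table with whichever substitutions you consider visually similar.

```python
# Hypothetical translation table: keys are Unicode code points, values are
# the ASCII replacements chosen by hand. The selection here is illustrative.
ASCII_LOOKALIKES = {
    0x2018: "'",    # LEFT SINGLE QUOTATION MARK
    0x2019: "'",    # RIGHT SINGLE QUOTATION MARK
    0x201C: '"',    # LEFT DOUBLE QUOTATION MARK
    0x201D: '"',    # RIGHT DOUBLE QUOTATION MARK
    0x2013: "-",    # EN DASH
    0x2014: "-",    # EM DASH
    0x2026: "...",  # HORIZONTAL ELLIPSIS
}


def to_ascii_lookalike(text):
    """Apply the table, then replace any remaining non-ASCII character with '?'."""
    translated = text.translate(ASCII_LOOKALIKES)
    return translated.encode("ascii", "replace").decode("ascii")


print(to_ascii_lookalike("\u201cHi\u201d \u2013 it\u2019s\u2026"))
```

Characters missing from the table fall through to the `"replace"` error handler and come out as `?`, which makes gaps in the dictionary easy to spot.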


tjr

--
http://mail.python.org/mailman/listinfo/python-list
