Re: utf - string translation

Fredrik Lundh Wed, 29 Nov 2006 14:13:48 -0800

John Machin wrote:

> Another point: there are many non-latin1 characters that could be
> mapped to ASCII. For example:
>     u"\u0141ukasziewicz".translate(unaccented_map())
> doesn't work unless an entry is added to the no-decomposition table:
>     0x0141: u"L", # LATIN CAPITAL LETTER L WITH STROKE
> 
> It looks like generating extra entries like that could be done, with
> the aid of unicodedata.name():
> 
> LATIN CAPITAL LETTER X WITH blahblah -> "X"
> LATIN SMALL LETTER X WITH blahblah -> "X".lower()
> 
> This would require a fair bit of care -- obviously there are special
> cases like LATIN CAPITAL LETTER O WITH STROKE. Eyeballing by regional
> experts is probably required.


see the comments over at

     http://effbot.org/zone/unicode-convert.htm

for an extended table, eyeballed by a regional expert (and since he 
makes the same point about OE vs Oe as you do, I'll probably have to 
change the code ;-)
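
for what it's worth, the name-based idea quoted above can be roughed out in a few lines. this is only a sketch, not the code from the page linked above: the helper name guess_base_letter, the regular expression, and the choice to scan only U+0100..U+017F are assumptions made for illustration, and it blindly maps special cases such as O WITH STROKE to a plain letter, which is exactly the kind of entry that still needs eyeballing.

```python
import re
import unicodedata

# Hypothetical helper (an assumption, not taken from the effbot recipe):
# derive a base letter from names shaped like
# "LATIN CAPITAL LETTER X WITH ..." or "LATIN SMALL LETTER X WITH ...".
_NAME_RE = re.compile(r"LATIN (CAPITAL|SMALL) LETTER ([A-Z]) WITH ")

def guess_base_letter(char):
    """Return a one-letter ASCII guess for char, or None if its name does not fit."""
    name = unicodedata.name(char, "")
    match = _NAME_RE.match(name)
    if match is None:
        return None
    case, letter = match.groups()
    return letter if case == "CAPITAL" else letter.lower()

# Collect candidate table entries for Latin Extended-A (U+0100..U+017F).
# The range is an arbitrary choice for this sketch.
extra = {}
for code in range(0x0100, 0x0180):
    base = guess_base_letter(chr(code))
    if base is not None:
        extra[code] = base

# Candidates still need review by someone who knows the languages involved.
print(extra[0x0141])
print(u"\u0141ukasziewicz".translate(extra))
```

ligatures such as U+0152 are skipped on purpose, since their names do not follow the single-letter pattern and the right replacement (OE vs Oe) depends on context.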

</F>

-- 
http://mail.python.org/mailman/listinfo/python-list