On Thu, 01 Oct 2009 08:10:58 -0700, Walter Dörwald <wal...@livinglogic.de>
wrote:
On 01.10.09 16:09, Hyuga wrote:
On Sep 30, 3:34 am, gentlestone <tibor.b...@hotmail.com> wrote:
Why doesn't this code work on Python 2.6? Or how can I do this job?
[snip _MAP]
def downcode(name):
    """
    >>> downcode(u"Žabovitá zmiešaná kaša")
    u'Zabovita zmiesana kasa'
    """
    for key, value in _MAP.iteritems():
        name = name.replace(key, value)
    return name
Though CPython is pretty optimized under the hood for this sort of
single-character replacement, this still seems pretty inefficient
since you're calling replace for every character you want to map. I
think that a better approach might be something like:
def downcode(name):
    return ''.join(_MAP.get(c, c) for c in name)
Or using string.translate:
import string

def downcode(name):
    table = string.maketrans(
        'ÀÁÂÃÄÅ...',
        'AAAAAA...')
    return name.translate(table)
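
One caveat, as far as I know: in Python 2, string.maketrans builds a 256-byte table from two byte strings of equal length, so it does not fit unicode input such as u"Žabovitá". The unicode translate() method takes a dict keyed by code point instead. A minimal sketch of that variant follows; the character lists are hypothetical stand-ins for the elided ones, and the u"" literals assume Python 2 or Python 3.3+:

```python
# -*- coding: utf-8 -*-
# Hypothetical, abbreviated character lists for illustration only;
# a real table would cover every character in the original _MAP.
ACCENTED = u"ÀÁÂÃÄÅŽžáš"
PLAIN = u"AAAAAAZzas"

# Unicode translate() expects a mapping of code point -> replacement.
TABLE = dict((ord(src), dst) for src, dst in zip(ACCENTED, PLAIN))


def downcode(name):
    # Characters missing from TABLE pass through unchanged.
    return name.translate(TABLE)
```

Building the table once at module level avoids recreating it on every call.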
Or even simpler:
import unicodedata

def downcode(name):
    return unicodedata.normalize("NFD", name)\
        .encode("ascii", "ignore")\
        .decode("ascii")
Servus,
Walter
As I understand it, the "ignore" argument to str.encode *removes* the
characters that cannot be encoded, rather than replacing them with an
ASCII approximation. Is that correct? If so, wouldn't that rather defeat
the purpose?
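
For what it's worth, a quick way to check is to run the pipeline on a couple of inputs. The sketch below assumes the same downcode() as quoted above; the variable names and the second sample word are my own additions. "ignore" does drop whatever cannot be encoded, but after NFD normalization an accented letter is a plain base letter followed by a combining mark, so only the mark is dropped. Letters with no canonical decomposition (such as "Ł") disappear entirely.

```python
# -*- coding: utf-8 -*-
import unicodedata


def downcode(name):
    # NFD splits e.g. "Ž" into "Z" + U+030C (combining caron); encoding
    # with "ignore" then drops every non-ASCII code point, marks included.
    decomposed = unicodedata.normalize("NFD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Accented letters decompose, so their base letters survive.
accented = downcode(u"Žabovitá zmiešaná kaša")

# "Ł" has no canonical decomposition, so it is removed outright,
# while "ó" and "ź" are reduced to "o" and "z".
no_decomposition = downcode(u"Łódź")
```

So the approach works for characters that are a base letter plus a diacritic, but an explicit mapping is still needed for anything else.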
--
Rami Chowdhury