Re: small inconsistency in ElementTree (1.2.6)

Damjan Sat, 10 Dec 2005 08:21:50 -0800

>>> ascii strings and unicode strings are perfectly interchangeable, with
>>> some minor exceptions.
>>
>> It's not only translate, it's decode too...
>
> why would you use decode on the strings you get back from ET ?


Long story... Some time ago, when computers didn't support charsets,
people invented so-called "cyrillic fonts", i.e. fonts that have
cyrillic glyphs mapped onto the latin positions. Since our cyrillic
alphabet has 31 characters, some characters in those fonts were mapped
to { or ~ etc. Of course this "solution" is awful, but it was the only
one at the time.

So I'm writing a Python script that takes an OpenDocument file and
translates it to UTF-8...
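
A minimal sketch of that kind of conversion, written for Python 3
rather than the Python 2 of this thread. The mapping table is
hypothetical and only covers four characters; every real "cyrillic
font" had its own layout, so the table has to be built from the font
actually used. The names LATIN_TO_CYRILLIC and to_real_cyrillic are
made up for the illustration.

```python
# Hypothetical table: latin-position character -> real cyrillic letter.
# Keys are code points, which is what str.translate expects.
LATIN_TO_CYRILLIC = {
    ord("a"): "\u0430",  # CYRILLIC SMALL LETTER A
    ord("b"): "\u0431",  # CYRILLIC SMALL LETTER BE
    ord("{"): "\u0448",  # CYRILLIC SMALL LETTER SHA
    ord("~"): "\u0447",  # CYRILLIC SMALL LETTER CHE
}


def to_real_cyrillic(text):
    """Replace font-encoded characters with real cyrillic letters.

    Characters missing from the table are passed through unchanged.
    """
    return text.translate(LATIN_TO_CYRILLIC)


converted = to_real_cyrillic("ab{~")
utf8_bytes = converted.encode("utf-8")  # bytes ready to write out
```

Text extracted from the document (for example the .text of each
element) would be passed through to_real_cyrillic before being
written back out as UTF-8.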

PS. I use translate now, but I was making a general note that unicode
and string objects are not 100% interchangeable. translate, encode and
decode are especially problematic.
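
To illustrate the translate case: the thread is about Python 2's str
and unicode types, and the sketch below uses Python 3's bytes and str
as a rough analogue, which is an assumption and not the exact
situation above. The point it shows is that the two types want
different kinds of translation table, so code written for one does not
automatically work with the other.

```python
# Byte strings: translate() takes a 256-byte lookup table.
byte_table = bytes.maketrans(b"{~", b"[^")
bytes_result = b"a{~".translate(byte_table)

# Text strings: translate() takes a mapping keyed by code point.
text_table = {ord("{"): "[", ord("~"): "^"}
text_result = "a{~".translate(text_table)

# Passing the text-style mapping to the bytes method is rejected.
try:
    b"a{~".translate(text_table)
    dict_accepted_by_bytes = True
except TypeError:
    dict_accepted_by_bytes = False
```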

Anyway, I wrap the output of ET in unicode() now... I don't see
another, better solution.

