Dear all, could somebody please just put an end to the Unicode misery I'm in, man... The situation is that I have a Tkinter program that lets the user enter data in some Entries, and this data needs to be converted to an encoding compatible with an .rtf file. In fact I only need to handle some of the usual symbols like ë etc.
Here's the function that I am using:

    def pythonUnicodeToRTFAscii(self,s):
        if isinstance(s,str):
            return s
        s_str=repr(s.encode('UTF-8'))
        replDic={'\xc3\xa0':"\\'e0",'\xc3\xa4':"\\'e4",'\xc3\xa1':"\\'e1",
                 '\xc3\xa8':"\\'e8",'\xc3\xab':"\\'eb",'\xc3\xa9':"\\'e9",
                 '\xc3\xb2':"\\'f2",'\xc3\xb6':"\\'f6",'\xc3\xb3':"\\'f3",
                 '\xe2\x82\xac':"\\'80"}
        for k in replDic.keys():
            if repr(k) in s_str:
                s_str=s_str.replace(repr(k),replDic[k])
        return s_str

So replDic represents the mapping from one encoding to the other. Now, if I enter e.g. 'Arjën' in the Entry, then s_str in the above function becomes 'Arj\xc3\xabn', and since replDic contains the key \xc3\xab I would expect the replacement in the final lines of the function to kick in. This, however, doesn't happen: there's no match. Interactively, however:

    >>> '\xc3\xab' in 'Arj\xc3\xabn'
    True

I just don't get it, what's the difference? Is the above even the best way to attack such a problem?

Thanks & best wishes,
Kees
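A possible explanation, offered as a sketch and not as a definitive answer: repr() returns the text of a literal, quotes included, so repr(k) starts and ends with a quote character that never appears in the middle of s_str. The snippet below reproduces that effect and then shows one alternative that avoids repr() altogether. It assumes Python 3 syntax (the function in the post is Python 2), assumes the RTF file declares the Windows-1252 code page, and uses a hypothetical helper name, to_rtf_ascii, that does not come from the post.

```python
# Sketch only: Python 3 syntax, cp1252 assumed as the RTF code page.
# to_rtf_ascii is a hypothetical name, not part of the original program.

def to_rtf_ascii(text, codepage="cp1252"):
    """Replace every non-ASCII character with an RTF \\'xx escape."""
    pieces = []
    for ch in text:
        if ord(ch) < 128:
            pieces.append(ch)  # plain ASCII passes through unchanged
        else:
            # Raises UnicodeEncodeError for characters outside the code page.
            for byte in ch.encode(codepage):
                pieces.append("\\'%02x" % byte)
    return "".join(pieces)


# Why the lookup in the post may find nothing: both sides go through repr(),
# and repr() wraps its result in quotes (plus a b prefix in Python 3).
s_repr = repr("Arj\u00ebn".encode("utf-8"))  # the text b'Arj\xc3\xabn'
k_repr = repr(b"\xc3\xab")                   # the text b'\xc3\xab'

found_with_quotes = k_repr in s_repr           # quotes get in the way
found_without_quotes = k_repr[2:-1] in s_repr  # strip b' and the closing '

rtf_name = to_rtf_ascii("Arj\u00ebn")
rtf_euro = to_rtf_ascii("\u20ac")
```

With these assumptions, to_rtf_ascii needs no replacement table: the escapes it produces for ë and € are the same ones listed in replDic, because those values are the cp1252 byte values of the characters.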