> s_str=repr(s.encode('UTF-8'))

It would be easier to encode this in cp1252 here, as this is apparently
the encoding that you want to use in the RTF file, too. You could then
loop over the string, replacing all bytes >= 128 with \\'%.2x
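
A minimal sketch of that cp1252 loop, in case it helps. It is written in Python 3 syntax, where iterating over a bytes object yields integers. The function name rtf_escape_cp1252 is made up for illustration, and the sketch only handles the non-ASCII bytes; it does not escape backslashes or braces, which RTF also treats specially.

```python
def rtf_escape_cp1252(text):
    """Hypothetical helper: escape non-ASCII characters as RTF \\'xx codes."""
    # Encode to cp1252 first; characters that cp1252 cannot represent
    # raise UnicodeEncodeError here.
    data = text.encode('cp1252')
    parts = []
    for byte in data:  # each item is an int in Python 3
        if byte >= 128:
            # Two-digit lowercase hex, e.g. 0xEB -> \'eb
            parts.append("\\'%.2x" % byte)
        else:
            parts.append(chr(byte))
    return ''.join(parts)


print(rtf_escape_cp1252('Arj\xebn'))  # prints Arj\'ebn
```

The euro sign comes out as \'80 this way, which matches the last entry of the replacement dictionary quoted below, since 0x80 is its cp1252 code.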

As yet another alternative, you could create a Unicode error handler
(call it 'rtf'), and then do

    return s.encode('ascii', errors='rtf')

> replDic={'\xc3\xa0':"\\'e0",'\xc3\xa4':"\\'e4",'\xc3\xa1':"\\'e1",
>          '\xc3\xa8':"\\'e8",'\xc3\xab':"\\'eb",'\xc3\xa9':"\\'e9",
>          '\xc3\xb2':"\\'f2",'\xc3\xb6':"\\'f6",'\xc3\xb3':"\\'f3",
>          '\xe2\x82\xac':"\\'80"}
> for k in replDic.keys():
>     if repr(k) in s_str:
>         s_str=s_str.replace(repr(k),replDic[k])
> return s_str
>
> However interactive:
>
> >>> '\xc3\xab' in 'Arj\xc3\xabn'
> True
>
> I just don't get it, what's the difference?

It's the repr():

py> '\xc3\xab' in 'Arj\xc3\xabn'
True
py> repr('\xc3\xab') in repr('Arj\xc3\xabn')
False
py> repr('\xc3\xab')
"'\\xc3\\xab'"
py> repr('Arj\xc3\xabn')
"'Arj\\xc3\\xabn'"

repr('\xc3\xab') starts with an apostrophe, which doesn't appear
before the \\xc3 in repr('Arj\xc3\xabn').

HTH,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list
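
For the 'rtf' error handler suggested above, a rough sketch could look like the following. It uses codecs.register_error from the standard library and is written in Python 3 syntax. The handler name rtf_errors and the wrapper to_rtf are made up for illustration, and the choice of cp1252 for the escape codes is an assumption carried over from the earlier suggestion.

```python
import codecs


def rtf_errors(exc):
    """Hypothetical error handler: emit RTF \\'xx escapes for non-ASCII text."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # The slice of the input that the ascii codec could not encode.
    bad = exc.object[exc.start:exc.end]
    # Assumption: the RTF file uses cp1252, so take the byte values from there.
    # Characters outside cp1252 raise UnicodeEncodeError at this point.
    replacement = ''.join("\\'%.2x" % b for b in bad.encode('cp1252'))
    # Return the replacement text and the position where encoding resumes.
    return replacement, exc.end


codecs.register_error('rtf', rtf_errors)


def to_rtf(s):
    # Hypothetical wrapper around the call suggested in the post.
    return s.encode('ascii', errors='rtf')


print(to_rtf('Arj\xebn'))  # prints b"Arj\\'ebn"
```

With this approach the ASCII characters pass through the codec untouched, and only the characters that fail to encode are routed through the handler, so no replacement dictionary is needed.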