Dirk Hagemann wrote:

> When I receive data from Microsoft Active Directory it is an
> "ad_object" and has the type unicode. When I try to convert it to a
> string I get this error:
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in
> position 26: ordinal not in range(128)
>
> This is caused by characters like the german ä, ö or ü.
>
> But I (think I) need this as a string. Is there a simple solution???

A Unicode string is also a string. If you want an 8-bit string, you need to
decide what encoding you want to use. Common encodings are us-ascii (which
is the default if you convert from unicode to 8-bit strings in Python),
ISO-8859-1 (aka Latin-1), and UTF-8.

For example, if you want Latin-1 strings, you can use one of:

    s = u.encode("iso-8859-1") # fail if some character cannot be converted
    s = u.encode("iso-8859-1", "replace") # instead of failing, replace with ?
    s = u.encode("iso-8859-1", "ignore") # instead of failing, leave it out

If you want an ascii string, replace "iso-8859-1" above with "ascii".

If you want to output the data to a web browser or an XML file, you can use

    import cgi
    s = cgi.escape(u).encode("ascii", "xmlcharrefreplace")

</F>
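
To see what each error handler does to a concrete value, here is a minimal sketch. The sample string is made up for illustration (it is not the poster's actual Active Directory data), and the variable names other than `u` are hypothetical. The byte values in the comments apply to this sample only.

```python
# Hypothetical sample value standing in for the Active Directory field.
# U+00FC (the German u-umlaut) is written as an escape so the source
# file itself stays pure ASCII.
u = u"M\xfcller"

# Latin-1 maps U+00FC to the single byte 0xFC.
latin1 = u.encode("iso-8859-1")

# UTF-8 needs two bytes (0xC3 0xBC) for the same character.
utf8 = u.encode("utf-8")

# ASCII cannot represent U+00FC, so the error handler decides the result.
replaced = u.encode("ascii", "replace")            # b"M?ller"
ignored = u.encode("ascii", "ignore")              # b"Mller"
charrefs = u.encode("ascii", "xmlcharrefreplace")  # b"M&#252;ller"

# With no handler given, encoding is strict and raises the error
# quoted in the question.
try:
    u.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

Note that "replace" and "ignore" lose information, while "xmlcharrefreplace" keeps it in a form a browser or XML parser can turn back into the original character. On Python 3 the same calls return bytes objects rather than 8-bit str.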