A simple unicode question. How do I print? Sample code:

    # -*- coding: utf-8 -*-
    s1 = u"héllô wórld"
    print s1
    # Gives UnicodeEncodeError: 'ascii' codec can't encode character
    # u'\xe9' in position 1: ordinal not in range(128)

What I actually want to do is slightly more elaborate: read from a text file which is in utf-8, do some manipulations of the text and print the result on stdout. I understand I must open the file with

    f = codecs.open("input.txt", "r", "utf-8")

but then I get stuck as above.

I tried

    s2 = s1.encode("utf-8")
    print s2

but got

    héllô wórld

Then, in the hope of being able to write the string to a file if not to stdout, I also tried

    import codecs
    f = codecs.open("out.txt", "w", "utf-8")
    f.write(s2)

but got

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

So I seem to be stuck.

I have checked several online python+unicode pages, including

    http://boodebr.org/main/python/all-about-python-and-unicode#WHYNOPRINT
    http://evanjones.ca/python-utf8.html
    http://www.reportlab.com/i18n/python_unicode_tutorial.html
    http://www.amk.ca/python/howto/unicode
    http://www.example-code.com/python/python-charset.asp
    http://docs.python.org/lib/csv-examples.html

but none of them was sufficient to make me understand how to deal with this simple problem. I'm sure it's easy, maybe too easy to be worth explaining in a tutorial...

Help gratefully received.
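
For reference, a minimal sketch of the round trip described above (unicode text in, manipulation, UTF-8 file out and back). It is an illustration under assumptions, not a definitive answer: io.open is swapped in for codecs.open so the same code runs on Python 2.6+ and Python 3, and the names manipulate, write_then_read and out_path are hypothetical.

```python
import io
import os
import tempfile

# Same text as s1 in the post, spelled with escapes so the source file's
# own encoding does not matter.
s1 = u"h\xe9ll\xf4 w\xf3rld"

# encode() turns the text into a UTF-8 byte string: 11 characters, 14 bytes.
s2 = s1.encode("utf-8")


def manipulate(text):
    """Hypothetical stand-in for 'some manipulations of the text'."""
    return text.upper()


def write_then_read(text, path):
    """Write unicode text as UTF-8, then read it back as unicode."""
    # Pass the unicode string (not the encoded bytes): the file object
    # does the encoding itself.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)
    # On reading, the file object decodes the UTF-8 bytes again.
    with io.open(path, "r", encoding="utf-8") as f:
        return f.read()


# Hypothetical output location in a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
result = write_then_read(manipulate(s1), out_path)
```

If this reading is right, the f.write(s2) failure would come from handing already-encoded bytes to a writer that expects unicode, so that Python 2 first tries to decode them as ASCII; writing s1 itself would be the counterpart of the sketch. The print s1 error looks like the same implicit ASCII step, this time on the way out to stdout.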