My headache is growing while playing around with Unicode in Python, so please help this novice. I have divided my problem into a few questions.
Python 2.3.4 (#1, Feb 2 2005, 12:11:53)
[GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2

1) Does

>>> print 'hello'

simply write to sys.stdout?

2) Exactly what does the following line return?

>>> sys.stdout.encoding
'ISO-8859-1'

Is it the encoding of the terminal? I think not, because when I change the encoding in my terminal the result is still the same.

Is it the encoding of the string Python "hands over" to the terminal? I think not. In the following code I am pretty confident that the second command changes that, and still sys.stdout.encoding has the same value.

>>> import sys,codecs
>>> sys.stdout.encoding
'ISO-8859-1'
>>> sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
>>> sys.stdout.encoding
'ISO-8859-1'

Then what?

3) Does raw_input() read from sys.stdin?

4) The following script is not working. Can you please tell me how to do it right?

>>> import codecs,sys
>>> sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
>>> sys.stdin = codecs.getreader('utf-8')(sys.stdin)
>>> x = raw_input('write this unicode letter, Turkish che, unicode 0x00E7\t')
write this unicode letter, Turkish che, unicode 0x00E7  ç
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/codecs.py", line 295, in readline
    return self.decode(line, self.errors)[0]
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-1: unexpected end of data

When prompted, I simply enter the che with my Turkish keyboard layout.

velle, Denmark
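The traceback in question 4 can be reproduced without a terminal. The following is only a minimal sketch, and it rests on an assumption the post does not confirm: that the terminal sends the che as ISO-8859-1 (the value reported by sys.stdout.encoding), i.e. as the single byte 0xE7. The helper name decodes_as_utf8 is made up for illustration.

```python
# Assumption: the terminal delivers ISO-8859-1 bytes, as sys.stdout.encoding
# suggests. Nothing here touches the real sys.stdin or sys.stdout.
che = u'\u00e7'                           # LATIN SMALL LETTER C WITH CEDILLA

latin1_bytes = che.encode('iso-8859-1')   # one byte: 0xE7
utf8_bytes = che.encode('utf-8')          # two bytes: 0xC3 0xA7


def decodes_as_utf8(raw):
    """Return True if raw is valid UTF-8, False otherwise (hypothetical helper)."""
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


# 0xE7 on its own is the lead byte of a three-byte UTF-8 sequence, so a
# UTF-8 decoder rejects it; the same byte decodes cleanly as ISO-8859-1.
latin1_ok = decodes_as_utf8(latin1_bytes)
utf8_ok = decodes_as_utf8(utf8_bytes)
roundtrip = latin1_bytes.decode('iso-8859-1') == che

print(repr((latin1_ok, utf8_ok, roundtrip)))
```

If that assumption holds, wrapping sys.stdin in codecs.getreader('utf-8') would fail in just this way, while a reader for the encoding the terminal actually uses would not.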