> So why is it that in the first case I got UnicodeEncodeError: 'ascii'
> codec can't encode? Seems as if, within Idle, a utf-8 codec is being
> selected automagically... why should that be so there and not in the
> first case?

I'm a bit confused about what you did when. The error appears if you try
to output a unicode object without encoding it first - then the default
encoding (ascii) is used.

>>> Then, in the hope of being able to write the string to a file if not to
>>> stdout, I also tried
>>>
>>>
>>> import codecs
>>> f = codecs.open("out.txt", "w", "utf-8")
>>> f.write(s2)
>>>
>>> but got
>>>
>>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1:
>>> ordinal not in range(128)
>>
>> Instead of writing s2 (which is a byte-string!!!), write s1. It will
>> work.
>
> OK, many thanks, I got this to work!
>
>> The error you get stems from f.write wanting a unicode-object, but s2 is
>> a bytestring (you explicitly converted it before), so python tries to
>> decode the bytestring with the default encoding - ascii - to a unicode
>> string. This of course fails.
>
> I think I have a better understanding of it now. If the terminal hadn't
> fooled me, I probably wouldn't have assumed that the code I originally
> wrote (following the first examples I found) was wrong! I assume that
> when you say "bytestring" you mean "a string of bytes in a certain
> encoding (here utf-8) that can be used as an external representation for
> the unicode string which is instead a sequence of code points".

Yes. That is exactly the difference.

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list
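
For reference, here is a minimal sketch of the two failure modes discussed in this thread, with the conversions written out explicitly. The sample string and the temporary output path are assumptions for illustration, not the original poster's data. In Python 2 the same ascii conversions happen implicitly, which is what produced the errors quoted in this thread.

```python
import codecs
import os
import tempfile

# Hypothetical sample text; the original poster's string is not shown in the thread.
s1 = u"caf\u00e9"            # unicode object: a sequence of code points
s2 = s1.encode("utf-8")      # byte string: the UTF-8 external representation

# Encoding text with the ascii codec fails on any non-ASCII code point.
try:
    s1.encode("ascii")
except UnicodeEncodeError as exc:
    encode_error = str(exc)

# Decoding UTF-8 bytes with the ascii codec fails on the first byte >= 0x80.
try:
    s2.decode("ascii")
except UnicodeDecodeError as exc:
    decode_error = str(exc)

# Writing the unicode object through a utf-8 writer works; the codec
# does the encoding on the way out.
path = os.path.join(tempfile.mkdtemp(), "out.txt")  # assumed scratch location
with codecs.open(path, "w", "utf-8") as f:
    f.write(s1)

# The bytes on disk are the same UTF-8 representation as s2.
with open(path, "rb") as f:
    written = f.read()
```

The point of the sketch is only that s1 and s2 are different kinds of object: s1 is what a codecs writer expects, and s2 is what ends up in the file.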