I think it helped me very much to understand the problem. So if I deal with non-ASCII strings, I have a 'list of bytes' and need an encoding to interpret this list and transform it into a meaningful unicode string; that is decoding. Encoding does the opposite: it turns the unicode string back into a 'list of bytes'.
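To make the two directions concrete, here is a minimal sketch, assuming UTF-8 as the encoding. The variable names and the temporary file are only for illustration; `io.open` is used because it takes an `encoding` argument on both Python 2.6+ and Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Bytes as they might arrive from outside the program (UTF-8 for u'Müller').
raw = b'M\xc3\xbcller'

# Decode: 'list of bytes' -> unicode string (7 bytes become 6 characters).
name = raw.decode('utf-8')

# Inside the program, work with unicode strings only.
greeting = u'My name is %s' % name

# Encode: unicode string -> 'list of bytes' for the outside world.
encoded = greeting.encode('utf-8')

# The same thing at a file border: io.open encodes on write.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(greeting)
    # Read the file back in binary mode to see the bytes actually stored.
    with io.open(path, 'rb') as f:
        on_disk = f.read()
finally:
    os.remove(path)
```

Here `on_disk` holds the same bytes as `encoded`, so passing `encoding='utf-8'` to `io.open` performs the same conversion as the explicit `.encode('utf-8')` call.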
Whenever I 'cross the border' of my program, I have to decode the incoming 'list of bytes' to a unicode string, or encode the unicode string to a 'list of bytes' which is meaningful to the world outside. So 'decode early, encode late' means doing it as near to the border as possible, and to decode/encode I need a coding system, for example 'utf8'. That means there should be a way to encode/decode at every interface I can use: files, stdin, stdout, stderr, GUI (these should be the most important ones).

While trying to understand this, I wrote the following program. Maybe someone can give me a hint on how to print correctly:

######################################################
#! python
# -*- coding: utf-8 -*-

class EncTest:
    def __init__(self, Name=None):
        self.Name = unicode(Name, encoding='utf8')

    def __repr__(self):
        return u'My name is %s' % self.Name

if __name__ == '__main__':
    a = EncTest('Müller')

    # this does work
    print a.__repr__()

    # throws an error if default encoding is ascii
    # but works if default encoding is utf8
    print a

    # throws an error because a is not a string
    print unicode(a, encoding='utf8')
######################################################

Wolfgang