anuraguni...@yahoo.com <anuraguni...@yahoo.com> wrote:
>  First of all thanks everybody for putting time with my confusing post
>  and I apologize for not being clear after so many efforts.
>
>  here is my last try (you are free to ignore my request for free
>  advice)
>
>  # -*- coding: utf-8 -*-
>
>  class A(object):
>
>      def __unicode__(self):
>          return u"©au"
>
>      def __repr__(self):
>          return unicode(self).encode("utf-8")
>
>      __str__ = __repr__
>
>  a = A()
>  u1 = unicode(a)
>  u2 = unicode([a])
>
>  now I am not using print so that doesn't matter stdout can print
>  unicode or not
>  my naive question is line u2 = unicode([a]) throws
>  UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position
>  1: ordinal not in range(128)
>
>  shouldn't list class call unicode on its elements?

You mean when you call unicode(a_list) it should call unicode() on each
of the elements to build the result?

Yes, that does seem sensible. However, list doesn't have a __unicode__
method at all, so I guess it is falling back to str() on the list, which
uses __repr__ (here the same as __str__) on each element, and then
decoding the resulting byte string as ASCII. That explains your problem
exactly.

If you try your example on python 3 then you don't need the
__unicode__ method at all (all strings are unicode) and you won't have
the problem, I predict. (I haven't got a python 3 in front of me at the
moment to test.)

So I doubt you'll find the momentum to fix this, since unicode and str
integration was the main focus of python 3, but you could report a
bug. If you attach a patch to fix it, so much the better!

Here is my demonstration of the problem with python 2.5.2

>>> class A(object):
...     def __unicode__(self):
...         return u"\N{COPYRIGHT SIGN}au"
...     def __repr__(self):
...         return unicode(self).encode("utf-8")
...     __str__ = __repr__
...
>>> a = A()
>>> str(a)
'\xc2\xa9au'
>>> repr(a)
'\xc2\xa9au'
>>> unicode(a)
u'\xa9au'
>>> L=[a]
>>> str(L)
'[\xc2\xa9au]'
>>> repr(L)
'[\xc2\xa9au]'
>>> unicode(L)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1: ordinal not in range(128)
>>> unicode('[\xc2\xa9au]')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1: ordinal not in range(128)
>>> L.__unicode__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute '__unicode__'
>>> unicode(str(L),"utf-8")
u'[\xa9au]'

-- 
Nick Craig-Wood <n...@craig-wood.com> -- http://www.craig-wood.com/nick
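
PS: For what it's worth, the per-element conversion asked about above
can be done by hand instead of relying on unicode(a_list). The following
is only a rough sketch under some assumptions: unicode_list and text_type
are names made up for this illustration (they are not part of Python or
of the original example), and the version check is there only so the
same file also runs on python 3, where unicode() does not exist.

```python
# -*- coding: utf-8 -*-
import sys

PY3 = sys.version_info[0] >= 3

# Assumption: text_type is a hypothetical alias introduced for this sketch.
# On python 2 it is the built-in unicode; on python 3 it is str.
text_type = str if PY3 else unicode  # noqa: F821


class A(object):
    def __unicode__(self):
        return u"\N{COPYRIGHT SIGN}au"

    def __repr__(self):
        # python 2 expects a byte string from __repr__, python 3 expects text.
        text = self.__unicode__()
        return text if PY3 else text.encode("utf-8")

    __str__ = __repr__


def unicode_list(items):
    """Hypothetical helper: convert each element to text, then join.

    This avoids the implicit ASCII decode that unicode(a_list) performs
    on python 2 after building a byte string from the elements.
    """
    return u"[%s]" % u", ".join(text_type(item) for item in items)


result = unicode_list([A(), A()])
```

The other route, shown at the end of the session above, is to decode the
byte string explicitly with unicode(str(L), "utf-8"); that only works
as long as every element's __repr__ really does return UTF-8.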