anuraguni...@yahoo.com wrote:

> so unicode(obj) calls __unicode__ on that object

It will look for the existence of type(ob).__unicode__ ...

> and if it isn't there __repr__ is used

According to the below, type(ob).__str__ is tried first.

> __repr__ of list by default return a str even if __repr__ of element
> is unicode

From the fine library manual, built-in functions section:
(I recommend using it, along with interactive experiments.)

"repr( object)
Return a string ..."

"str( [object])
Return a string ..."

"unicode( [object[, encoding [, errors]]])

Return the Unicode string version of object using one of the following modes:

If encoding and/or errors are given, ...

If no optional parameters are given, unicode() will mimic the behaviour of str() except that it returns Unicode strings instead of 8-bit strings. More precisely, if object is a Unicode string or subclass it will return that Unicode string without any additional decoding applied.

For objects which provide a __unicode__() method, it will call this method without arguments to create a Unicode string. For all other objects, the 8-bit string version or representation is requested and then converted to a Unicode string using the codec for the default encoding in 'strict' mode.
"

'unicode(somelist)' has no optional parameters, so skip to third paragraph. Somelist is not a unicode instance, so skip to the last paragraph. If you do dir(list) I presume you will *not* see '__unicode__' listed. So skip to the last sentence.
unicode(somelist) == str(somelist).decode(default,'strict').
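
As a rough model of the three documented steps (not CPython's actual implementation), the lookup can be sketched as below. The names unicode_like, default_encoding and Tagged are made up for illustration. The sketch is written to run on both 2.x and 3.x, so the decode branch only fires on 2.x, where str() returns an 8-bit string.

```python
def unicode_like(obj, default_encoding="ascii"):
    """Hypothetical model of unicode(obj) called with no optional parameters."""
    text_type = type(u"")          # unicode on 2.x, str on 3.x

    # Step 1: a Unicode string (or subclass) is returned without decoding.
    if isinstance(obj, text_type):
        return obj

    # Step 2: look for __unicode__ on the type and call it without arguments.
    method = getattr(type(obj), "__unicode__", None)
    if method is not None:
        return method(obj)

    # Step 3: ask for the 8-bit string version, then decode it strictly.
    s = str(obj)
    if isinstance(s, bytes):       # true only on 2.x, where str is bytes
        s = s.decode(default_encoding, "strict")
    return s


class Tagged(object):
    """Illustrative class that provides __unicode__."""
    def __unicode__(self):
        return u"tagged"


print(unicode_like(u"abc"))        # step 1
print(unicode_like(Tagged()))      # step 2
print(unicode_like([1, 2]))        # step 3: list has no __unicode__
```

A list containing non-ASCII unicode members still goes through step 3, which is why the result on 2.x depends on what str(somelist) produces.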

I do not believe str() and repr() are specifically documented for builtin classes beyond the general description, but you can work out that str(collection) or repr(collection) will call str or repr on the members of the collection in order to return a str, as the doc says. (Details are available by experiment.) str(uni_string) encodes with the default encoding, which seems to be 'ascii' in 2.x. I am sure it uses 'strict' errors.
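
One such experiment, as a minimal sketch: the Probe class below is made up for illustration and simply reports which method was called on it. On the interpreters I am aware of, both str() and repr() of a list end up calling repr on the members.

```python
class Probe(object):
    """Illustrative class whose __str__ and __repr__ give different text."""
    def __str__(self):
        return "via-str"

    def __repr__(self):
        return "via-repr"


p = Probe()
direct = str(p)          # calls Probe.__str__
in_list_str = str([p])   # list falls back to its repr, which reprs members
in_list_repr = repr([p])

print(direct)
print(in_list_str)
print(in_list_repr)
```

So a member's __str__ (or __unicode__) is not consulted when the member sits inside a list.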

I would agree that str(some_unicode) could be better documented, like unicode(some_str) is.

> so my only solution looks like to use my own list class everywhere i
> use list
> class mylist(list):
>     def __unicode__(self):
>         return u"["+u''.join(map(unicode,self))+u"]"

Or write a function and use that instead, or, if and when you can, switch to 3.x where str and repr accept and produce unicode.
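
A minimal sketch of the function route, assuming you want the members separated by ", " (the quoted class joins them with no separator). The names list_to_text, text_type and sep are hypothetical. The try/except is only there so the same code runs on 2.x and 3.x.

```python
try:
    text_type = unicode            # 2.x
except NameError:
    text_type = str                # 3.x, where str is already Unicode


def list_to_text(items, sep=u", "):
    """Build a Unicode rendering of a sequence from its members' Unicode forms."""
    return u"[" + sep.join(text_type(item) for item in items) + u"]"


result = list_to_text([u"caf\u00e9", 1])
```

Calling list_to_text(somelist) in place of unicode(somelist) avoids having to replace list with a subclass everywhere.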

tjr

--
http://mail.python.org/mailman/listinfo/python-list
