Holger Joukl wrote:
> Hi there,
>
> I consider the behaviour of unicode() inconvenient wrt to conversion of
> non-string arguments.
> While you can do:
>
> >>> unicode(17.3)
> u'17.3'
>
> you cannot do:
>
> >>> unicode(17.3, 'ISO-8859-1', 'replace')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: coercing to Unicode: need string or buffer, float found
> >>>
>
> This is somehow annoying when you want to convert a mixed-type argument
> list to unicode strings, e.g. for a logging system (that's where it bit
> me) and want to make sure that possible raw string arguments are also
> converted to unicode without errors (although by force).
> Especially as this is a performance-critical part in my application so I
> really do not like to wrap unicode() into some custom tounicode() function
> that handles such cases by distinction of argument types.
>
> Any reason why unicode() with a non-string argument should not allow the
> encoding and errors arguments?

There is a reason: an encoding is a property of bytes; it is not applicable
to other objects.

> Or some good solution to work around my problem?

Do not put undecoded bytes in a mixed-type argument list. A rule of thumb
for working with unicode: decode as soon as possible, encode as late as
possible.

  -- Leo
--
http://mail.python.org/mailman/listinfo/python-list
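
To illustrate the rule of thumb above, here is a minimal sketch. It is an assumption-laden example, not code from this thread: read_record() and log_line() are hypothetical names, and it is written for Python 3, where bytes and str play the roles that str and unicode play in the Python 2 session quoted above. The idea is that bytes are decoded once, where the encoding is known, so the logging code never needs encoding or errors arguments.

```python
# Sketch only: read_record() and log_line() are hypothetical helpers.
# Python 3 spelling: bytes/str here correspond to str/unicode in Python 2.

def read_record(raw, encoding="ISO-8859-1", errors="replace"):
    """Decode at the input boundary, where the encoding is known."""
    return raw.decode(encoding, errors)


def log_line(*args):
    """Format mixed-type arguments; no bytes remain, so no encoding is needed."""
    return " ".join(str(arg) for arg in args)


name = read_record(b"J\xf6rg")        # bytes -> text, once, on input
line = log_line("user", name, 17.3)   # arguments are text or numbers only
data = line.encode("utf-8")           # text -> bytes, once, on output
```

With this split, the performance-critical logging path does a plain conversion per argument and no type checks, because nothing that reaches it is undecoded.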