Ned Deily added the comment:
See http://docs.python.org/2/library/functions.html#unicode. It appears to me
that unicode() is behaving exactly as documented. In particular:
"If encoding and/or errors are given, unicode() will decode the object which
can either be an 8-bit string or a character
New submission from G. Scott Johnston:
I've come up with the following series of minimal examples to demonstrate my
bug.
>>> unicode("")
u''
>>> unicode("", errors="ignore")
u''
>>> unicode("abcü")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec c
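
For illustration, here is a minimal sketch of the decoding behaviour that the session above exercises. It is not part of the original report. It uses the decode() method of a byte string rather than unicode() so that it runs on both Python 2 and Python 3. The helper name try_decode and the assumption that the terminal supplied "abcü" as UTF-8 bytes are mine.

```python
# -*- coding: utf-8 -*-
# Assumption: the 8-bit string "abcü" reached the interpreter as UTF-8 bytes.
raw = b"abc\xc3\xbc"


def try_decode(data, encoding, errors="strict"):
    """Hypothetical helper: return the decoded text, or the exception raised."""
    try:
        return data.decode(encoding, errors)
    except UnicodeDecodeError as exc:
        return exc


# Strict ASCII decoding fails on the first non-ASCII byte.
strict_ascii = try_decode(raw, "ascii")

# errors="ignore" silently drops every byte the codec cannot decode.
ignored_ascii = try_decode(raw, "ascii", "ignore")

# Naming the actual encoding decodes the whole string.
as_utf8 = try_decode(raw, "utf-8")

print(repr(strict_ascii))
print(repr(ignored_ascii))
print(repr(as_utf8))
```

Under these assumptions, the strict ASCII call yields a UnicodeDecodeError, the "ignore" call yields the text with the non-ASCII character dropped, and the UTF-8 call yields the full string.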