On 1/16/2014 9:16 AM, Steven D'Aprano wrote:
On Thu, 16 Jan 2014 13:34:08 +0100, Ernest Adrogué wrote:
Hi,
There seems to be some inconsistency in the way exceptions handle
Unicode strings.
Yes. I believe the problem lies in the __str__ method. For example,
KeyError manages to handle Unicode, although in an ugly way:
py> str(KeyError(u'ä'))
"u'\\xe4'"
Hence:
py> raise KeyError(u'ä')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: u'\xe4'
While ValueError assumes ASCII and fails:
py> str(ValueError(u'ä'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
When displaying the traceback, the error is suppressed, hence:
py> raise ValueError(u'ä')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError
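The difference between the two exceptions can be sketched in a few
lines. This is only an illustration, written for Python 3, where the
session above no longer fails. The implicit ASCII encoding that
Python 2's str() performs is emulated here with an explicit encode
call, and the helper names py2_style_keyerror_str and
py2_style_valueerror_str are hypothetical, not part of any library.

```python
def py2_style_keyerror_str(arg):
    # KeyError formats a single argument with repr(). ascii() mimics the
    # escaping that Python 2's repr() applied (without the u prefix).
    return ascii(arg)


def py2_style_valueerror_str(arg):
    # Assumption: this emulates Python 2's implicit ASCII encoding of the
    # message, which is the step that fails for non-ASCII text.
    return arg.encode('ascii').decode('ascii')


key_text = py2_style_keyerror_str('ä')

try:
    value_text = py2_style_valueerror_str('ä')
    failure = None
except UnicodeEncodeError as err:
    value_text = None
    failure = err

print(key_text)
print(type(failure).__name__)
```

The repr-based path always yields ASCII, which is why KeyError gets
away with it, while the plain path has to encode and so can fail.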
I believe this might be accepted as a bug report on ValueError.
Or a change might be rejected as a feature change or as a bugfix that
might break existing code. We do change exception messages in new
versions but do not normally do so in bugfix releases.
http://bugs.python.org/issue1012952 is related but different. The issue
there was that unicode(ValueError(u'ä')) gave the same
UnicodeEncodeError as str(ValueError(u'ä')). That was fixed by giving
exceptions a __unicode__ method, but that did not fix the traceback
display issue above.
http://bugs.python.org/issue6108, "unicode(exception) and
str(exception) should return the same message", also seems related.
That issue raised the question of what str should do if the unicode
message had non-ASCII chars. I did not read enough to find an answer.
The same question would arise here.
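One possible answer, offered only as a sketch and not as what either
tracker issue settled on, is to escape the non-ASCII characters instead
of failing. The helper name ascii_safe_message is hypothetical. The code
is written for Python 3; under Python 2 one would presumably start from
unicode(exc) rather than str(exc).

```python
def ascii_safe_message(exc):
    # Take the exception's text form, then escape anything outside ASCII
    # with the standard 'backslashreplace' error handler.
    text = str(exc)
    return text.encode('ascii', 'backslashreplace').decode('ascii')


message = ascii_safe_message(ValueError('ä'))
print(message)
```

This keeps the message visible in an ASCII-only display at the cost of
showing escapes, much as KeyError already does.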
--
Terry Jan Reedy