New submission from Miro Hrončok <m...@hroncok.cz>:

I was just bitten by specifying a nonexistent error handler for bytes.decode() 
without noticing.

Consider this code:

>>> 'a'.encode('cp1250').decode('utf-8', errors='Boom, Shaka Laka, Boom!')
'a'

Nobody notices that the error handler doesn't exist.

However:

>>> 'ž'.encode('cp1250').decode('utf-8', errors='Boom, Shaka Laka, Boom!')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown error handler name 'Boom, Shaka Laka, Boom!'


The unknown handler is only reported once the data actually contains a decoding error.

While nobody could possibly mistake 'Boom, Shaka Laka, Boom!' for a valid error 
handler, I was bitten by this:

>>> b.decode('utf-8', errors='surrogate')

Which in fact should have been

>>> b.decode('utf-8', errors='surrogateescape')

Yet I wasn't notified, because the bytes in question were actually decodable 
as valid UTF-8.

I suggest that an unknown error handler should raise an exception immediately, like 
this:

>>> 'b'.encode('cp1250').decode('utf-8', errors='Boom, Shaka Laka, Boom!')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown error handler name 'Boom, Shaka Laka, Boom!'
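
Until decode() itself performs such a check, a caller can validate the handler
name eagerly. The following is only a minimal sketch of that idea, not a
proposed patch: decode_checked is a hypothetical helper name introduced here
for illustration, and it relies on codecs.lookup_error(), which raises
LookupError for any handler name that is not registered.

```python
import codecs


def decode_checked(data: bytes, encoding: str = "utf-8", errors: str = "strict") -> str:
    """Hypothetical helper: decode data, rejecting an unknown error handler up front."""
    # codecs.lookup_error() raises LookupError for an unregistered handler
    # name, whether or not the data would ever trigger the handler.
    codecs.lookup_error(errors)
    return data.decode(encoding, errors=errors)


# A registered handler name passes the check and the bytes decode normally.
valid = decode_checked(b"a", errors="surrogateescape")

# The misspelled name is rejected even though b"a" is valid UTF-8.
try:
    decode_checked(b"a", errors="surrogate")
except LookupError as exc:
    caught = str(exc)
else:
    caught = None
```

With this wrapper the 'surrogate' typo fails on the first call rather than on
the first malformed input.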

----------
components: Unicode
messages: 346407
nosy: ezio.melotti, hroncok, vstinner
priority: normal
severity: normal
status: open
title: unknown error handlers should be reported early
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37388>
_______________________________________