20.01.18 10:32, Steven D'Aprano wrote:
I want an error handler that falls back on Latin-1 for anything which
cannot be decoded. Is this the right way to write it?
def latin1_fallback(exception):
    assert isinstance(exception, UnicodeError)
    start, end = exception.start, exception.end
    obj = exception.object
    if isinstance(exception, UnicodeDecodeError):
        return (obj[start:end].decode('latin1'), end+1)
    elif isinstance(exception, UnicodeEncodeError):
        return (obj[start:end].encode('latin1'), end+1)
    else:
        raise
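Use just `end` instead of `end+1`: `exception.end` already points one
past the failing input, and the second item of the returned tuple is
the position at which processing resumes.
It is also safer to use `bytes.decode(obj[start:end], 'latin1')` or
`str(obj[start:end], 'latin1')` instead of
`obj[start:end].decode('latin1')`, just in case obj has an
overridden decode() method.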
Otherwise LGTM.
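
For reference, here is a minimal sketch of the handler with both
suggestions applied. The registration name 'latin1_fallback' is only an
assumption for illustration, and the final branch uses
`raise exception` instead of a bare `raise`, which is my own change:
a bare `raise` has no active exception to re-raise at that point.

```python
import codecs

def latin1_fallback(exception):
    # Error handlers receive a UnicodeError subclass instance.
    start, end = exception.start, exception.end
    obj = exception.object
    if isinstance(exception, UnicodeDecodeError):
        # str(..., 'latin1') does not depend on a decode() method of obj.
        # Resume at `end`, which already points past the bad bytes.
        return (str(obj[start:end], 'latin1'), end)
    elif isinstance(exception, UnicodeEncodeError):
        # Still fails for characters outside Latin-1 (e.g. the euro sign).
        return (obj[start:end].encode('latin1'), end)
    else:
        # Assumption: re-raise anything else (e.g. UnicodeTranslateError).
        raise exception

# The handler name is hypothetical; any unused name would do.
codecs.register_error('latin1_fallback', latin1_fallback)

decoded = b'caf\xe9 ok'.decode('utf-8', 'latin1_fallback')
encoded = 'caf\xe9 ok'.encode('ascii', 'latin1_fallback')
```

With `end+1` in place of `end`, the character following each bad byte
would be skipped in the result.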
--
https://mail.python.org/mailman/listinfo/python-list