[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

Ezio Melotti Thu, 24 Feb 2011 13:20:11 -0800

Ezio Melotti <ezio.melo...@gmail.com> added the comment:

Probably not, but that part should be changed if possible, because it is less 
efficient than the previous version, which allocated only 11 bytes.


The problem here is that the previous version only changed or removed 
chars, whereas this one might add spaces too, so the string might get longer. 
E.g. 'utf8' -> 'utf 8'. The worst case is 'a1a1a1' -> 'a 1 a 1 a 1', and 
including the trailing \0, the result might end up being twice as long as the 
original encoding string. It can be fixed by returning 0 as soon as the 
normalized string reaches a fixed threshold (something like 15 chars, 
depending on the longest normalized encoding name).
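
To illustrate the growth and the proposed early exit, here is a minimal 
sketch in Python. It is not the actual C code from the patch: the function 
name normalize_encoding, the max_len parameter, and the rule "insert a space 
wherever a letter meets a digit" are assumptions based only on the examples 
above, and None stands in for the C function returning 0.

```python
def normalize_encoding(name, max_len=15):
    """Hypothetical sketch: lowercase the name and put a space at every
    letter/digit boundary, giving up once the result exceeds max_len."""
    out = []
    prev = ""
    for ch in name.lower():
        # Assumed rule: a space goes wherever a letter meets a digit.
        boundary = (prev.isalpha() and ch.isdigit()) or \
                   (prev.isdigit() and ch.isalpha())
        if boundary:
            out.append(" ")
        out.append(ch)
        # Early exit: no known encoding name is this long, so stop here
        # instead of sizing the buffer for the worst case.
        if len(out) > max_len:
            return None
        prev = ch
    return "".join(out)


print(normalize_encoding("utf8"))      # utf 8
print(normalize_encoding("a1a1a1"))    # a 1 a 1 a 1
print(normalize_encoding("a1" * 10))   # None
```

With a threshold like this, a fixed-size buffer of max_len + 1 bytes would be 
enough in C, whatever the length of the input string.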

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11303>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
