[issue26464] str.translate() unexpectedly duplicates characters

STINNER Victor Tue, 01 Mar 2016 11:54:45 -0800

STINNER Victor added the comment:

Oh... I see. It's a bug introduced by unicode_fast_translate(), the 
optimization for the ASCII case where each character is either replaced with 
another ASCII character or deleted. See change cca6b056236a of issue #21118.


The code confuses the input position with the output position. The caller uses 
"i = writer.pos;" to resume when unicode_fast_translate() was interrupted 
(because a translation uses a non-ASCII character or a string longer than 1 
character), but writer.pos is the position in the *output* string, not in the 
*input* string :-/ The two positions differ as soon as a character has been 
deleted.
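
To make the mix-up concrete, here is a rough pure-Python model of it. This is 
only a sketch, not the C code: fast_translate() and translate() are 
hypothetical helpers, the table is assumed to map code points to str or None, 
and len(out) stands in for writer.pos.

```python
def fast_translate(text, table, out):
    """Hypothetical model of the ASCII fast path.

    Handles deletions and 1:1 ASCII replacements, and stops at the first
    mapping it cannot handle. Returns the *input* index where it stopped.
    """
    i = 0
    while i < len(text):
        ch = text[i]
        repl = table.get(ord(ch), ch)
        if repl is None:
            # Deletion: the input position advances, the output does not.
            i += 1
            continue
        if len(repl) != 1 or ord(repl) > 127:
            break  # needs the regular (slow) path
        out.append(repl)
        i += 1
    return i


def translate(text, table, buggy):
    out = []
    stopped = fast_translate(text, table, out)
    # Buggy resume: use the output position (models "i = writer.pos;").
    # Fixed resume: use the input position returned by the fast path.
    i = len(out) if buggy else stopped
    while i < len(text):
        ch = text[i]
        repl = table.get(ord(ch), ch)
        if repl is not None:
            out.append(repl)
        i += 1
    return "".join(out)


table = {ord("a"): None, ord("b"): "123"}
print(translate("axb", table, buggy=True))   # 'x' is emitted twice
print(translate("axb", table, buggy=False))
```

In this model the deleted "a" leaves the output one position behind the input, 
so the buggy resume re-reads "x" and writes it a second time.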

I see that I added unit tests for translate, but they lack a test that starts 
in the fast translation path by deleting a character and then switches to 
regular translation.
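
A test of that shape could look roughly like the following. This is a sketch 
and not necessarily what the attached patch contains; the class and method 
names are made up, and the inputs are just one way to delete a character first 
and then force the switch to regular translation.

```python
import unittest


class FastTranslateResumeTest(unittest.TestCase):
    """Hypothetical tests: delete first, then leave the ASCII fast path."""

    def test_delete_then_multichar_replacement(self):
        # 'a' is deleted, then 'b' maps to a string longer than 1 character.
        table = str.maketrans({"a": None, "b": "123"})
        self.assertEqual("axb".translate(table), "x123")

    def test_delete_then_non_ascii_replacement(self):
        # 'a' is deleted, then 'b' maps to a non-ASCII character.
        table = str.maketrans({"a": None, "b": "\xe9"})
        self.assertEqual("axb".translate(table), "x\xe9")

    def test_several_deletions_before_switch(self):
        # More deletions widen the gap between input and output positions.
        table = str.maketrans({"a": None, "b": "123"})
        self.assertEqual("aaxyb".translate(table), "xy123")
```

It can be run with "python -m unittest" on the module that contains it.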

Attached patch should fix the issue. It adds unit tests.

----------
keywords: +patch
Added file: http://bugs.python.org/file42056/unicode_fast_translate.patch

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue26464>
_______________________________________
