New submission from Amaury Forgeot d'Arc:

A correction for the problem found by GvR in change 58692:

> There's one mystery: if I remove ob_sstate from the PyStringObject struct,
> some (unicode) string literals are mutilated, e.g. ('\\1', '\1') prints
> ('\\1', '\t'). This must be an out of bounds write or something that I
> can't track down. (It doesn't help that it doesn't occur in debug mode.
> And no, make clean + recompilation doesn't help either.)
>
> So, in the mean time, I just keep the field, renamed to 'ob_placeholder'.

I think I found the problem. It reproduces on Windows, with a slightly
different input:

>>> ('\\2','\1')
('\\2', '\n')

(The win32 release build is of the kind "optimized with debug info", so
using the debugger is possible.)

The problem is in unicodeobject.c::PyUnicode_DecodeUnicodeEscape:
- the input buffer is not null-terminated
- when decoding an octal escape, we increment s without checking whether
  it is still within the limits of the buffer.
In my case, the "\1" was followed by a "2" in memory, hence the bogus
chr(0o12) == '\n'.

Also corrected a potential problem when the string ends with a \:
PyUnicode_DecodeUnicodeEscape("\\t", 1) should return an error.

----------
components: Interpreter Core
files: unicodeEscape.diff
messages: 56933
nosy: amaury.forgeotdarc, gvanrossum
severity: normal
status: open
title: py3k: out of bounds read in PyUnicode_DecodeUnicodeEscape
versions: Python 3.0

Added file: http://bugs.python.org/file8658/unicodeEscape.diff

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1359>
__________________________________
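
For illustration only, here is a small Python model of the octal-escape loop described in the message. It is not the code from unicodeobject.c or from unicodeEscape.diff: the function name decode_octal_escape, its parameters, and the check_bounds switch are all hypothetical. The bytes beyond `size` stand in for whatever happens to follow the non-null-terminated buffer in memory.

```python
def decode_octal_escape(memory, start, size, check_bounds):
    """Decode up to three octal digits beginning at memory[start].

    Hypothetical model: `size` is the logical length of the input, and
    any bytes of `memory` past `size` represent unrelated data that
    happens to follow the buffer. Returns (code point, next index).
    """
    s = start
    value = memory[s] - ord("0")  # first digit, already known to be octal
    s += 1
    for _ in range(2):  # an octal escape has at most two further digits
        if check_bounds and s >= size:
            break  # the fix: stop at the end of the input
        if s >= len(memory) or not (ord("0") <= memory[s] <= ord("7")):
            break  # next byte is not an octal digit
        value = (value << 3) + memory[s] - ord("0")
        s += 1
    return value, s


# The input is the two bytes \1, but a "2" follows it in memory.
memory = b"\\12"
unchecked, _ = decode_octal_escape(memory, 1, 2, check_bounds=False)
checked, _ = decode_octal_escape(memory, 1, 2, check_bounds=True)
print(repr(chr(unchecked)))  # the stray "2" is consumed: '\n'
print(repr(chr(checked)))    # only the real input is read: '\x01'
```

Without the length check the loop reads past the logical end of the input and returns 0o12, which matches the bogus '\n' reported above; with the check it returns chr(1) as expected.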