[issue1359] py3k: out of bounds read in PyUnicode_DecodeUnicodeEscape

Amaury Forgeot d'Arc Mon, 29 Oct 2007 13:26:49 -0800

New submission from Amaury Forgeot d'Arc:

A correction for the problem found by GvR in change 58692:


> There's one mystery: if I remove ob_sstate from the PyStringObject struct,
> some (unicode) string literals are mutilated, e.g. ('\\1', '\1') prints
> ('\\1', '\t').  This must be an out of bounds write or something that I
> can't track down.  (It doesn't help that it doesn't occur in debug mode.
> And no, make clean + recompilation doesn't help either.)
> 
> So, in the mean time, I just keep the field, renamed to 'ob_placeholder'.

I think I found the problem. It reproduces on Windows, with a slightly
different input:
    >>> ('\\2','\1')
    ('\\2', '\n')
(The win32 release build is of the kind "optimized with debug info", so
using the debugger is possible.)

The problem is in unicodeobject.c::PyUnicode_DecodeUnicodeEscape:
- the input buffer is not null-terminated;
- when decoding an octal escape, we increment s without checking that it
is still within the bounds of the buffer.
In my case, the "\1" was followed by a "2" in memory, hence the bogus
chr(0o12) == '\n'.
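
To illustrate the bounds check described above, here is a minimal sketch in
C. It is not the code from unicodeEscape.diff: the helper name
decode_octal_escape and its signature are hypothetical, and it only assumes
that the decoder walks a pointer s towards an exclusive end pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT the actual CPython code.
 * Decodes an octal escape of up to three digits whose first digit is at
 * s[0]. The input ends at `end` (exclusive) and is not NUL-terminated.
 * Returns the decoded value and stores the position just after the last
 * consumed digit in *next. */
static unsigned int
decode_octal_escape(const char *s, const char *end, const char **next)
{
    unsigned int x;

    /* Precondition: at least one octal digit is available. */
    assert(s < end && '0' <= *s && *s <= '7');
    x = (unsigned int)(*s++ - '0');

    /* Compare s against end BEFORE every dereference; without these
     * checks the loop would read whatever follows the buffer in memory. */
    if (s < end && '0' <= *s && *s <= '7') {
        x = (x << 3) + (unsigned int)(*s++ - '0');
        if (s < end && '0' <= *s && *s <= '7')
            x = (x << 3) + (unsigned int)(*s++ - '0');
    }
    *next = s;
    return x;
}
```

With the bytes "12" in memory and end set one byte past the start, this
sketch stops after the "1" and returns 1, instead of also consuming the "2"
and returning 0o12.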

The patch also corrects a potential problem when the input ends with a
backslash: PyUnicode_DecodeUnicodeEscape("\\t", 1) should return an error.
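
The trailing-backslash case can be sketched in the same way. Again this is
only an illustration under stated assumptions, not the patch itself:
check_no_trailing_backslash is a hypothetical name, and the sketch only
validates the input rather than decoding it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT the actual CPython code.
 * Returns 0 if every backslash in s[0..size) is followed by at least one
 * more byte inside the buffer, and -1 if the input ends with a lone
 * backslash. The buffer need not be NUL-terminated. */
static int
check_no_trailing_backslash(const char *s, size_t size)
{
    const char *end;

    assert(s != NULL);
    end = s + size;
    while (s < end) {
        if (*s++ != '\\')
            continue;       /* ordinary byte */
        if (s >= end)
            return -1;      /* the backslash was the last byte */
        s++;                /* skip the byte that follows the backslash */
    }
    return 0;
}
```

Called with ("\\t", 1), the sketch sees only the backslash and returns -1;
with ("\\t", 2) the escape is complete and it returns 0.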

----------
components: Interpreter Core
files: unicodeEscape.diff
messages: 56933
nosy: amaury.forgeotdarc, gvanrossum
severity: normal
status: open
title: py3k: out of bounds read in PyUnicode_DecodeUnicodeEscape
versions: Python 3.0
Added file: http://bugs.python.org/file8658/unicodeEscape.diff

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1359>
__________________________________

