Marc-Andre Lemburg <m...@egenix.com> added the comment:

Amaury Forgeot d'Arc wrote:
>
> Amaury Forgeot d'Arc <amaur...@gmail.com> added the comment:
>
>> we should make sure that it's not possible to load an extension
>> compiled with 3.1 in 3.2 to prevent segfaults and buffer overruns.
>
> This is the case with this patch: today all these functions
> (_PyUnicode_IsAlpha, _PyUnicode_ToLowercase) are actually #defines to
> _PyUnicodeUCS2_* or _PyUnicodeUCS4_*.
> The patch removes the #defines: 3.1 modules that call
> _PyUnicodeUCS4_IsAlpha wouldn't load into a 3.2 interpreter.

True, but we can do better. For narrow builds, the API currently
exposes the UCS2 APIs. We'd need to expose the UCS4 APIs *in addition*
to those APIs and have the UCS2 APIs redirect to the UCS4 ones.

For wide builds, we don't need to change anything.

>> The change affects the Unicode type database which is implemented
>> in unicodectype.c, not the Unicode database, which already uses UCS4.
>
> Are you referring to the _PyUnicode_TypeRecord structure?
> The first three fields only contains values up to 65535, so they could
> use "unsigned short" even for UCS4 builds.

I haven't checked, but it's certainly possible to have a code point
use a non-BMP lower/upper/title case mapping, so this should be made
possible as well, if we're going to make changes to the type database.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5127>
_______________________________________
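
For illustration, here is a minimal sketch of the redirect idea from the comment above: a UCS4 entry point does the real lookup, and the UCS2 entry point kept for narrow builds simply forwards to it. This is not the CPython implementation. Every `demo_*` name, the two-entry table, and the choice to leave a character unchanged when its mapping does not fit in 16 bits are assumptions made for the example. The table stores mappings as 32-bit values so that a non-BMP case mapping (here U+10400 to U+10428) can be represented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the wide and narrow character types. */
typedef uint32_t demo_ucs4;
typedef uint16_t demo_ucs2;

/* Hypothetical type record: the lowercase mapping is a full 32-bit
   code point, so mappings outside the BMP fit without truncation. */
typedef struct {
    demo_ucs4 code;
    demo_ucs4 lower;
} demo_case_record;

/* Tiny illustrative table; a real type database is far larger. */
static const demo_case_record demo_table[] = {
    {0x0041,  0x0061},   /* LATIN CAPITAL LETTER A -> LATIN SMALL LETTER A */
    {0x10400, 0x10428},  /* DESERET CAPITAL LONG I -> DESERET SMALL LONG I */
};

/* UCS4 entry point: performs the actual lookup. Code points that are
   not in the table are returned unchanged. */
demo_ucs4 demo_UCS4_ToLowercase(demo_ucs4 ch)
{
    size_t n = sizeof(demo_table) / sizeof(demo_table[0]);
    for (size_t i = 0; i < n; i++) {
        if (demo_table[i].code == ch)
            return demo_table[i].lower;
    }
    return ch;
}

/* UCS2 entry point kept for narrow builds: redirects to the UCS4
   function. Assumption: if the result does not fit in 16 bits, the
   input is returned unchanged rather than truncated. */
demo_ucs2 demo_UCS2_ToLowercase(demo_ucs2 ch)
{
    demo_ucs4 result = demo_UCS4_ToLowercase((demo_ucs4)ch);
    if (result > 0xFFFF)
        return ch;
    return (demo_ucs2)result;
}
```

With this layout, existing callers of the UCS2 name keep working, while new code can call the UCS4 name directly and get correct results for supplementary-plane characters.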