[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-11 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: stage: patch review -> committed/rejected; status: open -> closed

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-11 Thread Roundup Robot
Roundup Robot added the comment: New changeset b11507395ce4 by Serhiy Storchaka in branch '3.3': Add tests for issue #18183. http://hg.python.org/cpython/rev/b11507395ce4 New changeset 17c9f1627baf by Serhiy Storchaka in branch 'default': Add tests for issue #18183. http://hg.python.org/cpython/

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The result is trivial. Wouldn't checking the result distract attention from the main issue?

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread STINNER Victor
STINNER Victor added the comment: +'\U00010000\U00100000'.lower() Why not check the result of these calls?
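The exchange above is about whether the new tests should only call the case methods or also check what they return. A minimal sketch of the second option follows. The helper name `check_uncased_non_bmp`, the list of methods, and the sample code points U+10000 and U+100000 are assumptions made for illustration; this is not the content of the attached patch. Both code points are uncased, so every method is expected to return the string unchanged on an interpreter without the bug.

```python
# Sketch only: helper name, method list and sample string are assumptions.
CASE_METHODS = ("lower", "upper", "casefold", "capitalize", "title", "swapcase")


def check_uncased_non_bmp(s):
    """Return the names of the case methods that do not leave `s` unchanged."""
    return [name for name in CASE_METHODS if getattr(s, name)() != s]


# Built with chr() so the code points are explicit: both are outside the BMP
# and neither has a case mapping.
sample = chr(0x10000) + chr(0x100000)
failures = check_uncased_non_bmp(sample)
print(failures)
```

On an affected 3.3 build the calls would raise SystemError before any comparison happens, which is why calling the methods without checking the result is already enough to catch the regression.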

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here are additional tests for this issue. keywords: +patch; stage: needs patch -> patch review; status: closed -> open. Added file: http://bugs.python.org/file30533/test_issue18183.patch

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread STINNER Victor
STINNER Victor added the comment: Oops, my MAX_MAXCHAR macro was too optimized :-) (the result is incorrect). It shows us that the test suite does not have enough tests for non-BMP characters.

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread Benjamin Peterson
Benjamin Peterson added the comment: I simply removed the MAX_MAXCHAR micro-optimization, since it seems fairly unsafe. Interested parties can restore it safely if they wish. nosy: +benjamin.peterson; resolution: -> fixed; status: open -> closed
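One way the optimization could in principle be made safe is sketched below in Python, purely as an illustration. It assumes the fast path combines its arguments with bitwise OR, and it first rounds each argument up to one of the four storage bounds (0x7f, 0xff, 0xffff, 0x10ffff). Because each bound's bits are a subset of the next, OR-ing two bounds gives the larger one. The names `to_maxchar` and `safe_or_maxchar` are hypothetical; nothing here is taken from CPython, which simply dropped the macro.

```python
MAX_UNICODE = 0x10FFFF  # highest valid Unicode code point


def to_maxchar(ch):
    """Round an arbitrary code point up to its storage-class bound."""
    for bound in (0x7F, 0xFF, 0xFFFF, MAX_UNICODE):
        if ch <= bound:
            return bound
    raise ValueError("code point out of range: %#x" % ch)


def safe_or_maxchar(a, b):
    """OR is equivalent to max() once both inputs are storage-class bounds."""
    return to_maxchar(a) | to_maxchar(b)


# The code points from the failing example stay within range.
print(hex(safe_or_maxchar(0x84B2E, 0x109710)))
```

The rounding step costs a few comparisons per character, so whether this is still faster than a plain maximum is an open question that the sketch does not answer.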

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread Roundup Robot
Roundup Robot added the comment: New changeset 89b106d298a9 by Benjamin Peterson in branch '3.3': remove MAX_MAXCHAR because it's unsafe for computing maximum codepoitn value (see #18183) http://hg.python.org/cpython/rev/89b106d298a9 New changeset 668aba845fb2 by Benjamin Peterson in branch 'de

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment:
>>> a = chr(0x84b2e)+chr(0x109710)
>>> a.lower()
SystemError: invalid maximum character passed to PyUnicode_New
The MAX_MAXCHAR() macro only works for 'maxchar' values, like 0xff, 0x...; in do_upper_or_lower() it's used with arbitrary UCS4 values.
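The arithmetic behind this failure can be reproduced in pure Python. The sketch below assumes the macro combines its two arguments with bitwise OR, which the thread's arithmetic suggests. `or_maxchar` is a hypothetical stand-in for illustration, not the C macro itself.

```python
MAX_UNICODE = 0x10FFFF  # highest valid Unicode code point


def or_maxchar(a, b):
    """Hypothetical stand-in for the fast macro: bitwise OR of both values."""
    return a | b


# For 'maxchar'-style values (all low bits set), OR and max() agree.
assert or_maxchar(0xFF, 0xFFFF) == max(0xFF, 0xFFFF)

# For arbitrary code points, such as the two from the example above,
# OR can set bits that neither value's maximum has.
a, b = 0x84B2E, 0x109710
combined = or_maxchar(a, b)
print(hex(combined), combined > MAX_UNICODE)
print(hex(max(a, b)), max(a, b) > MAX_UNICODE)
```

The OR result exceeds MAX_UNICODE while the true maximum does not, which matches the "invalid maximum character" error raised by PyUnicode_New.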

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: It happens because of the use of the fast MAX_MAXCHAR() macro, which can produce a maxchar that is out of range (0x10000 | 0x100000 > MAX_UNICODE).

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: assignee: -> serhiy.storchaka

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Minimal example:
>>> '\U00010000\U00100000'.lower()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: invalid maximum character passed to PyUnicode_New

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: components: +Interpreter Core; nosy: +serhiy.storchaka; stage: -> needs patch; versions: +Python 3.4

[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-10 Thread Dave Challis
New submission from Dave Challis: This occurred when decoding invalid UTF-8 bytes using "errors='replace'" and then lowercasing the resulting unicode string. It was also tested in Python 2.7, where it does not occur. Code to reproduce: x = b'\xe2\xb3\x99\xb3\xd1\x9f\