[issue15379] Charmap decoding of no-BMP characters

2012-11-17 Thread Antoine Pitrou
Changes by Antoine Pitrou: -- stage: commit review -> committed/rejected

[issue15379] Charmap decoding of no-BMP characters

2012-11-17 Thread Antoine Pitrou
Antoine Pitrou added the comment: Thanks for the backport, committed! -- status: open -> closed

[issue15379] Charmap decoding of no-BMP characters

2012-11-17 Thread Roundup Robot
Roundup Robot added the comment: New changeset c7ce91756472 by Antoine Pitrou in branch '2.7': Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings). http://hg.python.org/cpython/rev/c7ce91756472

[issue15379] Charmap decoding of no-BMP characters

2012-10-24 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- stage: committed/rejected -> commit review versions: -Python 3.2, Python 3.3

[issue15379] Charmap decoding of no-BMP characters

2012-10-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The 2.7 patch is just a backport of the 3.2 patch (including Antoine's last fix). Please review and commit.

[issue15379] Charmap decoding of no-BMP characters

2012-10-02 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- versions: +Python 2.7

[issue15379] Charmap decoding of no-BMP characters

2012-10-02 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: We forgot about 2.7 (because I had not planned to apply it even to 3.2). Here is the backported patch. -- status: closed -> open Added file: http://bugs.python.org/file27390/decode_charmap_maxchar-2.7.patch

[issue15379] Charmap decoding of no-BMP characters

2012-09-23 Thread Antoine Pitrou
Antoine Pitrou added the comment: Thank you, I've committed the patches. There was a test failure in test_codeccallbacks in 3.2, which I fixed simply by replacing sys.maxunicode with a hardcoded 0x11. -- resolution: -> fixed status: open -> closed

[issue15379] Charmap decoding of no-BMP characters

2012-09-23 Thread Roundup Robot
Roundup Robot added the comment: New changeset 620d23f7ad41 by Antoine Pitrou in branch '3.2': Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings). http://hg.python.org/cpython/rev/620d23f7ad41 New changeset c64dec45d46f by An

[issue15379] Charmap decoding of no-BMP characters

2012-09-23 Thread Antoine Pitrou
Changes by Antoine Pitrou: -- stage: patch review -> committed/rejected

[issue15379] Charmap decoding of no-BMP characters

2012-09-21 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: Removed file: http://bugs.python.org/file26428/decode_charmap_maxchar-3.2.patch

[issue15379] Charmap decoding of no-BMP characters

2012-09-21 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: Removed file: http://bugs.python.org/file26412/decode_charmap_maxchar.patch

[issue15379] Charmap decoding of no-BMP characters

2012-09-21 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: Removed file: http://bugs.python.org/file26416/decode_charmap_tests.patch

[issue15379] Charmap decoding of no-BMP characters

2012-09-21 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Patches updated: added a few new tests, used MAX_UNICODE, and slightly changed the extrachars growth step. -- Added file: http://bugs.python.org/file27249/decode_charmap_maxchar-3.3_2.patch Added file: http://bugs.python.org/file27250/decode_charmap_maxchar-3.2

[issue15379] Charmap decoding of no-BMP characters

2012-09-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Ping.

[issue15379] Charmap decoding of no-BMP characters

2012-08-05 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- stage: needs patch -> patch review

[issue15379] Charmap decoding of no-BMP characters

2012-08-05 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- keywords: +needs review priority: normal -> low stage: patch review -> needs patch

[issue15379] Charmap decoding of no-BMP characters

2012-07-18 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: Ah, I was worried by the possible quadratic behavior. So the other (existing) case is quadratic as well (I was misled by the <<, which made me think there is something clever there). That's good enough for 3.2, I guess.
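
The quadratic concern in the comment above can be illustrated with a small simulation. This is only a sketch of the general effect, not the CPython code: the function names and the fixed step of 8 are assumptions made for illustration, and the count assumes every reallocation moves all existing items.

```python
def elements_copied(final_size, grow):
    """Count items moved by reallocations while appending one item at a time.

    Assumes the worst case: every reallocation copies all existing items.
    """
    capacity = 0
    copied = 0
    for size in range(1, final_size + 1):
        if size > capacity:
            copied += size - 1  # items already stored must be moved
            capacity = grow(capacity, size)
    return copied


def fixed_step(capacity, size):
    # Hypothetical policy: grow by a constant number of slots.
    return capacity + 8


def doubling(capacity, size):
    # Hypothetical policy: grow geometrically.
    return max(size, capacity * 2)


fixed_total = elements_copied(1024, fixed_step)
doubling_total = elements_copied(1024, doubling)
print(fixed_total, doubling_total)
```

With a constant step the number of moved items grows roughly with the square of the final size, while geometric growth keeps it linear. For the short outputs typical of charmap decoding the difference is small, which fits the "good enough for 3.2" judgment.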

[issue15379] Charmap decoding of no-BMP characters

2012-07-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: It's the same strategy. "needed = (targetsize - extrachars) + (targetsize << 2)". targetsize == 2.

[issue15379] Charmap decoding of no-BMP characters

2012-07-18 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: About the patch for 3.2: "needed = 6 - extrachars". Where does this 6 come from? There is another part which uses this "extrachars". Why is the allocation strategy different here?

[issue15379] Charmap decoding of no-BMP characters

2012-07-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Well, here is a patch for 3.2. -- versions: +Python 3.2 Added file: http://bugs.python.org/file26428/decode_charmap_maxchar-3.2.patch

[issue15379] Charmap decoding of no-BMP characters

2012-07-18 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: In 3.2, the narrow build is also broken when the "charmap" is a string: >>> codecs.charmap_decode(b'\0', 'strict', '\U0002000B') returns ('𠀋', 1) with a wide unicode build, but ('\ud840', 1) with a narrow build. 3.2 could be fixed to allow characters up to s
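
The '\ud840' in the narrow-build result is the high half of the UTF-16 surrogate pair for U+2000B: a narrow build stores a non-BMP character as two code units, so indexing the string charmap at position 0 yields only the first one. A minimal sketch of the arithmetic follows; the helper name `utf16_surrogates` is hypothetical and not part of any Python API.

```python
def utf16_surrogates(code_point):
    """Split a non-BMP code point into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a non-BMP code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = utf16_surrogates(0x2000B)
print(hex(high), hex(low))

# Cross-check against Python's own UTF-16 encoder.
encoded = "\U0002000B".encode("utf-16-be")
```

The high surrogate computed here is 0xD840, matching the truncated result reported for the narrow build.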

[issue15379] Charmap decoding of no-BMP characters

2012-07-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Fixing this for 3.2 and earlier is possible, but expensive, because of the narrow build limitation. If necessary, I will provide a patch, but it is easier to mark it as "won't fix" for pre-3.3 versions. Here are tests for charmap decoding. Tests added not only for thi

[issue15379] Charmap decoding of no-BMP characters

2012-07-17 Thread Antoine Pitrou
Antoine Pitrou added the comment: Could you add a test to your patch? Is the issue 3.3-specific? -- nosy: +benjamin.peterson, ezio.melotti, haypo, lemburg, pitrou stage: -> patch review

[issue15379] Charmap decoding of no-BMP characters

2012-07-17 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: Yet another inconsistency in the charmap codec.
>>> import codecs
>>> codecs.charmap_decode(b'\x00', 'strict', '\U0002000B')
('𠀋', 1)
>>> codecs.charmap_decode(b'\x00', 'strict', {0: '\U0002000B'})
('𠀋', 1)
>>> codecs.charmap_decode(b'\x00', 'strict', {0: 0x2000B})
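
For comparison, here is a sketch of how the three mapping forms behave once the fix described in the changesets above is in place. It assumes a Python 3.3 or later interpreter and says nothing about the pre-fix output, which is cut off in the submission; the variable names are illustrative only.

```python
import codecs

data = b"\x00"
target = "\U0002000B"  # a non-BMP character (code point 0x2000B)

# Three ways to express the same mapping for byte 0x00.
as_string = codecs.charmap_decode(data, "strict", target)
as_dict_of_str = codecs.charmap_decode(data, "strict", {0: target})
as_dict_of_int = codecs.charmap_decode(data, "strict", {0: 0x2000B})

print(as_string, as_dict_of_str, as_dict_of_int)

# Integers beyond the Unicode range (above 0x10FFFF) are still rejected.
try:
    codecs.charmap_decode(data, "strict", {0: 0x110000})
except TypeError as exc:
    out_of_range_error = exc
else:
    out_of_range_error = None
```

On a fixed interpreter all three calls should decode to the same one-character string, so an integer mapping value is treated consistently with the string forms.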