[issue25709] Problem with string concatenation and utf-8 cache.

2016-02-12 Thread Ned Deily
Changes by Ned Deily: -- stage: patch review -> resolved

[issue25709] Problem with string concatenation and utf-8 cache.

2016-02-12 Thread Georg Brandl
Georg Brandl added the comment: Don't bother. I can do that once 3.3.7 is released.

[issue25709] Problem with string concatenation and utf-8 cache.

2016-02-12 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: b9c8f1c80f47 added a new head. Should we merge 3.3 -> 3.4 -> 3.5 -> default?

[issue25709] Problem with string concatenation and utf-8 cache.

2016-02-11 Thread Georg Brandl
Georg Brandl added the comment: Actually I prefer Greg to Gerg, so it's only half bad. :D

[issue25709] Problem with string concatenation and utf-8 cache.

2016-02-11 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > I wait only Greg's approving for 3.3. If I'll not get it in a day, I'll > commit the patch for 3.4+. Maybe it was my fault. I made a mistake in Georg's name.

[issue25709] Problem with string concatenation and utf-8 cache.

2016-02-11 Thread STINNER Victor
STINNER Victor added the comment: 2016-02-11 18:23 GMT+01:00 Georg Brandl: > > Georg Brandl added the comment: > > Backpicked to 3.3. Sorry for the wait. Good, this bugfix is useful :-)

[issue25709] Problem with string concatenation and utf-8 cache.

2016-02-11 Thread Georg Brandl
Georg Brandl added the comment: Backpicked to 3.3. Sorry for the wait. -- resolution: -> fixed status: open -> closed

[issue25709] Problem with string concatenation and utf-8 cache.

2016-02-11 Thread Roundup Robot
Roundup Robot added the comment: New changeset b9c8f1c80f47 by Serhiy Storchaka in branch '3.3': Issue #25709: Fixed problem with in-place string concatenation and utf-8 cache. https://hg.python.org/cpython/rev/b9c8f1c80f47

[issue25709] Problem with string concatenation and utf-8 cache.

2016-02-11 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- assignee: serhiy.storchaka -> georg.brandl

[issue25709] Problem with string concatenation and utf-8 cache.

2015-12-06 Thread Roundup Robot
Roundup Robot added the comment: New changeset 376b100107ba by Serhiy Storchaka in branch '3.5': Issue #25709: Fixed problem with in-place string concatenation and utf-8 cache. https://hg.python.org/cpython/rev/376b100107ba

[issue25709] Problem with string concatenation and utf-8 cache.

2015-12-05 Thread Larry Hastings
Larry Hastings added the comment: I cherry-picked this for 3.5.1.

[issue25709] Problem with string concatenation and utf-8 cache.

2015-12-03 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- versions: -Python 3.4, Python 3.5, Python 3.6

[issue25709] Problem with string concatenation and utf-8 cache.

2015-12-02 Thread STINNER Victor
STINNER Victor added the comment: > New changeset 67718032badb by Serhiy Storchaka in branch '3.4': Thanks.

[issue25709] Problem with string concatenation and utf-8 cache.

2015-12-02 Thread Roundup Robot
Roundup Robot added the comment: New changeset 67718032badb by Serhiy Storchaka in branch '3.4': Issue #25709: Fixed problem with in-place string concatenation and utf-8 cache. https://hg.python.org/cpython/rev/67718032badb New changeset a0e2376768dc by Serhiy Storchaka in branch '3.5': Issue #2

[issue25709] Problem with string concatenation and utf-8 cache.

2015-12-02 Thread STINNER Victor
STINNER Victor added the comment: Please commit right now to 3.4+. Backport to 3.3 can be done later.

[issue25709] Problem with string concatenation and utf-8 cache.

2015-12-02 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I wait only Greg's approving for 3.3. If I'll not get it in a day, I'll commit the patch for 3.4+.

[issue25709] Problem with string concatenation and utf-8 cache.

2015-12-02 Thread Larry Hastings
Larry Hastings added the comment: Is this going in soon? I want to cherry-pick this for 3.5.1, which I tag in about 80 hours.

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Georg, I am asking for this fix to be applied to 3.3. -- nosy: +georg.brandl versions: +Python 3.3

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-24 Thread STINNER Victor
STINNER Victor added the comment: issue25709_4.patch now looks good to me, but I added some minor comments on the review.

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > Why do strings cache their UTF-8 encoding? Mainly for compatibility with the existing C API. The common way to parse arguments in a function implemented in C is to use the special argument-parsing API: PyArg_ParseTuple, PyArg_ParseTupleAndKeywords, or PyArg_Par

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: On 24.11.2015 02:30, Steven D'Aprano wrote: > > Steven D'Aprano added the comment: > > On Mon, Nov 23, 2015 at 09:48:46PM +, STINNER Victor wrote: > >> * the string has a cached UTF-8 byte string (ex: int(s) was called before >> the resize) > > Why d

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Fixed one more bug (thanks again, Victor). The test is improved: it no longer relies on an implementation detail of a particular builtin. -- Added file: http://bugs.python.org/file41146/issue25709_4.patch

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread Eryk Sun
Eryk Sun added the comment: > Why do strings cache their UTF-8 encoding? Strings also cache the wide-string representation. For example:

    from ctypes import *
    s = '\241\242\243'
    pythonapi.PyUnicode_AsUnicodeAndSize(py_object(s), None)
    pythonapi.PyUnicode_AsUTF8AndSize(py_object(s

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread STINNER Victor
STINNER Victor added the comment: Steven D'Aprano added the comment: > the problem with caches is that you run the risk of the cache being out of date. Since strings are immutable, it's not a big deal. We control where strings are modified (unicodeobject.c).

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread Steven D'Aprano
Steven D'Aprano added the comment: On Mon, Nov 23, 2015 at 09:48:46PM +, STINNER Victor wrote: > * the string has a cached UTF-8 byte string (ex: int(s) was called before the > resize) Why do strings cache their UTF-8 encoding? I presume that some of Python's internals rely on the UTF-8 e

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The updated patch fixes a bug found by Victor and addresses his other comments. Many thanks, Victor! -- Added file: http://bugs.python.org/file41142/issue25709_3.patch

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread STINNER Victor
STINNER Victor added the comment: " and wow this is badly broken " I mean the current code is badly broken. The bug is that sometimes, when a string is resized (which doesn't make sense, strings are immutable, right? :-D), the cached UTF-8 string can become corrupted (old pointer not update
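
A rough way to check an interpreter for the symptom described above is to keep the UTF-8 cache populated while a string grows by in-place concatenation, then compare the cached bytes with a fresh encoding. This is only a sketch, not the test from the patch: `PyUnicode_AsUTF8AndSize` is the real C API call mentioned elsewhere in this thread, but the helper names `cached_utf8` and `grow_and_check` are made up here, and whether the append really happens in place depends on the interpreter.

```python
import ctypes

# PyUnicode_AsUTF8AndSize is part of the CPython C API; calling it fills the
# string's cached UTF-8 representation. CPython only (uses ctypes.pythonapi).
_as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8AndSize
_as_utf8.argtypes = [ctypes.py_object, ctypes.POINTER(ctypes.c_ssize_t)]
_as_utf8.restype = ctypes.c_char_p


def cached_utf8(s):
    """Return the UTF-8 bytes CPython caches for s (hypothetical helper).

    Assumes s contains no NUL characters, because c_char_p stops at NUL.
    """
    size = ctypes.c_ssize_t()
    data = _as_utf8(s, ctypes.byref(size))
    return data[:size.value]


def grow_and_check(rounds=50):
    """Append to a string with its UTF-8 cache filled; return bad rounds."""
    mismatches = []
    s = ''
    for i in range(rounds):
        # Greek alpha is non-ASCII, so the UTF-8 cache is a separate buffer.
        s += '\u03b1'
        if cached_utf8(s) != s.encode('utf-8'):
            mismatches.append(i)
    return mismatches


if __name__ == '__main__':
    print(grow_and_check())
```

On an interpreter that includes the fix, the returned list should be empty; a non-empty list would mean the cached bytes and the string contents disagree.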

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread Larry Hastings
Larry Hastings added the comment: I read some comments here and on the patches. Serhiy's patch adds some code and Victor says you can't call that macro on this object and wow this is badly broken. Can someone explain in simpler terms what's so broken, exactly?

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread random832
random832 added the comment: > unicode_modifiable in Objects/unicodeobject.c should return 0 if there's > cached PyUnicode_UTF8 data. In this case PyUnicode_Append won't operate in > place but instead concatenate a new string. Shouldn't it still operate in place but clear it? Operating in plac
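
The two options weighed in this exchange (refuse to modify in place when a cache exists, or modify in place and clear the cache) can be compared with a toy model. This is purely illustrative: `ToyStr` and its methods are hypothetical names, and none of this is how `unicodeobject.c` is actually written.

```python
class ToyStr:
    """Toy stand-in for a string object with a lazily cached UTF-8 form."""

    def __init__(self, text):
        self._chars = list(text)
        self._utf8 = None  # cache, filled on first request

    def utf8(self):
        # Encode once and remember the result.
        if self._utf8 is None:
            self._utf8 = ''.join(self._chars).encode('utf-8')
        return self._utf8

    def append_buggy(self, other):
        # Resizes in place but leaves the stale cache behind.
        self._chars.extend(other)
        return self

    def append_clearing(self, other):
        # Resizes in place and drops the cache so it is rebuilt on demand.
        self._chars.extend(other)
        self._utf8 = None
        return self

    def append_copying(self, other):
        # Declines to modify in place when a cache exists; builds a new object.
        if self._utf8 is not None:
            return ToyStr(''.join(self._chars) + other)
        self._chars.extend(other)
        return self


if __name__ == '__main__':
    t = ToyStr('ab')
    t.utf8()
    print(t.append_buggy('c').utf8())     # stale: still the old encoding
    t = ToyStr('ab')
    t.utf8()
    print(t.append_clearing('c').utf8())  # rebuilt after the append
```

Both non-buggy variants give a correct encoding afterwards; the difference is that the copying variant pays for a new object, while the clearing variant keeps the in-place append and pays for re-encoding later.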

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread Antoine Pitrou
Antoine Pitrou added the comment: 3.3 is presumably in security mode. Anyone using it would have had to live with the bug for a long time already.

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread STINNER Victor
STINNER Victor added the comment: I reviewed issue25709_2.patch. > It would be good to get this in 3.4.4. Since it's a major bug in the Unicode implementation, it may be worth fixing in Python 3.3. The bug was introduced in Python 3.3 by PEP 393.

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread Eryk Sun
Eryk Sun added the comment: Serhiy, when does sharing UTF-8 data occur in a compact object? It has to be ASCII since non-ASCII UTF-8 isn't sharable, but PyASCIIObject doesn't have the utf8 field. So it has to be a PyCompactUnicodeObject. But isn't ASCII always allocated as a PyASCIIObject? I n

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Added a test that does not use pickle. -- nosy: +larry priority: high -> release blocker Added file: http://bugs.python.org/file41141/issue25709_2.patch

[issue25709] Problem with string concatenation and utf-8 cache.

2015-11-23 Thread Terry J. Reedy
Terry J. Reedy added the comment: It would be good to get this in 3.4.4. -- components: +Library (Lib) -IDLE nosy: +benjamin.peterson, ezio.melotti, haypo, lemburg, pitrou -kbk, roger.serwy title: greek alphabet bug it is very disturbing... -> Problem with string concatenation and utf-8