On Thu, 12 May 2016 07:36 pm, Ned Batchelder wrote:

> The CPython optimization depends on the string having only a single
> reference. A seemingly unrelated change to the code can change the
> performance significantly:
>
> In [1]: %%timeit
>    ...: s = ""
>    ...: for x in xrange(100000):
>    ...:     s = s + str(x)
>    ...:
> 10 loops, best of 3: 33.5 ms per loop
>
> In [2]: %%timeit
>    ...: s = t = ""
>    ...: for x in xrange(100000):
>    ...:     s = t = s + str(x)
>    ...:
> 1 loop, best of 3: 1.57 s per loop

Nice demonstration!

But it is actually even worse than that. The optimization depends on memory
allocation details, which means that some CPython interpreters cannot use it,
depending on the operating system and version. Consequently, reliance on it
can lead, and has led, to embarrassments like this performance bug, which only
affected *some* Windows users.

In 2009, Chris Withers asked for help debugging a problem where Python's
httplib was hundreds of times slower than other tools, like wget and Internet
Explorer:

https://mail.python.org/pipermail/python-dev/2009-August/091125.html

A few weeks later, Simon Cross realised the problem was probably the quadratic
behaviour of repeated string addition:

https://mail.python.org/pipermail/python-dev/2009-September/091582.html

leading to this quote from Antoine Pitrou:

"Given differences between platforms in realloc() performance, it might be
the reason why it goes unnoticed under Linux but degenerates under Windows."

https://mail.python.org/pipermail/python-dev/2009-September/091583.html

and Guido's comment:

"Also agreed that this is an embarrassment."

https://mail.python.org/pipermail/python-dev/2009-September/091592.html

So beware of relying on the CPython string concatenation optimization in
production code!

Here's the tracker issue that added the optimization in the first place:

http://bugs.python.org/issue980695

The feature was controversial at the time (and remains slightly so):

https://mail.python.org/pipermail/python-dev/2004-August/046686.html

My opinion is that it is great for interactive use at the Python prompt, but I
would never use it in code I cared about.


-- 
Steven
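
For comparison, here is a minimal sketch of the usual ways to build a large
string without depending on the concatenation optimization. It is written for
Python 3 (range rather than xrange), and the function names are made up for
illustration; they do not come from the thread. Both alternatives should take
time roughly proportional to the output size on any interpreter, since neither
relies on the string being resized in place.

```python
import io


def build_with_concat(n):
    # The pattern from the quoted timings: fast only when the interpreter
    # can grow the string in place, potentially quadratic otherwise.
    s = ""
    for x in range(n):
        s = s + str(x)
    return s


def build_with_join(n):
    # Hand all the pieces to str.join and let it build the result once.
    return "".join(str(x) for x in range(n))


def build_with_stringio(n):
    # Write the pieces into an in-memory text buffer, then read it back.
    buf = io.StringIO()
    for x in range(n):
        buf.write(str(x))
    return buf.getvalue()
```

All three return the same string, so swapping the loop for join() or a
StringIO buffer should not change behaviour, only how the running time scales.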