Hi all,

the other day I 2to3'ed some code and found it ran much slower in 3.3.0 than 2.7.2. I fixed the problem but in the process of trying to diagnose it I've stumbled upon something weird that I hope someone here can explain to me. In what follows I'm using Python 2.7.2 on 64-bit Windows 7. Suppose I do this:

from timeit import timeit

# find out how the time taken to append a character to the end of a byte
# string depends on the size of the string

results = []
for size in range(0, 10000001, 100000):
    results.append(timeit("y = x + 'a'",
    setup = "x = 'a' * %i" % size, number = 1))

If I plot results against size, what I see is that the time taken increases approximately linearly with the size of the string, with the string of length 10000000 taking about 4 milliseconds. On the other hand, if I replace the statement to be timed with "x = x + 'a'" instead of "y = x + 'a'", the time taken seems to be pretty much independent of size, apart from a few spikes; the string of length 10000000 takes about 4 microseconds.

I get similar results with strings (but not bytes) in 3.3.0. My guess is that this is some kind of optimisation that treats strings as mutable when carrying out operations that result in the original string being discarded. If so it's jolly clever, since it knows when there are other references to the same string:

timeit("x = x + 'a'", setup = "x = y = 'a' * %i" % size, number = 1)
    # grows linearly with size

timeit("x = x + 'a'", setup = "x, y = 'a' * %i", 'a' * %i"
% (size, size), number = 1)
    # stays approximately constant

It also can see through some attempts to fool it:

timeit("x = ('' + x) + 'a'", setup = "x = 'a' * %i" % size, number = 1)
    # stays approximately constant

timeit("x = x*1 + 'a'", setup = "x = 'a' * %i" % size, number = 1)
    # stays approximately constant

Is my guess correct? If not, what is going on? If so, is it possible to explain to a programming noob how the interpreter does this? And is there a reason why it doesn't work with bytes in 3.3?


--
I have made a thing that superficially resembles music:

http://soundcloud.com/eroneity/we-berated-our-own-crapiness
--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to