Hi all,
the other day I 2to3'ed some code and found it ran much slower in 3.3.0
than 2.7.2. I fixed the problem but in the process of trying to diagnose
it I've stumbled upon something weird that I hope someone here can
explain to me. In what follows I'm using Python 2.7.2 on 64-bit Windows
7. Suppose I do this:
from timeit import timeit
# find out how the time taken to append a character to the end of a byte
# string depends on the size of the string
results = []
for size in range(0, 10000001, 100000):
results.append(timeit("y = x + 'a'",
setup = "x = 'a' * %i" % size, number = 1))
If I plot results against size, what I see is that the time taken
increases approximately linearly with the size of the string, with the
string of length 10000000 taking about 4 milliseconds. On the other
hand, if I replace the statement to be timed with "x = x + 'a'" instead
of "y = x + 'a'", the time taken seems to be pretty much independent of
size, apart from a few spikes; the string of length 10000000 takes about
4 microseconds.
I get similar results with strings (but not bytes) in 3.3.0. My guess is
that this is some kind of optimisation that treats strings as mutable
when carrying out operations that result in the original string being
discarded. If so it's jolly clever, since it knows when there are other
references to the same string:
timeit("x = x + 'a'", setup = "x = y = 'a' * %i" % size, number = 1)
# grows linearly with size
timeit("x = x + 'a'", setup = "x, y = 'a' * %i", 'a' * %i"
% (size, size), number = 1)
# stays approximately constant
It also can see through some attempts to fool it:
timeit("x = ('' + x) + 'a'", setup = "x = 'a' * %i" % size, number = 1)
# stays approximately constant
timeit("x = x*1 + 'a'", setup = "x = 'a' * %i" % size, number = 1)
# stays approximately constant
Is my guess correct? If not, what is going on? If so, is it possible to
explain to a programming noob how the interpreter does this? And is
there a reason why it doesn't work with bytes in 3.3?
--
I have made a thing that superficially resembles music:
http://soundcloud.com/eroneity/we-berated-our-own-crapiness
--
http://mail.python.org/mailman/listinfo/python-list