Mark Dickinson wrote: > I have a simple 192-line Python script that begins with the line: > > dummy0 = 47 > > The script runs in less than 2.5 seconds. The variable dummy0 is never > referenced again, directly or indirectly, by the rest of the script. > > Here's the surprise: if I remove or comment out this first line, the > script takes more than 15 seconds to run. So it appears that adding a > redundant line produces a spectacular six-fold increase in speed!
Thanks for your post. It is cute, brilliant, and interesting. I haven't had time to investigate but can point at the likely cause. All of the global names are stored in a single hash table. Search time is dictated by two factors, the sparseness of the hash table and the distribution of hash values. With respect to sparseness, whenever that table becomes 2/3 full, it grows by a factor of four and becomes only 1/6 full (leading to many fewer collisions). With respect to distribution, it should be noted that string hash values are decidely non-random and your variable names likely congested consecutive spaces in a nearly full table (resulting in seven times as many search probes to find a global value). When the extra value was added, it likely resized the table four-fold and redistributed the hash values into fewer consecutive positions. Raymond P.S. To analyze it further, start with something like this: >>> len(set(hash('dummy%d' %i) & 31 for i in xrange(29))) 26 >>> len(set(hash('dummy%d' %i) & 127 for i in xrange(29))) 29 -- http://mail.python.org/mailman/listinfo/python-list