John> How naive (in the sense that compiler people use the term) is the John> current Python system? For example:
John> def foo() : John> s = "This is a test" John> return(s) John> s2 = foo() John> How many times does the string get copied? Never. s and s2 just refer to the same object (strings are immutable). Executing this: def foo() : print id("This is a test") s = "This is a test" print id(s) return(s) s2 = foo() print id(s2) prints the same value three times. John> Or, for example: John> s1 = "Test1" John> s2 = "Test2" John> s3 = "Test3" John> s = s1 + s2 + s3 John> Any redundant copies performed, or is that case optimized? Not optimized. You can see that using the dis module: 4 0 LOAD_CONST 1 ('Test1') 3 STORE_FAST 0 (s1) 5 6 LOAD_CONST 2 ('Test2') 9 STORE_FAST 1 (s2) 6 12 LOAD_CONST 3 ('Test3') 15 STORE_FAST 2 (s3) 7 18 LOAD_FAST 0 (s1) 21 LOAD_FAST 1 (s2) 24 BINARY_ADD 25 LOAD_FAST 2 (s3) 28 BINARY_ADD 29 STORE_FAST 3 (s) 32 LOAD_CONST 0 (None) 35 RETURN_VALUE The BINARY_ADD opcode creates a new string. John> How about this? John> kcount = 1000 John> s = '' John> for i in range(kcount) : John> s += str(i) + ' ' John> Is this O(N) or O(N^2) because of recopying of "s"? O(N). Here's a demonstration of that: #!/usr/bin/env python from __future__ import division def foo(kcount): s = '' for i in xrange(kcount) : s += str(i) + ' ' import time for i in xrange(5,25): t = time.time() foo(2**i) t = time.time() - t print 2**i, t, t/2**i On my laptop t roughly doubles for each iteration and prints around 5e-06 for t/2**i in all cases. Skip -- http://mail.python.org/mailman/listinfo/python-list