On Mon, 15 Jan 2007 03:25:01 +0000, John Nagle wrote: > How naive (in the sense that compiler people use the term) > is the current Python system? For example: > > def foo() : > s = "This is a test" > return(s) > > s2 = foo() > > How many times does the string get copied?
Let's find out. Results below for Python 2.3 -- other versions may vary. >>> def foo(): ... s = "This is a test" ... return s ... >>> def foo2(): ... return "This is a test" ... >>> import dis >>> dis.dis(foo) 2 0 LOAD_CONST 1 ('This is a test') 3 STORE_FAST 0 (s) 3 6 LOAD_FAST 0 (s) 9 RETURN_VALUE 10 LOAD_CONST 0 (None) 13 RETURN_VALUE >>> dis.dis(foo2) 2 0 LOAD_CONST 1 ('This is a test') 3 RETURN_VALUE 4 LOAD_CONST 0 (None) 7 RETURN_VALUE foo and foo2 functions compile to different byte-code. foo does a little more work than foo2, but it is likely to be a trivial difference. >>> s1 = foo() >>> s2 = foo() >>> s1 == s2, s1 is s2 (True, True) So the string "This is a test" within foo is not copied each time the function is called. However, the string "This is a test" is duplicated between foo and foo2 (the two functions don't share the same string instance): >>> s3 = foo2() >>> s3 == s1, s3 is s1 (True, False) > Or, for example: > > s1 = "Test1" > s2 = "Test2" > s3 = "Test3" > s = s1 + s2 + s3 > > Any redundant copies performed, or is that case optimized? I don't believe it is optimized. I believe that in Python 2.5 simple numeric optimizations are performed, so that "x = 1 + 3" would compile to "x = 4", but I don't think that holds for strings. If you are running 2.5, you could find out with dis.dis. > How about this? > > kcount = 1000 > s = '' > for i in range(kcount) : > s += str(i) + ' ' > > Is this O(N) or O(N^2) because of recopying of "s"? That will be O(N**2), except that CPython version 2.4 (or is it 2.5?) can, sometimes, optimize it. Note that this is an implementation detail, and doesn't hold for other Pythons like Jython, IronPython or PyPy. -- Steven D'Aprano -- http://mail.python.org/mailman/listinfo/python-list