Hello, I am working on an article about Python string formatting. As part of the article, I am researching the different forms of Python string formatting.
While researching string interpolation (i.e. the % operator), I noticed something odd related to string lengths.

Given the following two functions:

    def simple_interpolation_constant_short_string():
        return "Hello %s" % "World!"

    def simple_interpolation_constant_long_string():
        return "Hello %s. I am a very long string used for research" % "World!"

Let's look at the bytecode generated for them using the dis module.

The first function produces the following bytecode:

      9           0 LOAD_CONST               3 ('Hello World!')
                  2 RETURN_VALUE

This looks normal: the Python compiler appears to fold the expression into a single constant, removing the need for string interpolation at run time.

However, the output for the second function caught my eye:

     12           0 LOAD_CONST               1 ('Hello %s. I am a very long string used for research')
                  2 LOAD_CONST               2 ('World!')
                  4 BINARY_MODULO
                  6 RETURN_VALUE

This was not optimized by the compiler; ordinary string interpolation is performed at run time.

Based on some more testing, it appears that no optimization is done when the resulting string would be longer than 20 characters, as these examples show:

    def expected_result():
        return "abcdefghijklmnopqrs%s" % "t"

Bytecode:

     15           0 LOAD_CONST               3 ('abcdefghijklmnopqrst')
                  2 RETURN_VALUE

    def abnormal_result():
        return "abcdefghijklmnopqrst%s" % "u"

Bytecode:

     18           0 LOAD_CONST               1 ('abcdefghijklmnopqrst%s')
                  2 LOAD_CONST               2 ('u')
                  4 BINARY_MODULO
                  6 RETURN_VALUE

I am using Python 3.6.3.

I am curious as to why this happens. Can anyone shed further light on this behaviour?
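For anyone who wants to reproduce this, here is a minimal sketch of how the check could be scripted rather than read off the dis listing by eye. The helper names (opnames, uses_runtime_modulo) and the shortened function names are my own, not part of the dis module. The result depends on the interpreter version: the output quoted above is from 3.6.3, and other versions may compile these functions differently, so the script prints what it finds instead of assuming a particular answer.

```python
import dis
import sys


def short_result():
    return "Hello %s" % "World!"


def long_result():
    return "Hello %s. I am a very long string used for research" % "World!"


def opnames(func):
    """Return the list of opcode names in func's compiled bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]


def uses_runtime_modulo(func):
    """Return True if the % operation is still executed at run time.

    BINARY_MODULO is the opcode name up to Python 3.10; from 3.11 on,
    binary operators are compiled to the generic BINARY_OP instead.
    """
    names = opnames(func)
    return "BINARY_MODULO" in names or "BINARY_OP" in names


if __name__ == "__main__":
    # Print the version first, since the outcome is version-dependent.
    print(sys.version)
    for func in (short_result, long_result):
        print(func.__name__, opnames(func), uses_runtime_modulo(func))
```

On an interpreter that behaves like the 3.6.3 output above, I would expect uses_runtime_modulo to report False for the short string and True for the long one.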