poostenr added the comment: Eric, Steven, thank you for your feedback so far.
I am using Windows 7 on an Intel i7. That one particular 6.5 MB file took about 1 minute on my machine. When I ran the same test on Linux with Python 3.5.1, it took about 3 seconds. I was amazed to see a 20x difference.

Steven suggested that this phenomenon might be specific to Windows, and I agree, that is what it looks like. Or is Python doing something in the background?

The Python script is straightforward: a loop reads a line from a CSV file, splits the column values, and saves each value as '<value>' to another file, basically building an SQL statement. I had no issues until I added the enclosing single quotes around the value. Because I can reproduce this performance difference at will by alternating which line I comment out, I believe it cannot be the HDD, AV, or anything else outside the Python script interfering.

I repeated the simplified test that I ran earlier on a Linux system, this time on my Windows system. I don't see anything spectacular. I am just puzzled that using one statement or the other somehow causes such a huge performance impact. I will try some more tests and copy your examples.
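For illustration, the loop described above could be sketched roughly as follows. This is not the actual script: the function name `write_quoted_values`, the sample rows, and the in-memory files standing in for the real CSV and output files are all assumptions made for the example.

```python
import io


def write_quoted_values(src, dst):
    """Copy comma-separated lines from src to dst, wrapping each value in single quotes.

    Hypothetical helper: the real script's names and file handling are not shown
    in this thread.
    """
    for line in src:
        # Split the line into column values (plain split, as described in the post).
        values = line.rstrip("\n").split(",")
        # The formatting step under discussion: enclose each value in single quotes.
        quoted = ["'{0}'".format(value) for value in values]
        dst.write(",".join(quoted) + "\n")


# In-memory stand-ins for the input CSV and the output file (made-up data).
src = io.StringIO("1,abc,2.5\n2,def,3.5\n")
dst = io.StringIO()
write_quoted_values(src, dst)
print(dst.getvalue())
```

Swapping the `"'{0}'".format(value)` expression for `"{0}".format(value)` or `"'%s'" % (value)` gives the variants compared in the simplified test.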
import time

loopcount = 10000000

# Using string value
s = "test 1"
v = "test 1"
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "{0}".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s, start_ms))
print("End {0}: {1}".format(s, end_ms))
print("Diff {0}: {1} ms\n\n".format(s, end_ms - start_ms))
# Start test 1: 1452828394523
# End test 1: 1452828397957
# Diff test 1: 3434 ms

s = "test 2"
v = "test 2"
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "'%s'" % (v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s, start_ms))
print("End {0}: {1}".format(s, end_ms))
print("Diff {0}: {1} ms\n\n".format(s, end_ms - start_ms))
# Start test 2: 1452828397957
# End test 2: 1452828401233
# Diff test 2: 3276 ms

s = "test 3"
v = "test 3"
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "'{0}'".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s, start_ms))
print("End {0}: {1}".format(s, end_ms))
print("Diff {0}: {1} ms\n\n".format(s, end_ms - start_ms))
# Start test 3: 1452828401233
# End test 3: 1452828406320
# Diff test 3: 5087 ms

# Using integer value
s = "test 4"
v = 123456
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "{0}".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s, start_ms))
print("End {0}: {1}".format(s, end_ms))
print("Diff {0}: {1} ms\n\n".format(s, end_ms - start_ms))
# Start test 4: 1452828406320
# End test 4: 1452828411378
# Diff test 4: 5058 ms

s = "test 5"
v = 123456
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "'%s'" % (v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s, start_ms))
print("End {0}: {1}".format(s, end_ms))
print("Diff {0}: {1} ms\n\n".format(s, end_ms - start_ms))
# Start test 5: 1452828411378
# End test 5: 1452828415264
# Diff test 5: 3886 ms

s = "test 6"
v = 123456
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "'{0}'".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s, start_ms))
print("End {0}: {1}".format(s, end_ms))
print("Diff {0}: {1} ms\n\n".format(s, end_ms - start_ms))
# Start test 6: 1452828415264
# End test 6: 1452828421292
# Diff test 6: 6028 ms

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26118>
_______________________________________