poostenr added the comment:

Eric, Steven, thank you for your feedback so far.

I am using Windows 7 on an Intel i7.
That one particular 6.5 MB file took about 1 minute on my machine.
When I ran the same test on Linux with Python 3.5.1, it took about 3 seconds.
I was amazed to see a 20x difference.

Steven suggested that this phenomenon might be specific to Windows.
I agree, that is what it looks like. Or is Python doing something in
the background?

The Python script is straightforward: a loop reads a line from a CSV file,
splits the column values, and saves each value as '<value>' to another file,
basically building an SQL statement.
I had no issues until I added the enclosing single quotes around the value.
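
A loop of the kind described above might look roughly like the following
sketch. It is not the actual script: the function name quote_values, the comma
delimiter, the output layout, and the in-memory files are all assumptions made
for illustration, and it does not escape quotes embedded in the values.

```python
import io


def quote_values(src, dst, quoted=True):
    """Copy each line of src to dst as a comma-separated value list.

    Hypothetical helper: with quoted=True every value is wrapped in
    single quotes, otherwise it is written unchanged.
    Returns the number of lines written.
    """
    rows = 0
    for line in src:
        # Assumes a plain comma-delimited line with no quoted fields.
        columns = line.rstrip("\n").split(",")
        if quoted:
            values = ["'{0}'".format(v) for v in columns]
        else:
            values = ["{0}".format(v) for v in columns]
        dst.write(", ".join(values) + "\n")
        rows += 1
    return rows


# In-memory files stand in for the real CSV input and output files.
src = io.StringIO("1,abc,2.5\n2,def,3.5\n")
dst = io.StringIO()
count = quote_values(src, dst)
print(count)
print(dst.getvalue())
```

Switching the quoted argument corresponds to alternating between the two
formatting lines, and using io.StringIO keeps the disk out of the measurement.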

Because I can reproduce this performance difference at will by alternating
which line I comment out, I believe it cannot be the HDD, AV, or something
else outside the Python script interfering.

I repeated the simplified test that I ran earlier on a Linux system, this
time on my Windows system.
I don't see anything spectacular.
I am just puzzled that using one statement or the other somehow causes such
a huge performance impact.

I will try some more tests and copy your examples.

import time
loopcount = 10000000

# Using string value
s="test 1"
v="test 1"
start_ms = int(round(time.time() * 1000))
for x in range (loopcount):
    y = "{0}".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End   {0}: {1}".format(s,end_ms))
print("Diff  {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 1: 1452828394523
# End   test 1: 1452828397957
# Diff  test 1: 3434 ms


s="test 2"
v="test 2"
start_ms = int(round(time.time() * 1000))
for x in range (loopcount):
    y = "'%s'" % (v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End   {0}: {1}".format(s,end_ms))
print("Diff  {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 2: 1452828397957
# End   test 2: 1452828401233
# Diff  test 2: 3276 ms


s="test 3"
v="test 3"
start_ms = int(round(time.time() * 1000))
for x in range (loopcount):
    y = "'{0}'".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End   {0}: {1}".format(s,end_ms))
print("Diff  {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 3: 1452828401233
# End   test 3: 1452828406320
# Diff  test 3: 5087 ms

# Using integer value
s="test 4"
v=123456
start_ms = int(round(time.time() * 1000))
for x in range (loopcount):
    y = "{0}".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End   {0}: {1}".format(s,end_ms))
print("Diff  {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 4: 1452828406320
# End   test 4: 1452828411378
# Diff  test 4: 5058 ms


s="test 5"
v=123456
start_ms = int(round(time.time() * 1000))
for x in range (loopcount):
    y = "'%s'" % (v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End   {0}: {1}".format(s,end_ms))
print("Diff  {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 5: 1452828411378
# End   test 5: 1452828415264
# Diff  test 5: 3886 ms

s="test 6"
v=123456
start_ms = int(round(time.time() * 1000))
for x in range (loopcount):
    y = "'{0}'".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End   {0}: {1}".format(s,end_ms))
print("Diff  {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 6: 1452828415264
# End   test 6: 1452828421292
# Diff  test 6: 6028 ms
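
The same six measurements can be written more compactly with the standard
timeit module, as in the sketch below. timeit uses time.perf_counter by
default, which has finer resolution than time.time(), and taking the minimum of
several repeats reduces noise from other processes. The names STATEMENTS,
NUMBER and run are made up for this sketch, and the loop count is an assumption
chosen to keep the run short, so the numbers are not directly comparable to the
results above.

```python
import timeit

# Assumption: a smaller loop count than the 10000000 used in the tests above.
NUMBER = 100000

# The three formatting statements from the tests above, as source strings.
STATEMENTS = [
    '''"{0}".format(v)''',
    '''"'%s'" % (v)''',
    '''"'{0}'".format(v)''',
]


def run(values=("test", 123456), number=NUMBER, repeat=3):
    """Time every statement for every value.

    Returns a dict mapping (statement, repr(value)) to the best time in
    seconds for `number` executions.
    """
    results = {}
    for v in values:
        for stmt in STATEMENTS:
            timings = timeit.repeat(stmt, globals={"v": v},
                                    number=number, repeat=repeat)
            results[(stmt, repr(v))] = min(timings)
    return results


for (stmt, value), best in run().items():
    print("{0:<22} v={1:<8} best: {2:.1f} ms".format(stmt, value, best * 1000))
```

The globals argument of timeit.repeat requires Python 3.5 or later.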

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26118>
_______________________________________