One of the posters inspired me to profile my newbie script (pasted below). After measuring, I found that the speed of Python, at least in the area where my script works, is surprisingly high.
This is the experiment: the script recreates a folder hierarchy somewhere else and stores compressed versions of the files from the source hierarchy there. (It makes additional backups of the file server's disk at the company where I work onto other disks, compressed to save space.)

The data was: 468 MB, 15057 files, 1568 folders (machine: win2k, Python v2.3.3).

WinRAR v3.20 (ZIP format, normal compression) needed 119 seconds to compress all that. The Python script, running under the profiler, needed, drumroll... 198 seconds, about 1.7 times as long.

Note that the Python script had to laboriously recreate the tree of 1568 folders and create over 15 thousand compressed files, so it actually had more work to do than WinRAR did. The size of the compressed data was basically the same, about 207 MB.

I find it very encouraging that in a real-world application a newbie script written in a very high-level language can come that close to the performance of a "shrinkwrap" pro archiver (WinRAR is an excellent archiver, in both compression and speed). I do realize that this is mainly the result of all the "underlying infrastructure" of Python: in the profile below, about three quarters of the time (147.6 of 198.1 seconds) is spent in zipfile's write(). Great work, guys. Congrats.

The only thing missing from this picture is whether my script could be optimised further (not that I actually need better performance, I'm just curious what the possible solutions could be). Any takers among the experienced guys?

Profiling results:

>>> p3.sort_stats('cumulative').print_stats(40)
Fri Dec 31 01:04:14 2004    p3.tmp

         580543 function calls (568607 primitive calls) in 198.124 CPU seconds

   Ordered by: cumulative time
   List reduced from 69 to 40 due to restriction <40>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.013    0.013  198.124  198.124 profile:0(z3())
        1    0.000    0.000  198.110  198.110 <string>:1(?)
        1    0.000    0.000  198.110  198.110 <interactive input>:1(z3)
        1    1.513    1.513  198.110  198.110 zmtree3.py:26(zmtree)
    15057   14.504    0.001  186.961    0.012 zmtree3.py:7(zf)
    15057  147.582    0.010  148.778    0.010 C:\Python23\lib\zipfile.py:388(write)
    15057   12.156    0.001   12.156    0.001 C:\Python23\lib\zipfile.py:182(__init__)
    32002    7.957    0.000    8.542    0.000 C:\PYTHON23\Lib\ntpath.py:266(isdir)
13826/1890    2.550    0.000    8.143    0.004 C:\Python23\lib\os.py:206(walk)
    30114    3.164    0.000    3.164    0.000 C:\Python23\lib\zipfile.py:483(close)
    60228    1.753    0.000    2.149    0.000 C:\PYTHON23\Lib\ntpath.py:157(split)
    45171    0.538    0.000    2.116    0.000 C:\PYTHON23\Lib\ntpath.py:197(basename)
    15057    1.285    0.000    1.917    0.000 C:\PYTHON23\Lib\ntpath.py:467(abspath)
    33890    0.688    0.000    1.419    0.000 C:\PYTHON23\Lib\ntpath.py:58(join)
   109175    0.783    0.000    0.783    0.000 C:\PYTHON23\Lib\ntpath.py:115(splitdrive)
    15057    0.196    0.000    0.768    0.000 C:\PYTHON23\Lib\ntpath.py:204(dirname)
    33890    0.433    0.000    0.731    0.000 C:\PYTHON23\Lib\ntpath.py:50(isabs)
    15057    0.544    0.000    0.632    0.000 C:\PYTHON23\Lib\ntpath.py:438(normpath)
    32002    0.431    0.000    0.585    0.000 C:\PYTHON23\Lib\stat.py:45(S_ISDIR)
    15057    0.555    0.000    0.555    0.000 C:\Python23\lib\zipfile.py:149(FileHeader)
    15057    0.483    0.000    0.483    0.000 C:\Python23\lib\zipfile.py:116(__init__)
      151    0.002    0.000    0.435    0.003 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\framework\winout.py:171(write)
      151    0.002    0.000    0.432    0.003 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\framework\winout.py:489(write)
      151    0.013    0.000    0.430    0.003 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\framework\winout.py:461(HandleOutput)
       76    0.087    0.001    0.405    0.005 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\framework\winout.py:430(QueueFlush)
    15057    0.239    0.000    0.340    0.000 C:\Python23\lib\zipfile.py:479(__del__)
    15057    0.157    0.000    0.157    0.000 C:\Python23\lib\zipfile.py:371(_writecheck)
    32002    0.154    0.000    0.154    0.000 C:\PYTHON23\Lib\stat.py:29(S_IFMT)
       76    0.007    0.000    0.146    0.002 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\framework\winout.py:262(dowrite)
       76    0.007    0.000    0.137    0.002 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\scintilla\formatter.py:221(OnStyleNeeded)
       76    0.011    0.000    0.118    0.002 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\framework\interact.py:197(Colorize)
       76    0.110    0.001    0.112    0.001 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\scintilla\control.py:69(SCIInsertText)
       76    0.079    0.001    0.081    0.001 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\scintilla\control.py:333(GetTextRange)
       76    0.018    0.000    0.020    0.000 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\scintilla\control.py:296(SetSel)
       76    0.006    0.000    0.018    0.000 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\scintilla\document.py:149(__call__)
      227    0.003    0.000    0.012    0.000 C:\Python23\lib\Queue.py:172(get_nowait)
       76    0.007    0.000    0.011    0.000 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\framework\interact.py:114(ColorizeInteractiveCode)
      532    0.011    0.000    0.011    0.000 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\scintilla\control.py:330(GetTextLength)
       76    0.001    0.000    0.010    0.000 C:\PYTHON23\lib\site-packages\Pythonwin\pywin\scintilla\view.py:256(OnBraceMatch)
     1888    0.009    0.000    0.009    0.000 C:\PYTHON23\Lib\ntpath.py:245(islink)

---

Script:

#!/usr/bin/python

import os
import sys
from zipfile import ZipFile, ZIP_DEFLATED

def zf(sfpath, targetdir):
    if (sys.platform[:3] == 'win'):
        tgfpath=sfpath[2:]    # strip the drive prefix, e.g. 'C:'
    else:
        tgfpath=sfpath
    zfdir=os.path.dirname(os.path.abspath(targetdir) + tgfpath)
    zfpath=zfdir + os.path.sep + os.path.basename(tgfpath) + '.zip'
    if(not os.path.isdir(zfdir)):
        os.makedirs(zfdir)
    archive=ZipFile(zfpath, 'w', ZIP_DEFLATED)
    sfile=open(sfpath,'rb')
    zfname=os.path.basename(tgfpath)
    archive.write(sfpath, os.path.basename(zfpath), ZIP_DEFLATED)
    archive.close()
    ssize=os.stat(sfpath).st_size    # size of the original file
    zsize=os.stat(zfpath).st_size    # size of the resulting .zip
    return (ssize,zsize)

def zmtree(sdir,tdir):
    n=0
    ssize=0
    zsize=0
    sys.stdout.write('\n ')
    for root, dirs, files in os.walk(sdir):
        for file in files:
            res=zf(os.path.join(root,file),tdir)
            ssize+=res[0]
            zsize+=res[1]
            n=n+1
            #sys.stdout.write('.')
            if (n % 200 == 0):    # progress report every 200 files
                print " %.2fM (%.2fM)" % (ssize/1048576.0, zsize/1048576.0)
                #sys.stdout.write(' ')
    return (n, ssize, zsize)

if __name__=="__main__":
    if len(sys.argv) == 3:
        if(os.path.isdir(sys.argv[1]) and os.path.isdir(sys.argv[2])):
            (n,ssize,zsize)=zmtree(os.path.abspath(sys.argv[1]),os.path.abspath(sys.argv[2]))
            print "\n\n Summary:\n Number of files compressed: %d\n Total size of original files: %.2fM\n \
 Total size of compressed files: %.2fM" % (n, ssize/1048576.0, zsize/1048576.0)
            sys.exit(0)
        else:
            print "Incorrect arguments."
            if (not os.path.isdir(sys.argv[1])):
                print sys.argv[1] + " is not directory."
            if (not os.path.isdir(sys.argv[2])):
                print sys.argv[2] + " is not directory."
    print "\n Usage:\n " + sys.argv[0] + " source-directory target-directory"

--
It's a man's life in a Python Programming Association.
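For anyone who wants to repeat this kind of measurement on their own data, here is a minimal sketch of the same idea: one .zip per file, run under a profiler, report sorted by cumulative time. It is not the script above. It assumes a current Python 3 and uses cProfile rather than the profile module from my 2.3 session, the names compress_tree and profile_compress are made up for illustration, and it mirrors paths relative to the source directory (creating empty folders too) instead of appending the drive-stripped absolute path to the target.

```python
import cProfile
import io
import os
import pstats
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED


def compress_tree(src_dir, dst_dir):
    """Mirror src_dir under dst_dir, storing each file as its own .zip.

    Hypothetical helper: paths are mirrored relative to src_dir.
    """
    count = 0
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        out_dir = os.path.normpath(os.path.join(dst_dir, rel))
        os.makedirs(out_dir, exist_ok=True)
        for name in files:
            zip_path = os.path.join(out_dir, name + ".zip")
            # One archive per source file, deflate-compressed.
            with ZipFile(zip_path, "w", ZIP_DEFLATED) as archive:
                archive.write(os.path.join(root, name), arcname=name)
            count += 1
    return count


def profile_compress(src_dir, dst_dir, limit=10):
    """Run compress_tree under cProfile; return (file count, report text)."""
    profiler = cProfile.Profile()
    count = profiler.runcall(compress_tree, src_dir, dst_dir)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Same ordering as the report above: cumulative time, top `limit` rows.
    stats.sort_stats("cumulative").print_stats(limit)
    return count, out.getvalue()


if __name__ == "__main__":
    # Tiny throwaway tree so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as src, \
            tempfile.TemporaryDirectory() as dst:
        os.makedirs(os.path.join(src, "sub"))
        for rel in ("a.txt", os.path.join("sub", "b.txt")):
            with open(os.path.join(src, rel), "w") as fh:
                fh.write("hello " * 1000)
        n, report = profile_compress(src, dst)
        print(n, "files compressed")
        print(report)
```

On a real tree, the interesting part is how the cumulative time splits between the zipfile calls and the path handling around them, which is the comparison the report above makes.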