Hi all, I have a text file which is only 32 MB in size and consists of lines of the following type (columns are fixed):
==
Header text 1 line ...
01-Jan-2006 0055 145.069 -16.0449 83.2246 84.2835 499.14680 0.074029965
01-Jan-2006 0065 15.069 -1.0449 83.2246 84.2835 499.14680 12.074029965
...
12-Dec-2006 1255 145.069 23.0449 3.2246 4.2835 49.140 0.74029965
...
==

I have 3 questions:

1. Why is my translation (read_slow) of the IDL code so damn slow (IDL: 13 sec, Python: 2 min 16 sec), although both IDL and Python consume about 40 MB?
2. Why is my faster version (read_fast) (13 sec) so memory hungry (it takes 200 MB)?
2.1 Why is my second-fastest version (read_second_fast) (16 sec) still memory hungry?
3. What do I need to do to get the speed of IDL and the memory footprint of IDL (in that case 40 MB)?

#convdate converts the date in the first column (e.g. 12-Dec-2006) into day of year
#convtime does something else

==
import fileinput
import numpy as np
import datetime
import time
from StringIO import StringIO

def read_slow(file):
    count=max(enumerate(open(file)))[0]
    erg=np.zeros((count,10),dtype=np.float64)
    convdate= lambda x: time.strptime(x,"%d-%b-%Y").tm_yday
    convtime= lambda x: np.int(np.float64(x)*1.0e-1)
    i=0
    with open(file) as infile:
        #read first header line
        infile.readline()
        for line in infile:
            tmp=np.genfromtxt(StringIO(line),\
                              dtype=np.float64,\
                              converters={0:convdate, 1:convtime})
            #not sure if it does the right thing here:
            erg[i,:]=tmp
            i=i+1
    infile.close()
    return erg
==
def read_fast(file):
    convdate= lambda x: time.strptime(x,"%d-%b-%Y").tm_yday
    convtime= lambda x: np.int(np.float64(x)*1.0e-1)
    with open(file) as infile:
        erg=np.genfromtxt(infile, autostrip=True,skip_header=1,\
                          dtype=np.float64,\
                          converters={0:convdate,1:convtime})
    infile.close()
    return erg
==
==
def read_second_fast(file):
    convdate= lambda x: time.strptime(x,"%d-%b-%Y").tm_yday
    convtime= lambda x: np.int(np.float64(x)*1.0e-1)
    erg=np.loadtxt(file,skiprows=1,\
                   dtype=np.float64,\
                   converters={0:convdate,1:convtime})
    return erg
==

Thanks for all the help.

By the way: a colleague told me my code is poorly written and more or less unreadable and unmaintainable because of the use of lambda. I am just learning, but is his observation true?
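
A minimal sketch related to question 3 and to the lambda remark, under stated assumptions rather than as a measured solution: the two converters are written as named functions, and the rows are parsed by hand into an array that is allocated once. The names day_of_year, scaled_time, read_manual and SAMPLE are hypothetical, the column count of 8 is taken from the sample rows above (the original code allocates 10), and the sketch targets Python 3, unlike the Python 2 code in the post. It has not been timed against IDL.

```python
import io
import time

import numpy as np

# Two sample rows in the layout shown in the post, plus one header line.
SAMPLE = """Header text
01-Jan-2006 0055 145.069 -16.0449 83.2246 84.2835 499.14680 0.074029965
12-Dec-2006 1255 145.069 23.0449 3.2246 4.2835 49.140 0.74029965
"""


def day_of_year(text):
    """Convert a date such as '12-Dec-2006' into its day of the year."""
    return time.strptime(text, "%d-%b-%Y").tm_yday


def scaled_time(text):
    """Mirror the original convtime: scale by 0.1 and truncate to an int."""
    return int(float(text) * 1.0e-1)


def read_manual(stream, ncols=8):
    """Parse a seekable text stream into a preallocated float64 array.

    ncols=8 is an assumption based on the sample rows.
    """
    # First pass: skip the header and count non-empty data rows.
    stream.readline()
    nrows = sum(1 for line in stream if line.strip())
    erg = np.empty((nrows, ncols), dtype=np.float64)

    # Second pass: rewind, skip the header again, and fill row by row.
    stream.seek(0)
    stream.readline()
    i = 0
    for line in stream:
        fields = line.split()
        if not fields:
            continue
        erg[i, 0] = day_of_year(fields[0])
        erg[i, 1] = scaled_time(fields[1])
        erg[i, 2:] = [float(x) for x in fields[2:]]
        i += 1
    return erg


erg = read_manual(io.StringIO(SAMPLE))
print(erg.shape)
```

With a real file, the same function could be called as read_manual(open(path)) inside a with block. Only one line and the final array are held at a time, which is the property the sketch is meant to illustrate. Whether it approaches the IDL timing would need to be measured.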