Thanks to everyone for the excellent suggestions. With them I was able to achieve the results below, but I still cannot get past a file size of 6 million rows: none of the posted solutions crosses that limit. I would appreciate any suggestions on avoiding the memory error.
>>> Data size 999999 Elapsed 31.60352213
>>> ================================ RESTART ================================
>>> Data size 1999999 Elapsed 63.4050884573
>>> ================================ RESTART ================================
>>> Data size 4999999 Elapsed 177.888915777
>>> Data size 5999999
Traceback (most recent call last):
  File "C:/Documents/some.py", line 27, in <module>
    read_test()
  File "C:/Documents/some.py", line 21, in read_test
    data = array(data, dtype = float)
MemoryError

Robert Kern wrote:
> Travis E. Oliphant wrote:
>
> > If you use numpy.fromfile, you need to skip past the initial header row
> > yourself. Something like this:
> >
> > import numpy
> >
> > fid = open('somename.csv')
>
> # I think you also meant to include this line:
> header = fid.readline()
>
> > data = numpy.fromfile(fid, sep=',').reshape(-1,6)
> > # for 6-column data.
>
> --
> Robert Kern
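For what it's worth, here is a minimal sketch of a two-pass approach that avoids the failure mode in the traceback above, where a huge Python list and the final array are both alive in memory during the array(data, dtype=float) call. It preallocates the result and fills it while streaming the file. The function name read_preallocated, the filename argument, and the assumption of a one-line header with 6 float columns are illustrative, not from the thread:

    import numpy

    def read_preallocated(filename, ncols=6):
        # First pass: count data rows (assumes exactly one header line).
        with open(filename) as fid:
            nrows = sum(1 for line in fid) - 1

        # Allocate the final array once; peak memory is roughly this block
        # alone, instead of a temporary list of rows plus the converted array.
        data = numpy.empty((nrows, ncols), dtype=float)

        # Second pass: parse each line directly into the preallocated rows.
        with open(filename) as fid:
            fid.readline()  # skip the header row
            for i, line in enumerate(fid):
                data[i] = [float(x) for x in line.split(',')]
        return data

numpy.fromfile(fid, sep=',') as suggested above should have a similarly flat memory profile, since it parses straight into a fresh array; the sketch is just an alternative if fromfile's sep-based parsing does not fit the file format.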