On Nov 10, 4:47 pm, [EMAIL PROTECTED] wrote:
> Hello Everyone,
>
> I need to read a .csv file which has a size of 2.26 GB . And I wrote a
> Python script , where I need to read this file. And my Computer has 2
> GB RAM Please see the code as follows:
>
> """
> This program has been developed to retrieve all the promoter sequences
> for the specified
> list of genes in the given cluster
>
> So, this program will act as a substitute to the whole EZRetrieve
> system
>
> Input arguments:
>
> 1) Cluster.txt or DowRatClust161718bwithDummy.txt
> 2) TransProCrossReferenceAndSequences.csv -> This is the file that has
> all the promoter sequences
> 3) -2000
> 4) 500
> """
>
> import time
> import csv
> import sys
> import linecache
> import re
> from sets import Set
> import gc
>
> print time.localtime()
>
> fileInputHandler = open(sys.argv[1],"r")
> line = fileInputHandler.readline()
>
> refSeqIDsinTransPro = []
> promoterSequencesinTransPro = []
> reader2 = csv.reader(open(sys.argv[2],"rb"))
> reader2_list = []
> reader2_list.extend(reader2)
>
> for data2 in reader2_list:
>     refSeqIDsinTransPro.append(data2[3])
> for data2 in reader2_list:
>     promoterSequencesinTransPro.append(data2[4])
>
> while line:
>     l = line.rstrip('\n')
>     for j in range(1,len(refSeqIDsinTransPro)):
>         found = re.search(l,refSeqIDsinTransPro[j])
>         if found:
>             """promoterSequencesinTransPro[j] """
>             print l
>
>     line = fileInputHandler.readline()
>
> fileInputHandler.close()
>
> The error that I got is given as follows:
> Traceback (most recent call last):
>   File "RefSeqsToPromoterSequences.py", line 31, in <module>
>     reader2_list.extend(reader2)
> MemoryError
>
> I understand that the issue is Memory error and it is caused because
> of the line reader2_list.extend(reader2). Is there any other
> alternative method in reading the .csv file line by line?
>
> sincerely,
> Suprabhath

Thanks a lot, James Mills. It worked.

--
http://mail.python.org/mailman/listinfo/python-list
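
For anyone who finds this thread later: the reply that solved the problem is not quoted here, so the following is only a sketch of one way to read the .csv file row by line, and not necessarily what was suggested. It relies on the fact that csv.reader is an iterator, so looping over it directly avoids building reader2_list in memory. The function name iter_promoter_matches and its parameters are hypothetical; the column indices 3 and 4 and the skipped first row are taken from the quoted script. It is written for Python 3, whereas the quoted script is Python 2.

```python
import csv
import re


def iter_promoter_matches(cluster_path, csv_path, id_col=3, seq_col=4):
    """Yield (gene_id, refseq_id, sequence) for each CSV row matching a gene ID.

    Hypothetical helper: the name and the default column indices are
    assumptions based on the indices used in the quoted script.
    """
    # The cluster file is assumed to be small, so its IDs are kept in memory.
    # Blank lines are skipped, since an empty pattern would match every row.
    with open(cluster_path) as cluster_file:
        gene_ids = [line.strip() for line in cluster_file if line.strip()]
    patterns = [(gene_id, re.compile(gene_id)) for gene_id in gene_ids]

    # Looping over csv.reader pulls one row at a time from the file,
    # so memory use does not grow with the size of the CSV.
    with open(csv_path, newline="") as csv_file:
        reader = csv.reader(csv_file)
        # Assumption: the first row is skipped, as the original loop
        # started at index 1.
        next(reader, None)
        for row in reader:
            if len(row) <= max(id_col, seq_col):
                continue  # skip rows that are too short to index
            for gene_id, pattern in patterns:
                if pattern.search(row[id_col]):
                    yield gene_id, row[id_col], row[seq_col]
```

A caller would loop over iter_promoter_matches(sys.argv[1], sys.argv[2]) and print each gene_id. Note that matches come out in CSV row order rather than in cluster-file order, because the large file is now the outer loop and is read only once.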