On Wednesday, February 5, 2014 11:15:09 AM UTC+5:30, Rustom Mody wrote:
> On Wednesday, February 5, 2014 11:05:05 AM UTC+5:30, Ayushi Dalmia wrote:
> > This also doesn't give the true size. I did the following:
> >
> > import sys
> > data=[]
> > f=open('stopWords.txt','r')
> >
> > for line in f:
> >     line=line.split()
> >     data.extend(line)
> >
> > print sys.getsizeof(data)
> >
> > where stopWords.txt is a file of size 4KB
>
> Try getsizeof("".join(data))
>
> General advice:
> - You have been recommended (by Chris??) that you should use a database
> - You say you can't use a database (for whatever reason)
>
> Now the fact is you NEED database (functionality)
> How to escape this catch-22 situation?
> In computer science it's called somewhat sardonically "Greenspun's 10th rule"
>
> And the best way out is to
>
> 1 isolate those aspects of database functionality you need
> 2 temporarily forget about your original problem and implement the
>   (subset of) DBMS functionality you need
> 3 Use 2 above to implement 1
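A note on the numbers in the quoted snippet: sys.getsizeof(data) reports only the list object itself (its header and its array of references), not the strings the list refers to, which is why it need not resemble the 4KB file size. The sketch below is one rough way to add the element sizes back in. The helper name approx_list_size and the sample words are made up for illustration, and the result is an approximation of in-memory footprint in CPython, not the on-disk size.

```python
import sys

def approx_list_size(words):
    """Rough size in bytes of a list of strings: the container plus its elements.

    Hypothetical helper for illustration; not part of the original thread.
    """
    # getsizeof on the list counts the list object and its reference array only.
    total = sys.getsizeof(words)
    # Add each distinct string object once; repeated words may share one object.
    seen = set()
    for w in words:
        if id(w) not in seen:
            seen.add(id(w))
            total += sys.getsizeof(w)
    return total

# Sample data standing in for the contents of stopWords.txt (an assumption).
data = "a about above after again".split()
shallow = sys.getsizeof(data)       # container only
deep = approx_list_size(data)       # container plus the strings
joined = sys.getsizeof("".join(data))  # one string holding all the characters
print(shallow, deep, joined)
```

The exact figures depend on the Python version and platform, so none are shown here. Note that "".join(data) measures a single string without the whitespace that the file contains, so it too will differ from the file size.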
Hello Rustom,

Thanks for the enlightenment. I did not know about Greenspun's Tenth Rule; it is interesting to learn of it. However, this is an academic project and not a research one, so I do not have the liberty to choose what to work with. Life would be easier with a database, but I am not allowed to use one.

Thanks for the tip. I will try to replicate that functionality.

-- 
https://mail.python.org/mailman/listinfo/python-list