>>> I could make it that simple, but that is also incredibly slow and on
>>> a file with several million lines, it takes somewhere in the league of
>>> half an hour to grab all the data. I need this to grab data from
>>> many many file and return the data quickly.
>>>
>>> Brandon L. Harris
>>>
>> That's surprising.
>>
>> I just made a file with 13 million lines of your data (447Mb) and
>> read it with my code. It took a little over 36 seconds. There must be
>> something different in your set up or the real data you've got.
>>
>> Cheers,
>>
>> Drea
>>
> Could it be that there isn't just that type of data in the file? there
> are many different types, that is just one that I'm trying to grab.
>
> Brandon L. Harris

I don't see why it would make such a difference.

If your data looks like...

<block header>
\t<attribute>
\t<attribute>
\t<attribute>

Just change this line...

    if line.startswith("createNode"):

to...

    if not line.startswith("\t"):

and it won't care what sort of data the file contains.

Processing that data after you've collected it will still take a while,
but that's the same whichever method you use to read it.

Cheers,

Drea

p.s. Just noticed I hadn't pre-declared the currentBlock list.
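
For anyone reading this without the earlier code to hand, here is a rough sketch of the kind of loop that changed line would sit in. It is not the original script: the function name read_blocks and the sample lines are made up for illustration, and it assumes every attribute line starts with a tab and every other line begins a new block. It also declares currentBlock before the loop, as the p.s. mentions.

```python
import io


def read_blocks(lines):
    """Group lines into blocks: one header line plus its tab-indented lines.

    `lines` is any iterable of strings, e.g. an open file object.
    Hypothetical helper, written only to illustrate the approach.
    """
    blocks = []
    currentBlock = []  # pre-declared, so the first append cannot fail
    for line in lines:
        if not line.startswith("\t"):
            # Any line without a leading tab is treated as a block header
            # (note that this includes blank lines, if the file has any).
            if currentBlock:
                blocks.append(currentBlock)
            currentBlock = [line.rstrip("\n")]
        else:
            # Tab-indented line: an attribute of the current block.
            currentBlock.append(line.strip())
    if currentBlock:
        blocks.append(currentBlock)  # don't lose the final block
    return blocks


# Made-up sample data standing in for a real file.
sample = io.StringIO(
    'createNode transform -n "a";\n'
    '\tsetAttr ".t" 1 2 3;\n'
    '\tsetAttr ".r" 0 0 0;\n'
    'select -ne :time1;\n'
    '\tsetAttr ".o" 1;\n'
)
blocks = read_blocks(sample)
```

With a real file you would pass the open file object straight in, e.g. read_blocks(open(path)), where path is whatever file you are reading, so the lines are read one at a time rather than all loaded up front.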