> > header line > > 9 nonblank lines with alphanumeric data > > header line > > 9 nonblank lines with alphanumeric data > > ... > > ... > > ... > > header line > > 9 nonblank lines with alphanumeric data > > EOF > > > > where, a data set contains 10 lines (header + 9 nonblank) and there can > > be several thousand > > data sets in a single file. In addition,*each header has a* *unique ID > > code*.
> Alternatively, you could scan the file, recording the ID and the file > offset in a dict so that, given an ID, you can seek directly to that > file position. If you can grep for the header lines you can retrieve the headers and the line number for seeking. grep is (probably) faster than python so I would have it be 2 steps. 1. grep > temp.txt 2. python; check if ID is in temp.txt and then processes Ramit Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology 712 Main Street | Houston, TX 77002 work phone: 713 - 216 - 5423 -- This email is confidential and subject to important disclaimers and conditions including on offers for the purchase or sale of securities, accuracy and completeness of information, viruses, confidentiality, legal privilege, and legal entity disclaimers, available at http://www.jpmorgan.com/pages/disclosures/email. -- http://mail.python.org/mailman/listinfo/python-list