Steven D'Aprano <[EMAIL PROTECTED]> wrote:

> On Mon, 07 May 2007 14:41:02 -0700, Nick Vatamaniuc wrote:
>
> > Rohit,
> >
> > Consider using an SQLite database. It comes with Python 2.5 and higher.
> > SQLite will do a nice job keeping track of the index. You can easily
> > find the line you need with a SQL query and you can write to it as
> > well. When you have a file and you write to one line of the file, all of
> > the rest of the lines will have to be shifted to accommodate the
> > potentially larger new line.
>
> Using a database for tracking line number and byte position -- isn't
> that a bit overkill?
>
> I would have thought something as simple as a list of line lengths would
> do:
>
> offsets = [35,  # first line is 35 bytes long
>            19,  # second line is 19 bytes long...
>            45, 12, 108, 67]
>
> To get to the nth line, you have to seek to byte position:
>
> sum(offsets[:n])

...and then you STILL can't write there (without reading and rewriting
all the succeeding part of the file) unless the line you're writing is
always the same length as the one you're overwriting, which doesn't seem
to be part of the constraints in the OP's original application.

I'm with Nick in recommending SQLite for the purpose -- it _IS_ quite
"lite", as its name suggests. BSD-DB (a DB that's much more complicated
to use, being far lower-level, but by the same token affords you
extremely fine-grained control of operations) might be an alternative
IF, after first having coded the application with SQLite, you can indeed
prove, profiler in hand, that it's a serious bottleneck. However,
premature optimization is the root of all evil in programming.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list
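
For illustration only, here is a rough sketch of what the SQLite approach could look like with the sqlite3 module from the standard library. The table name `lines`, its columns, and the helper functions `load_lines`, `read_line` and `write_line` are all made up for this example and are not from the thread. The point is that replacing a line with a longer or shorter one is a single UPDATE, with no manual shifting of the rest of the data.

```python
import sqlite3

# Hypothetical schema: one row per line, keyed by its line number.
conn = sqlite3.connect(":memory:")  # pass a file path for persistent storage
conn.execute("CREATE TABLE lines (lineno INTEGER PRIMARY KEY, content TEXT)")


def load_lines(conn, lines):
    """Store each line under its 1-based line number."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO lines (lineno, content) VALUES (?, ?)",
            enumerate(lines, start=1),
        )


def read_line(conn, lineno):
    """Return the text of the given line, or None if there is no such line."""
    row = conn.execute(
        "SELECT content FROM lines WHERE lineno = ?", (lineno,)
    ).fetchone()
    return None if row is None else row[0]


def write_line(conn, lineno, content):
    """Replace one line; the new text may be any length."""
    with conn:
        conn.execute(
            "UPDATE lines SET content = ? WHERE lineno = ?", (content, lineno)
        )


load_lines(conn, ["first line", "second", "third line here"])
write_line(conn, 2, "a much longer replacement for the second line")
print(read_line(conn, 2))
print(read_line(conn, 3))
```

Whether this is fast enough for a given file size is something to measure with a profiler rather than assume.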